Active PAR

P3777

Standard for Benchmarking and Performance Metrics of Artificial Intelligence (AI) Agents

This standard establishes a unified framework for benchmarking Artificial Intelligence (AI) agents, including autonomous, collaborative, and task-specific agents. It defines core performance metrics, evaluation protocols, and reporting requirements to enable transparent, reproducible, and comparable assessment of AI agent capacities, capabilities, and performance. The framework supports multiple classification criteria for AI agents and emphasizes objective, binary evaluation measures wherever possible. Benchmarking dimensions may include efficiency, robustness, adaptability, ethical compliance, and interoperability options without prescribing agent-to-agent interoperation.

Standard Committee: C/AISC - Artificial Intelligence Standards Committee
Status: Active PAR
PAR Approval: 2025-12-10

Working Group Details

Society: IEEE Computer Society
Standard Committee: C/AISC - Artificial Intelligence Standards Committee
Working Group: 3777 - AI Agent Benchmarking and Performance Metrics
IEEE Program Manager: Christy Bahn
Working Group Chair: wugeng geng

Other Activities From This Working Group

Current projects that have been authorized by the IEEE SA Standards Board to develop a standard.

No Active Projects

Standards approved by the IEEE SA Standards Board that are within the 10-year lifecycle.

No Active Standards

These standards have been replaced with a revised version of the standard, or by a compilation of the original active standard and all its existing amendments, corrigenda, and errata.

No Superseded Standards

These standards have been removed from active status through a ballot where the standard is made inactive as a consensus decision of a balloting group.

No Inactive-Withdrawn Standards

These standards are removed from active status through an administrative process for standards that have not undergone a revision process within 10 years.

No Inactive-Reserved Standards
