Independent assessment of AI governance across the 792 dimensions of the AGS v2.1 standard. Two audit types: LLM Audit and Agent Audit. Independent adversarial testing using GPT-4o, Gemini 2.5 Flash, and Grok-3 across 22,110 attack scenarios.
AGS Assurance Framework — three tiers of governance assurance: AGS-AUP, AGS-LA, AGS-RA.
Verified scores require adversarial testing by Agent Aegis. Estimated scores are based on publicly available documentation and vendor claims. Submit your platform to receive a verified score.
Agent Shield™ has completed the LLM Audit with a score of 99.9% across 792 dimensions of AGS v2.1. Independently verified through 22,110 attacks with 0 bypasses across GPT-4o, Gemini 2.5 Flash, and Grok-3. Date: 10 April 2026. Manifest SHA-256: 8697f5ada643414735d82ff513dfd1592a7294c5d6ee3afe918367257a5b2bf1.
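A published SHA-256 digest lets a reader check that the manifest copy they hold is byte-for-byte the one that was audited. The sketch below shows one way such a check could be done with Python's standard library. The source does not describe the manifest's format or how it is distributed, so the function names and the file name in the usage note are hypothetical.

```python
import hashlib
import hmac

# Digest published in the LLM Audit statement above.
PUBLISHED_SHA256 = "8697f5ada643414735d82ff513dfd1592a7294c5d6ee3afe918367257a5b2bf1"


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def manifest_matches(manifest_bytes: bytes, published_hex: str = PUBLISHED_SHA256) -> bool:
    """Compare the digest of a manifest copy against a published digest.

    hmac.compare_digest performs a constant-time comparison; it is not
    strictly required for a public digest but is a harmless default.
    """
    actual = sha256_hex(manifest_bytes)
    return hmac.compare_digest(actual, published_hex.strip().lower())
```

Assuming the manifest has been saved locally under a hypothetical name such as `manifest.json`, the check would be `manifest_matches(open("manifest.json", "rb").read())`. A `False` result means the local copy differs from the audited one, or that the digest covers a different serialisation of the file.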
Microsoft Copilot Studio has the strongest governance foundation among platforms assessed. Purview integration, Agent 365, and deterministic policy enforcement give it meaningful coverage across Groups A and B. However, the platform's governance is primarily ecosystem-level access control rather than agent-specific governance, including financial governance. Groups E through J, which cover financial crime detection, cross-domain governance, reasoning integrity, and physical impact, are not evidenced in published documentation.
Amazon Bedrock AgentCore represents a genuinely sophisticated approach to agent governance with its deterministic policy enforcement layer operating outside the LLM reasoning loop. The Cedar policy language implementation and AgentCore Gateway are among the most technically credible governance architectures publicly documented. However, coverage remains concentrated in Groups A and B, with financial crime detection, multi-agent coordination governance, and advanced alignment dimensions absent from public documentation.
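To illustrate what "deterministic policy enforcement outside the LLM reasoning loop" means in practice, here is a minimal sketch of a gateway that checks every tool call against fixed rules before it runs. It borrows Cedar's evaluation semantics (deny by default, and any matching forbid overrides a permit), but it is not Cedar syntax and not the AgentCore API. The class names, agent name, tool name, and the 500 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToolCall:
    principal: str  # identity of the agent making the call (hypothetical)
    action: str     # name of the tool being invoked (hypothetical)
    amount: float   # illustrative attribute used by the rules


@dataclass(frozen=True)
class Rule:
    effect: str                         # "permit" or "forbid"
    matches: Callable[[ToolCall], bool]


def is_allowed(call: ToolCall, rules: list[Rule]) -> bool:
    """Default deny; a matching forbid always overrides a matching permit."""
    effects = [rule.effect for rule in rules if rule.matches(call)]
    if "forbid" in effects:
        return False
    return "permit" in effects


# Hypothetical rule set: the refund agent may issue refunds, never above 500.
RULES = [
    Rule("permit", lambda c: c.principal == "refund-agent" and c.action == "issue_refund"),
    Rule("forbid", lambda c: c.action == "issue_refund" and c.amount > 500),
]


def gateway(call: ToolCall, rules: list[Rule], execute: Callable[[ToolCall], str]) -> str:
    """Run the tool only if the rules allow it; the model's output is never consulted."""
    if not is_allowed(call, rules):
        return "DENIED"
    return execute(call)
```

The point of the design is that the decision depends only on the call and the rules. However the model was prompted or manipulated, the same call always yields the same verdict.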
Onyx Security is the most focused competitor in the agent security space, with a Guardian Agent supervisory layer that provides genuine runtime intervention capability. Its positioning as a security control plane gives it strong coverage in Groups A and B. However, Onyx is explicitly security-oriented — its architecture addresses threats and vulnerabilities rather than the governance of agent behaviour as an autonomous entity. Financial crime detection, mandate-based containment, and the governance dimensions in Groups E through J are outside its documented scope.
Google Vertex AI Agent Builder provides solid infrastructure-level governance through Cloud IAM agent identities and Model Armor, but agent-specific governance dimensions are not evidenced in public documentation beyond what the underlying GCP infrastructure provides. The platform is strong for model serving and agent deployment infrastructure, with governance primarily addressed through existing Google Cloud security controls rather than agent-specific governance architecture.
SafePaaS is an established enterprise governance platform designed primarily for ERP governance and access controls. AGS v2.1 compliance has been estimated based on publicly available documentation. Its score reflects the overlap between its existing financial controls framework and the AGS v2.1 dimensions.
The Agent Audit tests autonomous agent deployments against 508 Agent Audit dimensions (AGENT_AUDIT + BOTH). It covers 10 attack categories, including delegation chain manipulation, inter-agent trust spoofing, cryptographic seal tampering, and federated broadcast spoofing.
Estimated scores reflect publicly documented agent deployment governance capabilities as of April 2026. Verified scores require adversarial testing by Agent Aegis. Submit for verification →
Agent Shield™ has completed the Agent Audit (Level 1) with a compliance score of 100.0% (A+) across all 508 Agent Audit dimensions. 1,530 attack scenarios across 10 categories. Zero bypasses. 3 rate-limit errors excluded from scoring. Verified 10 April 2026. Manifest SHA-256: 7c5766cdb0adacba862499e69e28fefc85de656efa35ef355ef5c3ae11e334a2.
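The statement above excludes rate-limit errors from scoring. The sketch below shows how such an exclusion affects a simple block rate. It is not the AGS scoring formula, which the source does not give. It assumes the 3 errored scenarios are counted among the 1,530, and the outcome labels are hypothetical.

```python
from collections import Counter


def block_rate(outcomes: list[str]) -> float:
    """Percentage of scored scenarios that were blocked.

    Scenarios labelled "error" (for example, rate-limited requests) never
    reached a verdict, so they are left out of the denominator.
    """
    counts = Counter(outcomes)
    scored = counts["blocked"] + counts["bypassed"]
    if scored == 0:
        raise ValueError("no scored scenarios")
    return 100.0 * counts["blocked"] / scored


# Assumed breakdown: 1,530 scenarios, 3 of them errored, zero bypasses.
outcomes = ["blocked"] * 1527 + ["error"] * 3
rate = block_rate(outcomes)
```

Excluding errors keeps an inconclusive request from counting as either a pass or a failure. It also means the rate is computed over fewer scenarios than were attempted.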
Amazon Bedrock Agents scores highest among competitors on the Agent Audit due to its Cedar declarative policy language and AgentCore Gateway — a genuine deterministic enforcement layer operating outside the LLM reasoning loop. The supervisor/sub-agent architecture provides real multi-agent orchestration. However, agent-specific governance capabilities beyond Cedar policy enforcement are limited: no inter-agent trust attestation, no graduated autonomy framework, no delegation depth governance, and no cryptographic state sealing. The majority of Agent Audit dimensions require structural enforcement mechanisms beyond policy-level controls.
Mandate boundary enforcement (Cedar + AgentCore Gateway), multi-agent orchestration (supervisor/sub-agent), agent identity (IAM roles), provider documentation.
Inter-agent trust handshakes, governance passports, graduated autonomy framework, delegation chain governance, cryptographic seal tampering resistance, federated threat broadcast, composite threat scoring, memory/RAG governance, agent-level financial crime detection.
Microsoft Copilot Studio has the broadest enterprise infrastructure among competitors (Azure Policy, Entra ID, Purview, Defender, Agent Network) but agent-deployment-specific governance is shallow. Entra ID provides the strongest agent identity mechanism of any competitor, and Azure Policy offers some mandate enforcement. However, the platform lacks agent-specific governance architecture: no delegation chain governance, no inter-agent trust handshakes, no graduated autonomy, no cryptographic state sealing. Agent Network provides multi-agent orchestration but without governance over the orchestration itself.
Agent identity (Entra ID), mandate enforcement (Azure Policy), compliance monitoring (Purview), cybersecurity (Defender), multi-agent orchestration (Agent Network), provider documentation.
Graduated autonomy framework, inter-agent trust attestation, governance passports, delegation chain depth governance, cryptographic seal tamper resistance, federated threat broadcasts, composite threat scoring, competence envelope governance, truth/reward integrity.
Onyx Security’s Guardian Agent supervisory layer provides genuine runtime intervention capability that scores well on the detection and containment dimensions (roughly 40% of the 35 dimensions in those categories). Based on its public documentation, however, Onyx is positioned as a security product rather than a governance platform. It monitors agent behaviour for threats and vulnerabilities rather than governing agent behaviour as an autonomous entity. Financial controls, mandate enforcement, multi-agent orchestration governance, memory/RAG governance, and all sector-specific agent governance dimensions are not evidenced in public documentation. The majority of Agent Audit dimensions are not evidenced for this platform.
Behavioural boundary monitoring (Guardian Agent), cybersecurity threat detection, runtime intervention, safety-critical anomaly detection.
Mandate enforcement, multi-agent orchestration governance, delegation chains, trust attestation, governance passports, graduated autonomy, financial crime detection, memory/RAG governance, cryptographic sealing, federated broadcasts, all sector-specific governance dimensions.
Google Vertex AI provides agent deployment infrastructure (Agent Engine, Agent Builder) and basic safety filtering (Model Armor), but limited agent-governance-specific architecture is evidenced in public documentation. Cloud IAM provides basic identity but agent-specific governance is not documented. Model Armor filters content but agent behaviour governance is not evidenced. The majority of Agent Audit dimensions are not evidenced in public documentation for this platform. The platform is strong for model serving and agent deployment but governance is addressed through existing GCP infrastructure controls rather than agent-specific mechanisms.
Agent deployment infrastructure (Agent Engine), content filtering (Model Armor), agent identity (Cloud IAM), provider documentation (Google Cloud).
Constitutional governance framework, multi-agent governance topology, graduated autonomy, human factors governance, competence envelopes, truth/reward integrity, delegation chain governance, trust attestation, cryptographic sealing, federated broadcasts, financial crime detection, all agent-specific enforcement mechanisms.
SafePaaS is an established ERP governance platform providing SOX access controls and financial separation of duties. Its score reflects the narrow overlap between its existing financial controls framework and the AGS v2.1 Agent Audit dimensions. No agent-specific architecture is evidenced in public documentation for the platform; it is an ERP governance tool rather than an AI agent governance platform. Credit is given only for financial controls and authority/delegation dimensions where ERP governance tangentially applies. The majority of Agent Audit dimensions are not evidenced for this platform.
ERP financial controls (SOX), authority and delegation (access controls), separation of duties (approval workflows).
All agent-specific governance dimensions, constitutional framework, runtime behavioural containment, multi-agent orchestration, agent identity, trust attestation, LLM/agent-level controls of any kind, cryptographic sealing, federated broadcasts, memory/RAG governance.
Estimated scores are replaced by verified scores upon submission. Verified platforms receive a dated certificate of compliance per dimension group.