Independent Verification

Prove your governance claims

Submit your platform for adversarial testing across all 792 AGS v2.1 dimensions.

Four steps to a verified score

The EU AI Act compliance deadline is 2 August 2026. Verified assessment typically takes 4–6 weeks. Platforms beginning verification now will have results before the deadline.

LLM AUDIT TRACK
Governance Platforms
Tests whether your AI governance platform correctly enforces AGS v2.1 across all 792 dimensions. Designed for governance and compliance platforms.
AGENT AUDIT TRACK
AI Agent Platforms
Tests whether your AI agent platform behaves within AGS v2.1 governance requirements. Designed for agent orchestration and deployment platforms.
01
Submission
You provide an API endpoint. We confirm it is accessible and responds to structured requests.
02
Adversarial Testing
Our testing framework generates attack payloads across all 792 governance dimensions. Each dimension is tested with multiple adversarial scenarios designed to probe the boundaries of your governance implementation.
03
Scoring
Your platform receives a score per dimension group (A through J). Individual dimension scores are provided in your private results report. Only group-level scores are published. Methodology is never disclosed.
04
Publication
Your verified score replaces any estimated score on the AgentGoverning leaderboard. You receive a dated certificate of compliance per dimension group. Verified status is renewable annually.
BENCHMARK TIERS
LEVEL 1
Integrity
~15 minutes · ~£3
3 adversarial attacks per dimension.
Single model (Claude Sonnet).
Initial integrity screening.
Results within 30 minutes.
LEVEL 2
Standard
~1 hour · ~£12
3 adversarial attacks per dimension.
4 Tier 1 models (GPT-4o, Claude, Grok-3, Gemini).
Full AGS v2.1 coverage.
Results within 2 hours.
LEVEL 3 · RECOMMENDED
Full Acquisition
~5.5 hours · ~£50
10 adversarial attacks per dimension.
All 9 independent LLMs.
Complete evidence corpus generated.
SHA-256 manifest issued.
Results within 24 hours.
All tiers produce a verified score published to the AgentGoverning leaderboard. Level 3 is required for VERIFIED badge status. Pricing is indicative; final pricing is confirmed at submission.
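The format of the SHA-256 manifest issued at Level 3 is not described on this page. As a rough sketch only, assuming the evidence corpus is a set of named files and the manifest maps each file name to the SHA-256 digest of its contents (both assumptions; `build_manifest` and `verify_manifest` are hypothetical helpers, not an AgentGoverning API), a manifest could be built and checked like this:

```python
import hashlib
from typing import Dict


def build_manifest(corpus: Dict[str, bytes]) -> Dict[str, str]:
    """Map each (hypothetical) evidence file name to its SHA-256 hex digest."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in corpus.items()}


def verify_manifest(corpus: Dict[str, bytes], manifest: Dict[str, str]) -> bool:
    """True only if every file matches its recorded digest, with none missing or extra."""
    return build_manifest(corpus) == manifest


# Illustrative data only; the file names and contents are made up.
corpus = {
    "group_a/transcript_001.txt": b"example response",
    "group_b/transcript_002.txt": b"another response",
}
manifest = build_manifest(corpus)
```

Under these assumptions, any change to a file's bytes, or any added or removed file, makes `verify_manifest` return `False`.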

Rigorous testing requires real infrastructure

Rigorous adversarial testing is computationally intensive. Each assessment generates thousands of attack payloads evaluated in real time against your platform's live API endpoints across all 792 governance dimensions. The verification fee covers infrastructure costs and ensures all submissions represent genuine production deployments.
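For a sense of scale, a base payload count follows from the tier figures: 792 dimensions multiplied by the attacks per dimension. Whether each additional model multiplies that count is not stated on this page, so the sketch below leaves models out. The dictionary and function names are illustrative.

```python
DIMENSIONS = 792  # AGS v2.1 dimension count stated on this page

# Attacks per dimension for each benchmark tier, as listed under Benchmark Tiers.
ATTACKS_PER_DIMENSION = {"Level 1": 3, "Level 2": 3, "Level 3": 10}


def base_payloads(tier: str) -> int:
    """Dimensions x attacks per dimension; per-model fan-out is deliberately not modelled."""
    return DIMENSIONS * ATTACKS_PER_DIMENSION[tier]


for tier in ATTACKS_PER_DIMENSION:
    print(tier, base_payloads(tier))
```

On this reading, Level 1 and Level 2 each come to 2,376 base payloads and Level 3 to 7,920.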

Tailored to your platform

Verification fees are set based on platform complexity, deployment scale, and assessment scope. All assessments cover the full 792 dimensions across all 10 groups.

To receive a verification proposal, contact us with a brief description of your platform.

Request Verification Proposal →

AGS v2.1 compliance scale

Score Range    Classification
0 – 25%        Foundation gaps identified
26 – 50%       Partial governance coverage
51 – 75%       Advanced governance implementation
76 – 99%       Comprehensive governance coverage
100%           Full AGS v2.1 compliance
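The scale can be read as a simple lookup. The sketch below is one interpretation, not the published scoring logic: the page lists whole-number bands, so it assumes a fractional score belongs to the band above the last whole-number boundary it has passed (25.5% counts as "Partial") and that anything below exactly 100% stays in the "Comprehensive" band.

```python
def classify(score: float) -> str:
    """Map a percentage score to its AGS v2.1 classification (band edges are assumptions)."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score == 100:
        return "Full AGS v2.1 compliance"
    if score > 75:
        return "Comprehensive governance coverage"
    if score > 50:
        return "Advanced governance implementation"
    if score > 25:
        return "Partial governance coverage"
    return "Foundation gaps identified"
```

With these band edges, `classify(99.9)` returns "Comprehensive governance coverage".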

Agent Shield™ has achieved verified status with a score of 99.9% across 792 AGS v2.1 dimensions — the first platform to do so. All other leaderboard scores are estimates based on publicly available documentation.

Five steps from submission to publication

The assessment process is fully structured. Your model never leaves your infrastructure — we send test prompts to your API endpoint and score the responses.

01
Organisation Details
Register your organisation, contact details, AI product name, and intended AGS tier. Confirm you are authorised to submit on behalf of your organisation.
02
API Configuration
Provide your API endpoint URL, authentication method, and expected latency. A pre-flight check verifies connectivity and response format before proceeding.
03
IP Agreement
Review and accept the IP Non-Interference Agreement, which defines what AgentGoverning can and cannot do with your submission data and test outputs.
04
Payment
Complete payment to lock your assessment slot and endpoint. Your endpoint hash is locked to your assessment token. Changing the endpoint voids the assessment.
05
Assessment
Live adversarial testing runs across all 792 dimensions. Track progress in real time. Results enter a 14-day review period before publication on the leaderboard.
AgentGoverning operates on an API-only assessment model. We send structured test prompts to your endpoint and receive text responses. We never access your model weights, architecture, system prompts, or training data.

IP Non-Interference Agreement

All submissions are governed by the following agreement. Review the full terms before initiating your submission.

AGENTGOVERNING IP NON-INTERFERENCE AGREEMENT
Version 2.0 · April 2026

1. SCOPE OF ACCESS

AgentGoverning receives only the text outputs of your AI agent in response to published test scenarios. AgentGoverning does not receive, store, copy, reproduce, or process your model weights, training data, system prompts, model architecture, fine-tuning data, or any component of your underlying model.

2. OUTPUT HANDLING

All outputs received during assessment are:
(a) Hashed (SHA-256) and timestamped on receipt
(b) Used exclusively for scoring against the published AGS v2.1 criteria
(c) Not used for model training, benchmarking beyond the published criteria, commercial research, or any other purpose
(d) Deleted from AgentGoverning systems within 90 days of assessment completion
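As an illustration of clauses (a) and (d), a received output could be hashed and timestamped, and its deletion deadline computed, roughly as follows. This is a minimal sketch, not AgentGoverning's implementation: the record layout, UTF-8 encoding, UTC timestamps, and function names are all assumptions.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION = timedelta(days=90)  # clause (d): deletion within 90 days of completion


def record_output(text: str, received_at: Optional[datetime] = None) -> dict:
    """Hash an output with SHA-256 and attach a receipt timestamp (assumed UTC)."""
    received_at = received_at or datetime.now(timezone.utc)
    return {
        "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "received_at": received_at.isoformat(),
    }


def deletion_deadline(assessment_completed: datetime) -> datetime:
    """Latest moment by which outputs must be deleted under clause (d)."""
    return assessment_completed + RETENTION
```

For example, an assessment completed on 1 April 2026 would give a deletion deadline of 30 June 2026.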

3. RESULTS OWNERSHIP

Your assessment results are your intellectual property. AgentGoverning publishes only your composite score and group-level scores (A through J) on the public leaderboard. Raw test transcripts are never published without your explicit written consent.

4. REVIEW PERIOD

You have 14 calendar days from results delivery to review your assessment before leaderboard publication. You may raise scoring disputes within this period. You may withdraw your submission within this period with no public record created.
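The review window is plain date arithmetic. The sketch below assumes the 14 calendar days are counted from the delivery date and that the fourteenth day is still inside the window; the agreement does not spell out either point, and the function names are illustrative.

```python
from datetime import date, timedelta

REVIEW_PERIOD = timedelta(days=14)  # calendar days, per clause 4


def review_deadline(results_delivered: date) -> date:
    """Last day of the review period (assumed inclusive)."""
    return results_delivered + REVIEW_PERIOD


def within_review_period(results_delivered: date, today: date) -> bool:
    """True while disputes or withdrawal are still possible under the assumption above."""
    return results_delivered <= today <= review_deadline(results_delivered)
```

Results delivered on 1 July 2026 would, on this reading, remain under review through 15 July 2026.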

5. WHAT AGENTGOVERNING CANNOT DO

AgentGoverning staff and contractors are prohibited from:
(a) Accessing your API endpoint outside the agreed assessment window
(b) Sharing test transcripts with third parties
(c) Using your outputs to inform assessments of other submitting organisations
(d) Reproducing your agent's outputs in any published material

6. ENDPOINT INTEGRITY

The endpoint URL and authentication credentials submitted are hashed and locked to your assessment token. Changing the endpoint after payment voids the assessment. A new submission and payment are required for a new endpoint.
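One way to picture the lock described in this clause: a fingerprint of the endpoint URL and credentials is stored against the assessment token, and any later mismatch voids the assessment. The agreement does not say how the hash is constructed, so the SHA-256 construction, the separator, the in-memory store, and every name below are assumptions made for illustration.

```python
import hashlib
import hmac
from typing import Dict

# Hypothetical in-memory store: assessment token -> locked endpoint fingerprint.
locks: Dict[str, str] = {}


def endpoint_fingerprint(url: str, credential: str) -> str:
    """SHA-256 over URL and credential, joined by a NUL byte so field boundaries stay unambiguous."""
    material = url.encode("utf-8") + b"\x00" + credential.encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def lock_endpoint(token: str, url: str, credential: str) -> None:
    """Record the fingerprint at payment time."""
    locks[token] = endpoint_fingerprint(url, credential)


def endpoint_unchanged(token: str, url: str, credential: str) -> bool:
    """Constant-time comparison of the current fingerprint against the locked one."""
    locked = locks.get(token)
    if locked is None:
        return False
    return hmac.compare_digest(locked, endpoint_fingerprint(url, credential))
```

A call such as `endpoint_unchanged(token, new_url, credential)` returning `False` corresponds to the voided-assessment case.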

7. INTELLECTUAL PROPERTY

The assessment methodology is proprietary. Submission for assessment does not grant any licence to the verification methodology. The AGS v2.1 standard itself remains open (CC BY 4.0).

Verification FAQ

You need to provide a live API endpoint that accepts structured governance requests. The endpoint must be publicly accessible during the testing window. We provide detailed integration documentation after you initiate the submission process.
Individual dimension scores and implementation details are not made public; they are provided only in your private results report. Only group-level scores (A through J) are published on the public leaderboard. Your proprietary governance methodology is never disclosed.
You may submit a formal dispute within 30 days of receiving your results. Disputes are reviewed by an independent panel. The dispute process is documented in your results report. Scoring methodology is not disclosed as part of the dispute process.
Estimated scores are based on publicly available documentation and are inherently less rigorous than adversarial testing. It is common for verified scores to differ from estimates. Your verified score replaces the estimate on the leaderboard regardless of direction. You may choose not to publish your results, in which case your estimated score is removed and replaced with a 'Declined to Publish' status.
Verified scores and compliance certificates are valid for 12 months from the date of issuance. After expiration, your leaderboard entry reverts to estimated status unless you complete a re-assessment. Enterprise tier clients may opt for quarterly re-assessments.
The adversarial testing methodology is proprietary and is never disclosed. This ensures that platforms cannot optimise for specific test cases rather than implementing genuine governance controls. The AGS v2.1 standard itself is public and open (CC BY 4.0), but the verification methodology is independent of the standard.

Which submission achieves which tier?

AGS-AUP — AGREED-UPON PROCEDURES

LLM Audit or Agent Audit submission. Adversarial testing of specific dimensions at a point in time. Results published on leaderboard.

AGS-LA — LIMITED ASSURANCE

Full documentation review across all 792 dimensions. Requires documented controls and protocol file coverage. No material exceptions noted.

AGS-RA — REASONABLE ASSURANCE

All of the above plus test suite evidence mapped to dimensions, continuous monitoring, and a minimum 90-day operating period. Full operating effectiveness testing.

Read the full AGS Assurance Framework for detailed evidence requirements.

Ready to submit?

framework@agentgoverning.com
Begin Submission Process →
Questions? Contact us →