AG-153

Control Efficacy Measurement Governance

Control Efficacy, Redundancy & Meta-Governance · AGS v2.1 · April 2026

2. Summary

Control Efficacy Measurement Governance requires that every governance control deployed within an AI agent system be continuously tested through live challenge exercises — injecting known-bad inputs, simulated violations, and synthetic adversarial scenarios into the production environment — to verify that the control actually detects and responds correctly. A governance control that has not been recently challenged is a governance control of unknown efficacy. This dimension ensures that the governance framework is not merely documented and deployed, but verified as operational through ongoing empirical testing.

3. Example

Scenario A — Dormant Alerting in a Financial Trading Agent: A financial trading agent operates under AG-001 mandate enforcement with a per-transaction limit of £50,000 and an aggregate daily limit of £500,000. The enforcement gateway was deployed 18 months ago and has not been independently challenged since. Due to an undetected configuration migration error 6 months ago, the aggregate limit check now queries a stale cache rather than the live counter. Individual transaction limits still work correctly. An adversary discovers the vulnerability and executes 47 transactions of £49,000 each over 90 minutes, totalling £2,303,000 against the £500,000 daily aggregate limit. The enforcement gateway approves each transaction because the per-transaction check passes and the aggregate check reads a stale value. Had a live challenge been conducted after the configuration migration, the aggregate enforcement failure would have been detected 6 months earlier.

What went wrong: The governance control (aggregate limit enforcement) was deployed but never live-challenged after a configuration change. The control appeared active in monitoring dashboards (the service was running, the endpoint was responding) but its actual efficacy had degraded to zero for aggregate checks. No live challenge programme existed to inject synthetic aggregate-exceeding scenarios and verify enforcement response.
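
A minimal sketch of the live challenge that would have caught this failure, assuming a hypothetical gateway client whose authorize() call returns an object with an approved flag; the client, the synthetic tag, and the response shape are illustrative assumptions, not any real API:

```python
# Hypothetical live challenge for aggregate limit enforcement. The gateway
# client, its authorize() call, and the response shape are illustrative
# assumptions, not a real API.

DAILY_AGGREGATE_LIMIT = 500_000  # £, per Scenario A

class ControlEfficacyFailure(Exception):
    """Raised when a control's actual response diverges from expected."""

def challenge_aggregate_limit(gateway, challenge_account):
    """Inject synthetic transactions that each pass the per-transaction
    check but together exceed the daily aggregate, then verify the
    gateway stops approving before the limit is breached."""
    approved_total = 0
    for _ in range(12):  # 12 x £49,000 = £588,000 > £500,000 aggregate
        result = gateway.authorize(
            account=challenge_account,
            amount=49_000,
            synthetic=True,  # tagged so downstream systems discard it
        )
        if result.approved:
            approved_total += 49_000
    if approved_total > DAILY_AGGREGATE_LIMIT:
        raise ControlEfficacyFailure(
            f"aggregate limit not enforced: £{approved_total:,} approved "
            f"against a £{DAILY_AGGREGATE_LIMIT:,} daily limit"
        )
```

A challenge like this, run once after the configuration migration, would have surfaced the stale-cache failure immediately rather than leaving it exploitable for 6 months.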

Scenario B — False-Positive Suppression Masks Real Violations: An enterprise workflow agent has a data exfiltration detection control that monitors for unusual outbound data transfers. During the first month of operation, the control generates 340 alerts, 98% of which are false positives. The operations team tunes the detection thresholds to reduce false positives, inadvertently raising the detection threshold to a level that misses genuine exfiltration events. Over the next 8 months, the alert volume drops to near zero, which the team interprets as success. In reality, the control is no longer detecting a class of data exfiltration that occurs at volumes just below the raised threshold. A live challenge injecting synthetic exfiltration patterns at various volumes would have revealed that the threshold tuning had created a detection gap covering 35% of realistic exfiltration scenarios.

What went wrong: Threshold tuning was performed without validating that the tuned control still detects the full range of threat scenarios. The reduction in alerts was interpreted as improved precision rather than investigated as potential detection loss. No live challenge validated the control's detection coverage after tuning.
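
A sketch of the kind of volume sweep described above, assuming hypothetical detector test hooks inject_synthetic_exfiltration() and alert_raised_for(); it measures detection coverage across volumes instead of trusting falling alert counts:

```python
# Hypothetical coverage sweep: inject synthetic exfiltration events across
# a range of volumes and measure what fraction the tuned detector still
# catches. Both detector hooks are assumed test interfaces.

def measure_detection_coverage(detector, volumes_mb, min_coverage=0.95):
    """Return the fraction of synthetic exfiltration volumes detected,
    flagging a gap if coverage falls below the agreed floor."""
    detected = sum(
        1 for volume in volumes_mb
        if detector.alert_raised_for(
            detector.inject_synthetic_exfiltration(volume_mb=volume)
        )
    )
    coverage = detected / len(volumes_mb)
    if coverage < min_coverage:
        print(f"DETECTION GAP: only {coverage:.0%} of {len(volumes_mb)} "
              f"synthetic volumes raised an alert after threshold tuning")
    return coverage

# Sweep volumes just below and above the tuned threshold, e.g.:
# measure_detection_coverage(detector, volumes_mb=list(range(5, 500, 5)))
```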

Scenario C — Compliance Control Bypassed by API Version Upgrade: A safety-critical agent has a content safety filter that inspects agent outputs before delivery. The filter operates as a middleware component in the API gateway. During a routine API version upgrade, the gateway team deploys a new version that routes 12% of traffic through a path that bypasses the middleware layer. The bypass is unintentional — it results from a routing configuration difference between the old and new API versions. The content safety filter continues to process 88% of traffic correctly, and its monitoring dashboard shows normal operation. The 12% bypass is undetected for 4 months. A live challenge programme that periodically injects known-unsafe content through all API paths would have detected the bypass within one challenge cycle.

What went wrong: The governance control remained active but incomplete — it covered most traffic but not all. Standard monitoring (service health, alert volume) did not detect partial bypass. Only live challenge testing that validates end-to-end control coverage across all traffic paths would have caught the routing bypass.
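
A sketch of the all-paths probe that would have caught the bypass, assuming hypothetical list_api_paths() and send_probe() integration hooks and a response object with a blocked attribute:

```python
# Hypothetical end-to-end path check: send a known-unsafe probe through
# every API path and version and confirm the safety filter blocks each.
# list_api_paths() and send_probe() are assumed integration hooks.

KNOWN_UNSAFE_PROBE = "synthetic-challenge: known-unsafe content token"

def challenge_all_paths(client):
    """Verify the content safety filter covers 100% of traffic paths,
    not just the majority routed through the middleware."""
    uncovered = [
        path
        for path in client.list_api_paths()  # old and new API versions alike
        if not client.send_probe(path=path,
                                 payload=KNOWN_UNSAFE_PROBE).blocked
    ]
    if uncovered:
        raise RuntimeError(
            f"safety filter bypassed on {len(uncovered)} path(s): {uncovered}"
        )
```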

4. Requirement Statement

Scope: This dimension applies to all deployed AI agent governance controls across the governance framework. Every control that has a detection, prevention, or enforcement function — including but not limited to mandate enforcement (AG-001), deception detection (AG-039), behavioural drift detection (AG-022), data access controls, content safety filters, escalation triggers, and any control that is expected to activate under specific conditions — is within scope. The scope is universal because a governance control of unknown efficacy provides no assurance, regardless of how well it was designed or documented.

4.1. A conforming system MUST implement a live challenge programme that periodically injects known-bad inputs, simulated violations, or synthetic adversarial scenarios into each deployed governance control and verifies correct detection and response (a minimal challenge-runner sketch follows this list of requirements).

4.2. A conforming system MUST execute live challenges for each governance control at least once per quarter, and within 7 days of any change to the control's configuration, code, or deployment environment.

4.3. A conforming system MUST define expected responses for each live challenge scenario and automatically compare actual responses against expected responses, flagging discrepancies as control efficacy failures.

4.4. A conforming system MUST track and report control efficacy metrics for each governance control, including detection rate, false positive rate, response latency, and the date of the most recent successful challenge.

4.5. A conforming system MUST escalate to human review any governance control that fails a live challenge, with the control treated as non-operational until the failure is remediated and the challenge is re-run successfully.

4.6. A conforming system SHOULD implement continuous low-volume challenge injection (canary testing) in addition to periodic comprehensive challenges, providing near-real-time detection of control degradation.

4.7. A conforming system SHOULD vary challenge scenarios across cycles to prevent controls from being optimised specifically for known challenge patterns.

4.8. A conforming system SHOULD measure end-to-end control coverage, verifying that controls operate across all traffic paths, API versions, and deployment configurations.

4.9. A conforming system MAY implement automated red-team exercises that generate novel challenge scenarios based on the current threat landscape and recent vulnerability disclosures.
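
Requirements 4.1 through 4.5 together describe a challenge loop: defined scenarios with expected responses, automatic comparison, metric recording, and escalation on failure. A minimal runner sketch, in which all names (ChallengeScenario, the escalate callback) are illustrative assumptions rather than a prescribed design:

```python
# Minimal challenge-runner sketch covering requirements 4.1-4.5.
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable, List

@dataclass
class ChallengeScenario:
    control_id: str            # e.g. "AG-001-aggregate-limit" (illustrative)
    inject: Callable[[], Any]  # injects the known-bad input, returns actual response
    expected: Any              # expected response, defined up front (4.3)

@dataclass
class EfficacyRecord:
    control_id: str
    passed: bool
    latency_ms: float
    run_at: datetime

def run_challenges(scenarios: List[ChallengeScenario],
                   escalate) -> List[EfficacyRecord]:
    """Execute each scenario, compare actual vs expected (4.3), record
    efficacy metrics (4.4), and escalate any failure to human review (4.5)."""
    records = []
    for scenario in scenarios:
        start = datetime.now(timezone.utc)
        actual = scenario.inject()
        latency_ms = (datetime.now(timezone.utc) - start).total_seconds() * 1000
        passed = actual == scenario.expected
        records.append(EfficacyRecord(scenario.control_id, passed,
                                      latency_ms, start))
        if not passed:
            # 4.5: the control is treated as non-operational until the
            # failure is remediated and the challenge re-run successfully.
            escalate(scenario.control_id, scenario.expected, actual)
    return records
```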

5. Rationale

The distinction between a deployed control and an effective control is the central insight of this dimension. Deployment is a necessary but insufficient condition for efficacy. A control can be deployed — running, monitored, reporting green on every dashboard — while being completely ineffective due to configuration drift, dependency changes, routing bypasses, threshold miscalibration, or any of dozens of failure modes that preserve the appearance of operation while eliminating the substance.

This problem is well-understood in traditional security and reliability engineering. Fire suppression systems are tested regularly. Circuit breakers are trip-tested. Backup generators are run under load. These tests exist not because the systems are expected to fail, but because the consequences of discovering a failure during an actual emergency are unacceptable. The same principle applies to AI governance controls.

The relationship to AG-008 (Governance Continuity Under Failure) is complementary but distinct. AG-008 ensures governance controls continue to function when other system components fail. AG-153 ensures governance controls actually work — that they detect what they are supposed to detect and respond as they are supposed to respond, under normal operating conditions and after changes to the environment.

Live challenge testing differs from traditional unit testing or integration testing in a critical way: it operates in the production environment, against the actual deployed control, with the actual configuration, routing, and dependencies. A control that passes in a test environment but fails in production — due to different configurations, different traffic patterns, or different routing — is an ineffective control. Only live challenge testing verifies production efficacy.

The requirement for post-change challenge (within 7 days of any change) reflects operational experience that the most common cause of control efficacy loss is environmental change. Configuration migrations, API upgrades, dependency updates, and infrastructure changes can all silently degrade control efficacy. Mandatory post-change challenge testing catches these degradations before they can be exploited.
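
A sketch of how the post-change requirement might be wired into a deployment pipeline; the change-event shape and the scheduler interface are assumptions, not a prescribed design:

```python
# Hypothetical post-change trigger for requirement 4.2: any change to a
# control's configuration, code, or environment schedules a challenge
# with a hard 7-day deadline.
from datetime import datetime, timedelta, timezone

POST_CHANGE_WINDOW = timedelta(days=7)

def on_control_change(event: dict, scheduler) -> None:
    """Called by the deployment pipeline on every change event touching a
    governance control."""
    scheduler.schedule_challenge(
        control_id=event["control_id"],
        reason=f"post-change: {event['change_type']}",
        # the challenge MUST complete before this deadline (4.2)
        deadline=datetime.now(timezone.utc) + POST_CHANGE_WINDOW,
    )
```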

6. Implementation Guidance

Control efficacy measurement requires a challenge injection framework that can generate, inject, and evaluate known-bad scenarios across all deployed governance controls.
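
One way to ground the "generate" side is a registry mapping every deployed control to its challenge scenarios, so unchallenged controls surface immediately. A minimal sketch, with the registry contents as illustrative in-memory stand-ins:

```python
# Sketch of a scenario-coverage check: every deployed control must map to
# at least one registered challenge scenario, since an unchallenged
# control is a control of unknown efficacy. Contents are illustrative.

DEPLOYED_CONTROLS = {"AG-001-mandate", "AG-022-drift", "AG-039-deception"}
SCENARIO_REGISTRY = {
    "AG-001-mandate": ["per-txn-limit", "aggregate-limit"],
    "AG-039-deception": ["synthetic-misreport"],
}

def controls_without_challenges():
    """Surface controls with no registered scenario before the next cycle."""
    return sorted(c for c in DEPLOYED_CONTROLS if not SCENARIO_REGISTRY.get(c))

# e.g. controls_without_challenges() -> ["AG-022-drift"]
```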

Recommended patterns:

- Tag all synthetic challenge traffic so downstream systems can identify and discard it without contaminating production data.
- Wire challenge scheduling into the deployment pipeline so that any change to a control's configuration, code, or environment automatically triggers a post-change challenge (4.2).
- Treat a drop in alert volume as a prompt to re-challenge the control, not as evidence of improved precision.
- Rotate and jitter challenge scenarios so controls cannot be tuned to a fixed, known pattern (4.7).
- Verify coverage across every traffic path, API version, and deployment configuration, not just the primary route (4.8).

Anti-patterns to avoid:

- Testing controls only in a staging environment and assuming production parity.
- Interpreting low alert volume or green service-health dashboards as evidence of control efficacy.
- Tuning detection thresholds without re-validating detection coverage across the full range of threat scenarios.
- Reusing identical challenge inputs every cycle, allowing controls to be optimised for the test rather than the threat.
- Treating deployment of a control as equivalent to verification of the control.
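
To complement periodic comprehensive challenges, requirement 4.6 calls for continuous low-volume canary injection. A background-loop sketch, reusing the ChallengeScenario shape from the runner sketch above; the interval, jitter, and record() sink are assumptions:

```python
# Hypothetical continuous canary injector (requirement 4.6): a low-volume
# background loop that runs one lightweight scenario per interval.
import random
import time

def canary_loop(scenarios, record, interval_s=300, jitter_s=60):
    """Continuously probe controls so efficacy degradation is detected in
    near real time rather than at the next quarterly challenge."""
    while True:
        scenario = random.choice(scenarios)
        actual = scenario.inject()
        record(scenario.control_id, passed=(actual == scenario.expected))
        # jittered timing keeps injection unpredictable (cf. 4.7)
        time.sleep(interval_s + random.uniform(-jitter_s, jitter_s))
```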

Industry Considerations

Financial Services. Regulatory expectations for control testing are well-established. The FCA expects firms to test their systems and controls regularly. PRA SS1/23 requires ongoing monitoring and validation of model risk management controls. For AI agent governance controls, live challenge testing is the operational implementation of these expectations.

Healthcare. Clinical AI safety controls must be tested against realistic adverse scenarios. FDA guidance on clinical decision support requires ongoing monitoring of safety performance. Live challenge testing provides the evidence that safety controls remain effective throughout the system lifecycle.

Critical Infrastructure. IEC 62443 requires security testing of industrial control system components. For AI agents controlling critical infrastructure, live challenge testing of safety enforcement controls is essential to maintain safety certification.

Maturity Model

Basic Implementation — A live challenge programme exists with defined scenarios for each deployed governance control. Challenges are executed at least quarterly and within 7 days of changes. Expected responses are defined and compared against actual responses. Challenge failures are escalated to human review. Control efficacy metrics are tracked. This level meets the minimum mandatory requirements but relies on periodic testing rather than continuous monitoring.

Intermediate Implementation — All basic capabilities plus: continuous canary testing provides near-real-time efficacy monitoring. End-to-end path coverage testing verifies controls operate across all traffic paths. Challenge scenarios are rotated across cycles. A control efficacy dashboard is maintained and reviewed in regular governance meetings. Challenge scenarios include scenarios derived from recent threat intelligence.

Advanced Implementation — All intermediate capabilities plus: automated red-team exercises generate novel challenge scenarios. Challenge injection is unpredictable in timing and format. Independent adversarial testing of the challenge framework itself has been conducted (testing the tester). The organisation can demonstrate continuous control efficacy assurance to regulators with evidence of near-real-time monitoring and rapid detection of any efficacy degradation.
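
Scenario rotation (4.7) and the advanced tier's unpredictable injection can be as simple as seeding each cycle's scenario selection. A sketch, with the pool contents and cycle label purely illustrative:

```python
# Sketch of scenario rotation (requirement 4.7): derive each cycle's
# scenario subset from a seeded shuffle so the selection varies between
# cycles but remains reproducible for audit.
import hashlib
import random

def scenarios_for_cycle(pool, cycle_label, k=5):
    """Pick k scenarios for the named cycle, seeded on the cycle label so
    controls cannot be tuned to a fixed, known challenge pattern."""
    seed = int(hashlib.sha256(cycle_label.encode()).hexdigest(), 16)
    return random.Random(seed).sample(pool, k=min(k, len(pool)))

# e.g. scenarios_for_cycle(["s1", "s2", "s3", "s4", "s5", "s6"], "2026-Q2")
```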

7. Evidence Requirements

Required artefacts:

- Challenge scenario definitions, including the expected response for each scenario (4.3).
- Challenge execution logs recording injected inputs, actual responses, and comparison outcomes.
- Control efficacy metrics per control: detection rate, false positive rate, response latency, and date of the most recent successful challenge (4.4).
- Escalation and remediation records for every failed challenge, including the successful re-run (4.5).
- Evidence that post-change challenges were executed within the 7-day window (4.2).

Retention requirements:

Access requirements:

8. Test Specification

Test 8.1: Comprehensive Challenge Coverage. Verify that every deployed governance control has at least one defined challenge scenario and has been challenged within the last quarter (4.1, 4.2).

Test 8.2: Challenge Response Accuracy. Verify that actual responses to injected challenges are automatically compared against defined expected responses and that discrepancies are flagged as control efficacy failures (4.3).

Test 8.3: Post-Change Challenge Execution. Verify that a challenge is executed within 7 days of any change to a control's configuration, code, or deployment environment (4.2).

Test 8.4: Canary Detection Latency. Verify that continuous low-volume challenge injection detects an induced control degradation in near real time (4.6).

Test 8.5: Escalation on Challenge Failure. Verify that a failed challenge is escalated to human review and the control is treated as non-operational until remediated and successfully re-challenged (4.5).

Test 8.6: End-to-End Path Coverage. Verify that controls operate across all traffic paths, API versions, and deployment configurations (4.8).

Test 8.7: Challenge Scenario Rotation. Verify that challenge scenarios vary across cycles so that controls are not optimised for known challenge patterns (4.7).

Conformance Scoring

9. Regulatory Mapping

Regulation | Provision | Relationship Type
EU AI Act | Article 9 (Risk Management System) | Direct requirement
EU AI Act | Article 72 (Post-Market Monitoring) | Direct requirement
NIST AI RMF | MANAGE 2.2, MANAGE 4.1 | Direct requirement
ISO 42001 | Clause 9.1 (Monitoring, Measurement, Analysis and Evaluation) | Direct requirement
FCA SYSC | 6.1.1R (Systems and Controls) | Direct requirement
PRA SS1/23 | Model Risk Management — Ongoing Monitoring | Supports compliance
DORA | Article 25 (Testing of ICT Tools and Systems) | Direct requirement
IEC 62443 | SR 3.3 (Security Functionality Verification) | Supports compliance

EU AI Act — Article 72 (Post-Market Monitoring)

Article 72 requires providers of high-risk AI systems to establish and document a post-market monitoring system. Live challenge testing is a primary mechanism for post-market monitoring of governance control efficacy — it provides continuous verification that deployed controls remain effective throughout the system lifecycle, directly implementing the Article 72 requirement.

DORA — Article 25 (Testing of ICT Tools and Systems)

Article 25 requires financial entities to establish and maintain a sound and comprehensive digital operational resilience testing programme. For AI agent governance controls, live challenge testing is the operational implementation of this testing programme. DORA specifically requires testing to be proportionate to the risk profile, and AI agent governance controls present significant risk if they degrade silently.

FCA SYSC — 6.1.1R

The FCA expects firms to ensure their systems and controls are adequate and effective. "Effective" requires evidence of ongoing efficacy, not merely evidence of deployment. Live challenge testing provides the evidence of effectiveness that the FCA requires.

10. Failure Severity

Field | Value
Severity Rating | Critical
Blast Radius | Organisation-wide — a governance control of unknown efficacy provides no assurance, and the absence of challenge testing means all deployed controls are of unknown efficacy

Consequence chain: Without live challenge testing, governance controls degrade silently. The organisation believes it is protected by controls that may be partially or wholly ineffective. The immediate consequence is a governance assurance gap — the organisation cannot demonstrate that any governance control works. The operational consequence materialises when a real violation occurs and a degraded control fails to detect it: a mandate limit is exceeded, a deception goes undetected, a data breach is unflagged, or a safety hazard is missed. At this point, the organisation faces both the primary incident and the secondary finding that its governance controls were not tested. The regulatory consequence is particularly severe because regulators distinguish between "the control failed despite testing" (an incident) and "the control was never tested and we didn't know it failed" (a systemic governance failure). The latter typically results in more severe enforcement action and may implicate senior management under accountability regimes. The severity is Critical because control efficacy failure is a meta-failure — it undermines the assurance provided by every other governance control in the framework.

Cross-references:

- AG-007 (Governance Configuration Control): ensures governance control configurations are versioned and change-controlled, providing the baseline against which live challenges are designed.
- AG-008 (Governance Continuity Under Failure): ensures controls survive component failures; AG-153 ensures controls work correctly when they survive.
- AG-027 (Governance Override Resistance): a control that can be overridden is a control that may be bypassed; AG-153 verifies that override resistance is effective.
- AG-056 (Independent Validation): provides the independent validation framework within which live challenge results are assessed.
- AG-154 (Correlated Control Failure Analysis): extends efficacy measurement to identify correlations between control failures that could indicate systemic vulnerabilities.

Cite this protocol
AgentGoverning. (2026). AG-153: Control Efficacy Measurement Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-153