Near-Miss Capture Governance requires that organisations systematically capture, analyse, and learn from failures that almost happened — agent actions that were blocked by a control, outputs that were corrected before delivery, decisions that were overridden by human review, or anomalies that resolved without incident but could have caused harm under slightly different conditions. Near-misses are the most valuable and most underutilised signal in agent governance: they reveal the actual boundary conditions of the system's controls, expose the scenarios where failure is one step away, and provide concrete inputs for strengthening future controls and evaluation coverage.
Scenario A — Blocked Transaction Reveals Unreported Vulnerability Class: A financial agent's mandate enforcement layer blocks 47 transactions that exceed the per-transaction limit over a 30-day period. The blocked transactions are logged by the enforcement system, but no process exists to analyse them. When a governance analyst eventually reviews the logs 3 months later, they discover that 31 of the 47 blocked transactions share a common pattern: the agent was responding to a specific type of supplier invoice that embedded inflated amounts in a structured data field that the agent parsed differently from the enforcement layer. The agent was consistently attempting to approve transactions at the inflated amount, and the enforcement layer was consistently blocking them. The near-miss revealed a systematic parsing vulnerability — a vulnerability that would have caused material financial loss if the enforcement layer had a corresponding parsing bug or if the inflated value had been below the per-transaction limit.
What went wrong: The blocked transactions were logged but not analysed. The near-miss signal — 31 transactions sharing a common exploit pattern — was available in the data but no one looked at it for 3 months. Had the analysis occurred in real time, the parsing vulnerability would have been identified and remediated immediately. Consequence: The vulnerability persisted for 3 additional months during which an attacker could have exploited it with below-limit amounts. When finally discovered, remediation cost £42,000 and required a 2-week accelerated fix cycle.
Scenario B — Human Override Pattern Reveals Systematic Bias: A customer-facing agent for a government housing service provides eligibility assessments that are reviewed by human caseworkers before being communicated to applicants. Over 6 months, caseworkers override the agent's assessment in 8.3% of cases. No near-miss analysis is conducted. When a discrimination complaint triggers an investigation, analysis reveals that the override rate for applicants from certain postcodes is 23.7% — nearly three times the average. The agent has been systematically underassessing eligibility for applicants in areas with higher ethnic minority populations. The human review process caught the errors, but the near-miss signal — a dramatically elevated override rate for a specific subpopulation — was never captured or analysed.
What went wrong: Human overrides were treated as a normal part of the workflow rather than as near-miss signals. No analysis was conducted on override patterns. The elevated override rate for a specific subpopulation was detectable months before the discrimination complaint but was never detected because no one measured it. Consequence: Discrimination complaint and investigation, retrospective review of 2,400 assessments costing £156,000, mandatory agent retraining, and significant reputational damage with affected communities.
Scenario C — Near-Miss in Safety-Critical System Ignored: An industrial inspection agent analyses sensor data from a chemical processing plant and recommends maintenance schedules. The agent's recommendation is reviewed by a human engineer before implementation. On three occasions over two months, the agent recommends extending a maintenance interval by 40% beyond the manufacturer's specification. Each time, the human engineer overrides the recommendation and schedules maintenance per the manufacturer's specification. No near-miss report is filed. Two months later, a shift change introduces a new engineer who is less experienced and trusts the agent's recommendations more readily. The agent recommends a 45% extension; the new engineer approves it. The delayed maintenance contributes to a sensor calibration drift that causes an incorrect temperature reading, leading to a minor process excursion that is caught by a secondary safety system.
What went wrong: Three near-misses were available as early-warning signals. The experienced engineer recognised the inappropriate recommendations but did not report them as near-misses. No system existed to capture override events as investigation triggers. When a less experienced operator encountered the same issue, the control (human review) failed because the human reviewer lacked the experience to catch the error. Consequence: Process safety excursion, mandatory safety investigation, near-miss reporting system implementation under regulatory pressure, and temporary restriction to manual maintenance scheduling.
Scope: This dimension applies to all AI agent deployments where controls exist that can intercept, block, override, or modify agent actions or outputs before they reach their final destination. This includes: infrastructure-layer enforcement that blocks mandate violations (AG-001), human review processes that can override agent decisions, output filtering that modifies or suppresses agent outputs, escalation mechanisms that redirect agent decisions to human decision-makers, and any automated control that intervenes between the agent's intended action and the executed action. The scope extends to detected anomalies that resolved without intervention — for example, an agent that briefly exhibited unusual behaviour before self-correcting. The scope does not include normal agent operation where no control intervention occurred.
4.1. A conforming system MUST capture and log every instance where a control intervenes to block, override, modify, or escalate an agent action or output, including the original agent action, the control that intervened, the reason for intervention, and the final outcome.
4.2. A conforming system MUST analyse captured near-miss events for patterns at least monthly, identifying clusters that indicate systematic vulnerabilities, recurring failure modes, or emerging threats.
4.3. A conforming system MUST classify each near-miss by severity (what would have happened if the control had not intervened) and frequency (how often similar near-misses occur).
4.4. A conforming system MUST generate a new scenario for the scenario library (AG-349) from every near-miss that reveals a previously unidentified failure mode or attack vector.
4.5. A conforming system MUST escalate near-miss patterns to governance leadership when the frequency or severity of a near-miss category exceeds defined thresholds — for example, when the same vulnerability class produces more than 10 near-misses in a 30-day period, or when a single near-miss would have caused critical harm if the control had failed.
4.6. A conforming system MUST track remediation of root causes identified through near-miss analysis, using the same finding lifecycle as red-team findings (AG-355).
4.7. A conforming system SHOULD automate near-miss detection for infrastructure-layer controls, capturing blocked actions, filtered outputs, and escalated decisions without requiring manual reporting.
4.8. A conforming system SHOULD implement near-miss reporting channels for human reviewers, making it easy for humans who override or correct agent outputs to report the near-miss with context.
4.9. A conforming system SHOULD analyse near-miss rates by subpopulation (user demographic, input category, deployment context) to detect differential failure patterns.
4.10. A conforming system MAY implement real-time near-miss dashboards that display near-miss rates, categories, and trends to operational and governance teams.
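The capture, classification, and escalation requirements above (4.1, 4.3, 4.5) can be sketched in code. The following Python is a minimal illustration, not part of the conformance requirements: every name (`NearMissEvent`, `should_escalate`, the field layout) is hypothetical, though the more-than-10-in-30-days frequency trigger and the critical-severity trigger mirror the examples given in 4.5.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum

class Severity(Enum):
    """Counterfactual severity: what would have happened had the control not intervened (4.3)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4

@dataclass
class NearMissEvent:
    """One control intervention, captured per requirement 4.1."""
    event_id: str
    timestamp: datetime
    agent_action: str          # the original action the agent attempted
    control: str               # which control intervened (e.g. "mandate-enforcement")
    reason: str                # why the control intervened
    outcome: str               # final outcome after intervention
    vulnerability_class: str   # cluster label assigned during analysis
    severity: Severity = Severity.LOW

def should_escalate(events: list[NearMissEvent], now: datetime,
                    window_days: int = 30, frequency_threshold: int = 10) -> set[str]:
    """Requirement 4.5: return vulnerability classes that warrant escalation, either
    because they produced more than `frequency_threshold` near-misses in the window
    or because a single event in the class was counterfactually critical."""
    window_start = now - timedelta(days=window_days)
    recent = [e for e in events if e.timestamp >= window_start]
    escalate: set[str] = set()
    counts: dict[str, int] = {}
    for e in recent:
        counts[e.vulnerability_class] = counts.get(e.vulnerability_class, 0) + 1
        if e.severity is Severity.CRITICAL:
            escalate.add(e.vulnerability_class)
    escalate.update(c for c, n in counts.items() if n > frequency_threshold)
    return escalate
```

A real implementation would persist these records in the system's audit store and evaluate thresholds continuously rather than on demand; the sketch only shows the shape of the data and the escalation rule.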
Near-misses are the governance equivalent of the canary in the coal mine. Every near-miss represents a scenario where the system's controls worked — but also a scenario where the system attempted to take an incorrect or harmful action. The control intervened; the harm was prevented. But the attempted action reveals information about the agent's behaviour that is at least as valuable as the control's success in catching it.
In safety-critical industries (aviation, nuclear, chemical processing), near-miss reporting is a foundational safety practice. The reasoning is straightforward: near-misses outnumber actual incidents by a large factor (Heinrich's Triangle suggests ratios of 300:29:1 for near-misses, minor incidents, and major incidents). Analysing near-misses provides a much larger dataset of failure signals than waiting for actual incidents. Every actual incident was preceded by near-misses that, if captured and analysed, could have enabled preventive action.
The same logic applies to AI agent governance. An agent whose mandate enforcement blocks 47 over-limit transactions in a month is producing 47 signals about its behaviour. If those signals are ignored, the organisation learns nothing until a transaction slips through. If those signals are analysed, the organisation can identify the pattern, diagnose the root cause, and remediate before a failure occurs.
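The pattern-identification step can be illustrated with a simple grouping sketch. Assuming blocked actions are logged as dictionaries of features (the field names below are invented for illustration), clustering by a shared feature signature is enough to surface a dominant cluster like the 31-of-47 parsing pattern in Scenario A.

```python
from collections import Counter

def cluster_blocked_actions(blocked: list[dict],
                            key_fields: tuple[str, ...]) -> list[tuple[tuple, int]]:
    """Group blocked actions by a signature of shared features and return
    (signature, count) pairs, largest cluster first. A large cluster suggests a
    systematic cause rather than independent one-off errors."""
    signatures = Counter(
        tuple(event.get(f) for f in key_fields) for event in blocked
    )
    return signatures.most_common()
```

In Scenario A, grouping the 47 blocked transactions by a signature such as (invoice type, parsed amount field) would have exposed the 31-event cluster on the first run, rather than 3 months later.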
The subpopulation analysis requirement (4.9) is particularly important for fairness and equity. Differential near-miss rates across user demographics can reveal bias before it manifests as actual harm. If a human review process overrides the agent's output 3 times more often for one demographic group than another, the agent has a differential failure rate that near-miss analysis can detect. This is actionable intelligence that post-incident analysis cannot provide because post-incident analysis requires an incident to occur.
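A subpopulation check of the kind 4.9 requires can be sketched as follows. The functions and the 2x ratio threshold are illustrative assumptions, not part of the standard; with Scenario B's figures (23.7% versus an 8.3% overall rate, a roughly 2.9x ratio) the disparity would have been flagged.

```python
def override_rates(records: list[tuple[str, bool]]) -> dict[str, float]:
    """records: (subpopulation label, was_overridden) pairs.
    Returns the override rate per subpopulation."""
    totals: dict[str, int] = {}
    overrides: dict[str, int] = {}
    for group, overridden in records:
        totals[group] = totals.get(group, 0) + 1
        overrides[group] = overrides.get(group, 0) + (1 if overridden else 0)
    return {g: overrides[g] / totals[g] for g in totals}

def flag_disparities(rates: dict[str, float], overall_rate: float,
                     ratio_threshold: float = 2.0) -> list[str]:
    """Flag subpopulations whose override rate exceeds the overall rate by
    ratio_threshold or more."""
    return [g for g, r in rates.items()
            if overall_rate > 0 and r / overall_rate >= ratio_threshold]
```

Note that small subpopulations produce noisy rates; a production implementation would add a minimum sample size or a statistical significance test before flagging, so that a handful of overrides in a tiny group does not trigger a false alarm.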
The scenario generation requirement (4.4) creates a feedback loop between production experience and evaluation coverage. Every novel near-miss that generates a new scenario improves future evaluation. Over time, this feedback loop causes the scenario library to converge toward the actual failure modes of the system, making evaluation increasingly predictive of real-world risk.
Effective near-miss capture requires both automated detection (for infrastructure-layer controls) and human reporting (for human-in-the-loop controls), combined with systematic analysis and feedback into the governance programme.
Recommended patterns:
Anti-patterns to avoid:
Financial Services. Near-miss capture aligns with FCA expectations for incident and near-miss reporting in trading operations. Blocked transactions, overridden recommendations, and escalated decisions should be captured with the same rigour as actual incidents. The FCA expects firms to learn from near-misses, not just incidents.
Healthcare. Clinical near-misses (e.g., agent recommendations overridden by clinicians) must be reported through clinical governance channels. Clinical near-miss patterns should be reviewed by clinical governance, not just technical governance, because the severity classification requires clinical expertise.
Safety-Critical / CPS. Near-miss reporting is often a regulatory requirement for safety-critical systems. Near-miss analysis should feed into hazard analysis (HAZOP) and failure mode analysis (FMEA). The safety case for the AI agent should incorporate near-miss evidence as demonstration of control effectiveness.
Basic Implementation — All infrastructure-layer control interventions are logged automatically. Human near-miss reporting channels exist. Near-miss events are analysed for patterns at least monthly. Each near-miss is classified by counterfactual severity. Novel failure modes generate candidate scenarios for the scenario library. Root-cause remediation is tracked. This level meets the minimum mandatory requirements, but analysis may remain largely manual.
Intermediate Implementation — Automated pattern detection identifies near-miss clusters in real time. Subpopulation analysis detects differential near-miss rates across demographics and contexts. Near-miss rates are tracked as a governance metric over time. Escalation thresholds automatically alert governance leadership when frequency or severity exceeds defined limits. Human near-miss reporting achieves at least 80% coverage of overrides.
Advanced Implementation — All intermediate capabilities plus: predictive analytics identify emerging near-miss patterns before they become frequent. Real-time dashboards display near-miss rates, categories, and trends. Near-miss data is correlated with production incident data to validate that near-miss analysis is predictive. The organisation shares anonymised near-miss intelligence with industry peers to improve collective defence. Machine learning identifies latent patterns across near-miss categories that human analysts might miss.
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Automated Capture Completeness
Test 8.2: Monthly Analysis Compliance
Test 8.3: Severity Classification Completeness
Test 8.4: Scenario Generation Feedback
Test 8.5: Escalation Threshold Enforcement
Test 8.6: Root-Cause Remediation Tracking
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| EU AI Act | Article 72 (Post-Market Monitoring) | Direct requirement |
| NIST AI RMF | MEASURE 2.5, MANAGE 2.3 | Supports compliance |
| ISO 42001 | Clause 9.1 (Monitoring), Clause 10.1 (Continual Improvement) | Supports compliance |
| FCA SYSC | 6.1.1R (Systems and Controls) | Supports compliance |
| DORA | Article 10 (ICT Incident Management) | Supports compliance |
| HSE (UK) | RIDDOR-adjacent near-miss obligations for safety-critical systems | Supports compliance |
Article 72 requires post-market monitoring systems that actively collect and analyse data about AI system performance throughout its lifetime. Near-miss capture is a core component of post-market monitoring — it provides ongoing data about the boundary conditions of the system's controls and the agent's failure modes. A post-market monitoring system that only captures actual incidents misses the most informative data source: the near-misses that reveal where the system is close to failing.
Article 10 requires financial entities to classify, report, and manage ICT-related incidents. While near-misses may not always meet the threshold for formal incident reporting, they represent the precursor events that effective incident management should capture. DORA's emphasis on learning from incidents extends naturally to learning from near-misses.
For AI agents in safety-critical industrial environments, near-miss reporting may fall within the scope of HSE near-miss reporting obligations. While RIDDOR applies to specific reportable events, HSE guidance strongly recommends near-miss reporting as a proactive safety measure. Organisations deploying AI agents in safety-critical contexts should treat near-miss capture as aligned with their statutory health and safety obligations.
| Field | Value |
|---|---|
| Severity Rating | Medium-High |
| Blast Radius | Organisation-wide — failure to capture near-misses means the organisation cannot learn from its most informative safety signals |
Consequence chain: Without near-miss capture governance, the organisation loses its most valuable early-warning system. The immediate consequence is that control interventions (blocked actions, overrides, corrections) are treated as routine events rather than diagnostic signals. The operational consequence is that systematic vulnerabilities persist undetected — vulnerabilities that near-miss analysis would reveal. The predictive consequence is that the organisation cannot anticipate failures: near-misses are leading indicators, while incidents are lagging indicators. The escalating consequence is that unremediated near-miss patterns eventually produce actual incidents when a control fails or a new operator does not recognise the risk. The regulatory consequence is inability to demonstrate a learning organisation — regulators increasingly expect evidence that organisations learn from near-misses, not just incidents.
Cross-references: AG-349 (Scenario Library Governance) receives new scenarios generated from near-miss analysis. AG-350 (Coverage Gap Tracking Governance) uses near-miss data to identify coverage gaps where controls are frequently intervening but evaluation scenarios are sparse. AG-355 (Continuous Red-Team Scheduling Governance) uses near-miss patterns to inform red-team scope. AG-153 (Control Efficacy Measurement) uses near-miss rates as a metric for control effectiveness. AG-103 (Red-Team Coverage Management) incorporates near-miss-derived attack vectors.