AG-423

Incident Learning Closure Governance

Incident Response, Recovery & Resilience · AGS v2.1 · April 2026

2. Summary

Incident Learning Closure Governance requires that every incident affecting an AI agent — whether operational, security, compliance, or safety in nature — produces a formal lessons-learned record that is translated into concrete control changes with named owners, enforceable deadlines, and verifiable acceptance criteria. Many organisations conduct post-incident reviews but fail to close the loop: findings are documented in narrative reports that are filed and forgotten, root causes are identified but never acted upon, and the same failure modes recur months later in different agents or different business units. This dimension mandates a structured closure pipeline that traces every lesson from identification through remediation design, implementation, verification, and permanent integration into the governance baseline, ensuring that the organisation's incident response capability genuinely improves after every significant event.

3. Example

Scenario A — Lessons Documented but Never Implemented: A customer-facing insurance agent hallucinates policy coverage that does not exist, resulting in a customer purchasing a policy based on false representations. The incident costs £127,000 in customer remediation, a regulatory fine, and policy adjustments. A thorough post-incident review identifies three root causes: (1) the agent's grounding data was 14 months stale, (2) no hallucination detection mechanism existed for coverage assertions, and (3) the human escalation threshold was set too high for novel coverage queries. The review produces a 22-page report with 11 recommendations. The report is circulated, acknowledged by senior management, and filed in the incident repository. Eighteen months later, a different insurance agent hallucinates coverage terms for a commercial policy. Investigation reveals that none of the 11 recommendations from the first incident were implemented. The same three root causes are present. The second incident costs £341,000 — the original remediation cost plus additional regulatory penalties for failure to learn from the first incident.

What went wrong: The post-incident review produced findings but no structured closure mechanism existed. Recommendations had no assigned owners, no deadlines, no acceptance criteria, and no verification process. The organisation's incident learning system was write-only — it captured lessons but never applied them. Consequence: £341,000 in combined losses across two incidents with identical root causes, regulatory censure for governance failure, and a 23% increase in the regulator's supervisory intensity for the firm's AI operations.

Scenario B — Remediation Actions Implemented but Never Verified: A financial-value agent executing treasury operations incorrectly calculates foreign exchange exposure due to a stale rate feed, triggering an unhedged position of €2.3 million. Post-incident review identifies the root cause: the rate feed health-check was monitoring connectivity but not data freshness. The remediation action — implement a data freshness assertion with a 60-second staleness threshold — is assigned to the platform team with a 30-day deadline. The platform team implements the staleness check on time. However, they set the threshold at 600 seconds (10 minutes) rather than 60 seconds because the specification was communicated verbally rather than through a traceable requirement. No verification step confirms that the implementation matches the specification. Eight months later, a 4-minute rate feed stall causes a £890,000 unhedged position because the 600-second threshold does not trigger.

What went wrong: The remediation action was implemented but never verified against its specification. The closure process lacked a verification gate requiring independent confirmation that the implemented control matched the designed control. Consequence: £890,000 loss from a failure mode that was supposedly remediated, erosion of trust in the incident learning process, and material weakness finding in the subsequent SOX audit.

Scenario C — Siloed Learning Leaves Parallel Systems Exposed: A safety-critical industrial agent controlling a chemical mixing process receives a corrupted sensor reading and fails to trigger its safety interlock, resulting in a near-miss event. Post-incident investigation identifies that the agent's sensor validation logic does not detect gradual drift in sensor calibration — it only catches binary sensor failures. The finding is remediated for the specific agent involved. However, the organisation operates 7 other safety-critical agents with identical sensor validation logic across 3 manufacturing sites. The lesson is not propagated because the incident learning system has no mechanism for identifying analogous systems. Fourteen months later, a different agent at a different site experiences the same sensor drift failure, this time resulting in a process excursion that causes £1.7 million in equipment damage and a 3-week production shutdown.

What went wrong: The incident learning system closed the finding for the specific agent but had no propagation mechanism to identify and remediate analogous systems. Lesson closure was per-agent rather than per-failure-mode. Consequence: £1.7 million in equipment damage, 3-week production shutdown, regulatory investigation by the Health and Safety Executive, and criminal liability assessment for the organisation's failure to apply known lessons.

4. Requirement Statement

Scope: This dimension applies to every organisation operating AI agents where incidents — defined as any unplanned event resulting in actual or potential harm to users, customers, the organisation, third parties, or the public — can occur. The scope covers the full lifecycle of incident lessons: identification during post-incident review, formalisation into structured findings, translation into remediation actions with owners and deadlines, implementation tracking, verification against acceptance criteria, propagation to analogous systems, and permanent integration into the governance baseline. It applies regardless of incident severity — even low-severity incidents may reveal systemic weaknesses that require structural remediation. The scope extends to incidents affecting agents in production, staging, and testing environments, as testing environment incidents can reveal design flaws that would manifest in production. Organisations that outsource agent development or operation to third parties must ensure that the third party's incident learning process meets these requirements or must incorporate third-party incidents into their own closure pipeline.

4.1. A conforming system MUST produce a structured lessons-learned record for every incident classified as Severity 3 (Moderate) or above under the organisation's adverse event severity matrix (per AG-419), containing at minimum: root cause analysis, contributing factors, failed or absent controls, and specific findings requiring remediation.

4.2. A conforming system MUST translate each finding from a lessons-learned record into one or more remediation actions, each with a named individual owner, an enforceable deadline, measurable acceptance criteria, and a traceability link back to the originating incident and finding.
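The fields mandated in 4.2 map naturally onto a structured record rather than narrative prose. A minimal sketch (all identifiers hypothetical) showing how completeness can be enforced at the point of entry rather than by later review:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class RemediationAction:
    """One remediation action, traced back to its originating incident and finding."""
    action_id: str
    incident_id: str            # traceability: originating incident
    finding_id: str             # traceability: originating finding
    owner: str                  # a named individual, not a team
    deadline: date              # enforceable deadline
    acceptance_criteria: list[str] = field(default_factory=list)
    status: str = "open"

    def is_well_formed(self) -> bool:
        """Reject actions missing any mandatory 4.2 field."""
        return all([self.action_id, self.incident_id, self.finding_id,
                    self.owner, self.acceptance_criteria])
```

A register built on records like this can refuse to accept an action with no owner or with empty acceptance criteria, which is precisely the gap Scenario A illustrates.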

4.3. A conforming system MUST implement a closure verification gate requiring independent confirmation — by a party other than the remediation action owner — that the implemented remediation satisfies its acceptance criteria before the finding is marked as closed.
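The independence check in 4.3 is simple to enforce mechanically. A sketch of the gate, assuming owner and verifier are identified by user IDs and each acceptance criterion carries a pass/fail result:

```python
def verify_closure(action_owner: str, verifier: str,
                   criteria_results: dict[str, bool]) -> tuple[bool, str]:
    """Closure verification gate (illustrative only).

    A finding may be marked closed only when (a) the verifier is not the
    remediation owner and (b) every acceptance criterion has passed.
    """
    if verifier == action_owner:
        return False, "verifier must be independent of the remediation owner"
    failed = [c for c, ok in criteria_results.items() if not ok]
    if failed:
        return False, f"unmet acceptance criteria: {', '.join(failed)}"
    return True, "closure verified"
```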

4.4. A conforming system MUST maintain a lessons-learned register that tracks every finding from identification through closure, with status transitions timestamped and the current status of every open finding visible to governance leadership at all times.

4.5. A conforming system MUST implement a propagation assessment for every finding, determining whether the identified failure mode could affect other agents, systems, or business units, and extending remediation scope to all affected entities.
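A propagation assessment only scales if "analogous system" is computable. One approach (a sketch; the component-fingerprint scheme is an assumption, not a prescription) is to tag each agent with the shared components it embeds, such as a sensor-validation module version, and take the overlap with the components implicated in a finding:

```python
def propagation_scope(finding_components: set[str],
                      fleet: dict[str, set[str]]) -> list[str]:
    """Return agent IDs whose components overlap those implicated in a finding.

    `fleet` maps agent_id -> set of shared component identifiers. Any overlap
    puts the agent in remediation scope, per Requirement 4.5.
    """
    return sorted(a for a, comps in fleet.items() if comps & finding_components)
```

In Scenario C, a fleet inventory of this kind would have surfaced the 7 other agents sharing the flawed sensor validation logic.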

4.6. A conforming system MUST escalate findings that exceed their remediation deadline to governance leadership within 48 hours of the deadline breach, with documented justification for the delay and a revised remediation plan.
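The 48-hour escalation rule lends itself to a scheduled sweep of the register. A sketch, assuming each finding record carries a deadline, a status, and an escalation flag (field names are hypothetical):

```python
from datetime import datetime, timedelta, timezone

ESCALATION_WINDOW = timedelta(hours=48)  # per the 48-hour requirement

def findings_to_escalate(findings: list[dict], now: datetime) -> list[str]:
    """Return IDs of open findings past deadline that have not yet been escalated."""
    return [f["id"] for f in findings
            if f["status"] != "closed" and not f["escalated"] and now > f["deadline"]]

def escalation_breached(deadline: datetime, escalated_at: datetime) -> bool:
    """True if escalation itself happened later than 48h after the deadline breach."""
    return escalated_at - deadline > ESCALATION_WINDOW
```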

4.7. A conforming system MUST integrate closed findings into the governance baseline — updating control configurations, detection rules, escalation thresholds, or operational procedures as specified by the remediation — so that the improvement becomes a permanent part of the governance posture rather than a one-time fix.

4.8. A conforming system SHOULD implement recurrence detection that automatically flags new incidents whose root causes or contributing factors match previously closed findings, indicating potential remediation failure or incomplete propagation.
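Recurrence detection reduces to matching a new incident's normalised root-cause tags against the tags of previously closed findings; a sketch, assuming the organisation maintains a controlled root-cause taxonomy:

```python
def detect_recurrence(new_incident_causes: set[str],
                      closed_findings: dict[str, set[str]]) -> list[str]:
    """Flag closed findings whose root-cause tags overlap a new incident's causes.

    `closed_findings` maps finding_id -> set of normalised root-cause tags.
    Any overlap suggests remediation failure or incomplete propagation.
    """
    return sorted(fid for fid, causes in closed_findings.items()
                  if new_incident_causes & causes)
```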

4.9. A conforming system SHOULD conduct periodic effectiveness reviews of closed remediations (at least annually) to verify that implemented controls remain effective and have not degraded since closure.

4.10. A conforming system MAY implement automated lesson correlation that identifies patterns across multiple incidents — such as the same contributing factor appearing in three or more incidents within 12 months — and escalates these systemic patterns for structural review.
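The "three or more incidents within 12 months" pattern can be detected with a sliding window over per-factor incident dates; a sketch:

```python
from collections import defaultdict
from datetime import date, timedelta

def systemic_factors(incidents: list[tuple[date, set[str]]],
                     threshold: int = 3,
                     window: timedelta = timedelta(days=365)) -> set[str]:
    """Return contributing factors seen in >= threshold incidents inside the window.

    `incidents` is a list of (incident_date, contributing_factors) pairs.
    """
    by_factor: defaultdict[str, list[date]] = defaultdict(list)
    for when, factors in incidents:
        for f in factors:
            by_factor[f].append(when)
    flagged = set()
    for factor, dates in by_factor.items():
        dates.sort()
        # Sliding window: any run of `threshold` occurrences within `window`.
        for i in range(len(dates) - threshold + 1):
            if dates[i + threshold - 1] - dates[i] <= window:
                flagged.add(factor)
                break
    return flagged
```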

5. Rationale

Incident learning is the mechanism through which organisations convert operational failures into permanent improvements. Without it, the same failure modes recur indefinitely, each occurrence costing time, money, reputation, and — in safety-critical contexts — physical harm. The governance challenge is not conducting post-incident reviews; most mature organisations already do this. The challenge is closing the loop: ensuring that the insights from reviews are translated into concrete, verified, and permanent changes to the governance posture.

Three systemic failures characterise immature incident learning programmes. First, the documentation trap: organisations produce thorough post-incident reports but treat the report itself as the deliverable. The report is filed, circulated, and occasionally referenced, but its recommendations are not tracked to implementation. This creates a dangerous illusion of learning — the organisation believes it has addressed the failure because it has analysed the failure, when in fact analysis without action changes nothing. Second, the verification gap: organisations that do track remediation actions to implementation often lack a verification step confirming that the implementation matches the specification. As Scenario B illustrates, the gap between intended and actual implementation can be significant, and without independent verification, the remediation provides false assurance. Third, the silo effect: organisations that successfully close findings for the specific system involved often fail to propagate the lesson to analogous systems. This is particularly dangerous in AI agent deployments where multiple agents share architectural patterns, data pipelines, or control logic — a failure mode in one agent is likely present in others.

The regulatory environment increasingly treats failure to learn from incidents as a governance deficiency separate from and additional to the original incident. The EU AI Act's post-market monitoring requirements (Article 72) explicitly mandate that providers implement corrective actions based on incident data. DORA's ICT incident management requirements (Article 17) require financial entities to have procedures for follow-up actions after incidents. The FCA has repeatedly censured firms for failing to learn from incidents, treating recurrence of known failure modes as evidence of inadequate governance. SOX auditors treat unresolved remediation actions as potential material weaknesses. In safety-critical domains, failure to propagate lessons from near-miss events creates criminal liability exposure under health and safety legislation.

The cost of incident recurrence is consistently higher than the cost of the original incident. Regulators impose enhanced penalties for failures that were previously identified and not remediated. Customers and counterparties lose trust when the same failure occurs repeatedly. Internal stakeholders lose confidence in the governance programme. The marginal cost of implementing a structured closure pipeline is trivial compared to the cost of a single recurrence of a known failure mode.

6. Implementation Guidance

Incident Learning Closure Governance requires a structured pipeline that moves every incident finding from identification through to permanent integration in the governance baseline. The pipeline should be implemented as a workflow with defined stages, gates, and escalation paths — not as a document-centric process where reports are produced and manually tracked.
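The stage-and-gate workflow can be enforced as a small state machine so that a finding structurally cannot reach closed status without passing verification. A sketch (the stage names are illustrative, not prescribed):

```python
# Hypothetical stage machine for the closure pipeline. Findings may only
# move forward through defined gates; skipping the verification stage
# (the Scenario B failure) is structurally impossible.
ALLOWED_TRANSITIONS = {
    "identified":           {"remediation_designed"},
    "remediation_designed": {"implemented"},
    "implemented":          {"verified"},
    "verified":             {"integrated"},  # merged into the governance baseline
    "integrated":           set(),           # terminal: finding is closed
}

def advance(current: str, target: str) -> str:
    """Move a finding to `target`, raising if the transition skips a gate."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current!r} -> {target!r}")
    return target
```

Each accepted transition would also be timestamped into the lessons-learned register, giving governance leadership the live status visibility that Requirement 4.4 demands.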

Recommended patterns:

Anti-patterns to avoid:

Industry Considerations

Financial Services. Financial regulators — the FCA, SEC, OCC, MAS, and others — treat incident recurrence as a strong indicator of governance failure. The FCA's Senior Managers and Certification Regime (SM&CR) creates personal accountability for senior managers who fail to ensure that lessons are learned. SOX Section 404 audits specifically examine whether known control deficiencies have been remediated. Financial institutions should implement a 15-business-day maximum remediation deadline for high-severity findings and a 30-business-day maximum for moderate-severity findings.
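The 15- and 30-business-day maxima suggested above require a deterministic deadline calculation. A simplified sketch that skips weekends but, for brevity, ignores public holidays (a production calculation would consult a holiday calendar):

```python
from datetime import date, timedelta

def business_day_deadline(start: date, business_days: int) -> date:
    """Deadline `business_days` working days after `start` (weekends skipped;
    public holidays deliberately ignored in this sketch)."""
    d = start
    remaining = business_days
    while remaining > 0:
        d += timedelta(days=1)
        if d.weekday() < 5:  # Monday (0) through Friday (4)
            remaining -= 1
    return d
```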

Healthcare and Life Sciences. Adverse event reporting requirements under FDA regulations (21 CFR Part 803) and EU Medical Device Regulation (Article 87) mandate corrective and preventive actions (CAPA) that closely parallel incident learning closure. Healthcare organisations should align their AI agent incident learning process with their existing CAPA framework to avoid duplication and leverage mature processes.

Safety-Critical and Industrial. In safety-critical domains, failure to propagate lessons from near-miss events to analogous systems creates potential criminal liability under health and safety legislation. The chemical industry's process safety management requirements (OSHA 1910.119, Seveso III Directive) mandate formal management of change processes that incorporate incident lessons. Industrial organisations operating safety-critical agents should treat every near-miss finding as a mandatory propagation trigger.

Crypto and Web3. The immutability of blockchain transactions means that incidents involving on-chain actions cannot be reversed. Incident learning in this domain must focus heavily on prevention — ensuring that every finding that could prevent a future irreversible loss is remediated with the highest urgency. Remediation deadlines for findings affecting on-chain operations should be compressed relative to other domains.

Maturity Model

Basic Implementation — The organisation produces structured lessons-learned records for all incidents at Severity 3 and above. Each finding has a named owner, a deadline, and acceptance criteria. A lessons-learned register tracks all findings from identification through closure. Overdue findings are escalated to governance leadership. Closure requires independent verification. This level meets the minimum mandatory requirements.

Intermediate Implementation — All basic capabilities plus: propagation assessments identify analogous systems for every finding. Remediation actions are tracked for all affected entities, not just the originally affected agent. Closed findings are integrated into the governance baseline with configuration, monitoring, and procedural updates. Recurrence detection flags new incidents matching previously closed findings. Periodic effectiveness reviews verify that closed remediations remain effective.

Advanced Implementation — All intermediate capabilities plus: automated lesson correlation identifies systemic patterns across multiple incidents. The incident learning pipeline is integrated with the organisation's risk register, updating risk assessments based on incident findings. Closure verification includes regression testing confirming that the remediation prevents the original failure mode. Cross-organisational lesson sharing (anonymised where necessary) incorporates industry-wide incident data into the learning process. The mean time from finding identification to verified closure is tracked as a key governance metric with defined improvement targets.
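The mean-time-to-verified-closure metric mentioned above is straightforward to derive from the register's timestamps; a sketch over (identified, verified-closed) date pairs:

```python
from datetime import date

def mean_days_to_closure(findings: list[tuple[date, date]]) -> float:
    """Mean elapsed days from finding identification to verified closure.

    `findings` holds (identified_on, verified_closed_on) pairs for findings
    that have completed the full pipeline.
    """
    days = [(closed - opened).days for opened, closed in findings]
    return sum(days) / len(days)
```

Tracked over time, a falling mean demonstrates that the learning pipeline itself is improving, which is the defined improvement target this maturity level calls for.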

7. Evidence Requirements

Required artefacts:

Retention requirements:

Access requirements:

8. Test Specification

Test 8.1: Lessons-Learned Record Completeness

Test 8.2: Remediation Action Traceability and Deadline Enforcement

Test 8.3: Independent Closure Verification

Test 8.4: Propagation Assessment Execution

Test 8.5: Baseline Integration Verification

Test 8.6: Escalation Timeliness for Overdue Findings

Test 8.7: Recurrence Detection

Conformance Scoring

9. Regulatory Mapping

Regulation | Provision | Relationship Type
EU AI Act | Article 72 (Post-Market Monitoring) | Direct requirement
EU AI Act | Article 9 (Risk Management System) | Supports compliance
SOX | Section 404 (Internal Controls) | Direct requirement
FCA SYSC | 6.1.1R (Systems and Controls) | Direct requirement
NIST AI RMF | GOVERN 1.5, MANAGE 4.1 | Supports compliance
ISO 42001 | Clause 10.1 (Continual Improvement) | Direct requirement
DORA | Article 17 (ICT-related Incident Management) | Direct requirement

EU AI Act — Article 72 (Post-Market Monitoring)

Article 72 requires providers of high-risk AI systems to establish and document a post-market monitoring system that actively and systematically collects, documents, and analyses data on the performance of the system throughout its lifetime. This includes the obligation to implement corrective actions when necessary. AG-423 directly supports compliance by mandating that incident findings are translated into verified corrective actions and permanently integrated into the governance baseline. An organisation that conducts post-incident reviews but does not track findings to closure cannot demonstrate compliance with Article 72's corrective action requirements.

SOX — Section 404 (Internal Controls)

SOX auditors assess whether identified control deficiencies have been remediated. An open remediation action register with overdue items is a potential material weakness finding. AG-423 provides the structured pipeline that ensures remediation actions are tracked, deadlined, verified, and closed — the exact process that SOX auditors expect. The independent verification gate (Requirement 4.3) directly supports the auditor's need to confirm that remediations are real, not self-reported.

FCA SYSC — 6.1.1R (Systems and Controls)

The FCA expects firms to have systems and controls adequate to manage risks. The FCA has repeatedly demonstrated that it treats failure to learn from incidents as a separate and aggravating governance failure. Dear CEO letters and enforcement actions consistently cite firms that experienced repeat failures of the same type as evidence of inadequate systems and controls. AG-423's propagation assessment (Requirement 4.5) and recurrence detection (Requirement 4.8) directly address the FCA's expectation that firms identify and remediate systemic weaknesses, not just individual incidents.

NIST AI RMF — GOVERN 1.5, MANAGE 4.1

GOVERN 1.5 addresses mechanisms for ongoing monitoring and periodic review of AI risk management processes. MANAGE 4.1 addresses incident response plans. AG-423 bridges the gap between incident response (responding to an event) and risk management improvement (ensuring the response produces permanent improvement). The lessons-learned closure pipeline is the mechanism through which incident response data flows into risk management process improvement.

ISO 42001 — Clause 10.1 (Continual Improvement)

ISO 42001 Clause 10.1 requires organisations to continually improve the suitability, adequacy, and effectiveness of their AI management system. Incident learning closure is the primary mechanism for this improvement — each incident reveals a gap or weakness that, once closed, improves the management system. AG-423 provides the structured process that transforms the ISO 42001 aspiration of continual improvement into a verifiable operational practice.

DORA — Article 17 (ICT-related Incident Management)

DORA Article 17 requires financial entities to establish ICT-related incident management processes including procedures for follow-up actions. AG-423 provides the structured closure mechanism that ensures follow-up actions are not merely defined but tracked to verified completion. The escalation requirement (4.6) ensures that overdue follow-up actions receive governance leadership attention, preventing the backlog accumulation that DORA's incident management requirements are designed to prevent.

10. Failure Severity

Field | Value
Severity Rating | High
Blast Radius | Organisation-wide — every agent deployment is exposed to recurrence of known failure modes when incident learning does not close

Consequence chain: Incidents occur but their lessons are not converted into permanent control improvements. The immediate effect is an accumulating backlog of unremediated findings — each representing a known vulnerability that the organisation has identified but not addressed. The operational consequence is incident recurrence: the same failure modes manifest repeatedly across different agents, business units, and time periods. Each recurrence costs more than its predecessor because regulators impose escalating penalties for repeat failures, customers lose trust in the organisation's ability to operate safely, and internal teams lose confidence in the governance programme. The regulatory consequence is severe: the FCA treats failure to learn as an aggravating factor in enforcement decisions, SOX auditors treat unremediated findings as potential material weaknesses, and the EU AI Act's post-market monitoring requirements create direct liability for providers that do not implement corrective actions. In safety-critical domains, the consequence chain extends to physical harm: a near-miss that is not propagated to analogous systems becomes an actual incident at a different site. The ultimate organisational consequence is governance programme erosion — stakeholders observe that incidents are investigated but nothing changes, and the entire governance programme loses credibility as a performative exercise rather than a functional risk management system.

Cross-references: AG-419 (Adverse Event Severity Matrix Governance) defines the severity classification that determines which incidents require lessons-learned records. AG-007 (Governance Configuration Control) governs the configuration store into which closed findings are integrated. AG-420 (Tabletop Exercise Governance) provides a mechanism for testing whether lessons have been effectively integrated. AG-424 (Notification Routing Governance) ensures that incident stakeholders are notified, creating the initial conditions for learning. AG-428 (Crisis Communication Approval Governance) governs external communications that may reference incident lessons. AG-023 (Audit Trail Governance) provides the evidentiary basis for post-incident review. AG-415 (Decision Journal Completeness Governance) captures the decision context that informs root cause analysis. AG-022 (Behavioural Drift Detection) may detect failure modes before they become incidents, feeding the same learning pipeline.

Cite this protocol
AgentGoverning. (2026). AG-423: Incident Learning Closure Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-423