Assurance Case Maintenance Governance requires that every AI agent deployment is supported by a living assurance case — a structured, evidence-based argument that the agent operates safely and within its governance requirements — and that this assurance case is actively maintained throughout the agent's operational lifecycle. The assurance case is not a one-time deployment artefact; it is a continuously maintained argument that must be updated whenever the agent, its environment, its governance controls, or the regulatory landscape changes. AG-076 treats the assurance case as a governance artefact subject to the same version control, review, and approval processes as the agent's operational configuration. An assurance case that was valid at deployment but has not been updated after 12 months of operational changes, model updates, and regulatory developments provides false assurance — which is worse than no assurance at all.
Scenario A — Assurance Case Stale After Model Update: An organisation deploys a financial advisory agent with a comprehensive assurance case documenting: the model's performance characteristics, the governance controls in place, the risk mitigations applied, and the regulatory compliance arguments. Six months later, the model provider releases a significant update that changes the model's reasoning behaviour, context window size, and output characteristics. The organisation applies the update to improve performance. However, the assurance case is not reviewed or updated. The performance claims in the assurance case reference test results from the previous model version. The behavioural assumptions underlying the risk arguments no longer hold. When the regulator requests the assurance case during a supervisory visit, the document references a model version that is no longer deployed. The regulator issues a finding for inadequate assurance.
What went wrong: The assurance case was treated as a static deployment document rather than a living artefact. No trigger existed to require assurance case review upon model update. The performance evidence in the assurance case was invalidated by the model change but nobody recognised this dependency. Consequence: Regulatory finding, mandatory remediation, operational restriction on the agent until the assurance case is updated and re-approved, 3-month delay to business objectives.
Scenario B — Evidence Expiry Without Detection: A safety-critical agent's assurance case includes a claim that the agent's response latency is below 200 milliseconds at the 99th percentile, supported by load test evidence dated 14 months ago. Infrastructure changes since the test — including a migration to a different cloud region and the addition of a new governance gateway — have increased latency to 340 milliseconds at the 99th percentile. The assurance case still cites the original test results. During a safety incident, the latency exceeds the safety threshold, and the post-incident review discovers that the assurance case claim has been unsupported for 8 months. The organisation cannot demonstrate that the safety argument was valid during that period.
What went wrong: The evidence supporting the assurance case claim had an implicit validity period that was never formalised. No mechanism existed to detect when supporting evidence became stale. The infrastructure changes were not linked to the assurance case as invalidating events. Consequence: Safety incident contributing factor, regulatory investigation, requirement to re-establish the entire assurance case from scratch, 6-month operational restriction.
Scenario C — Regulatory Change Invalidates Compliance Argument: An organisation's assurance case for a customer-facing agent includes a compliance argument that the agent meets data protection requirements under the current regulatory framework. A new regulatory requirement is enacted that imposes additional transparency obligations for AI-generated customer communications. The assurance case is not updated. The organisation continues to operate the agent under the assumption that the existing compliance argument remains valid. Eighteen months later, a data protection authority audit reveals that the agent has been operating without the required transparency measures. The organisation cannot demonstrate that it assessed the regulatory change's impact on the assurance case.
What went wrong: No mechanism existed to link regulatory change events to assurance case review triggers. The compliance argument in the assurance case was never re-evaluated against the new requirements. The assurance case provided false confidence that compliance was maintained. Consequence: Data protection enforcement action, potential fine, requirement to notify affected customers, mandatory operational changes, reputational damage.
Scope: This dimension applies to all AI agents deployed in production environments where the organisation makes claims — explicit or implicit — about the agent's safety, performance, compliance, or governance posture. This includes all agents in regulated sectors, all agents that interact with external parties, all agents that process personal data, and all agents whose failure could cause material harm. The scope extends to agents deployed under regulatory approvals, certifications, or compliance declarations that reference assurance arguments. An organisation that claims an agent is "compliant with the EU AI Act" or "operating within governance controls" has an implicit assurance case — AG-076 requires that this assurance case be explicit, structured, evidenced, and maintained. The only agents excluded from scope are internal experimental agents in sandbox environments with no access to production data or systems and no external interactions.
4.1. A conforming system MUST maintain a structured assurance case for each production-deployed agent, containing: claims about the agent's safety, performance, compliance, and governance posture; evidence supporting each claim; and arguments linking the evidence to the claims.
4.2. A conforming system MUST define explicit validity conditions for each piece of evidence in the assurance case, including: the evidence type, the date collected, the conditions under which it was collected, and the events that would invalidate it (e.g., model update, infrastructure change, regulatory change, drift detection event).
4.3. A conforming system MUST trigger an assurance case review within 30 calendar days of any invalidating event, including but not limited to: model version change, significant infrastructure change, governance control modification, regulatory change affecting the agent's domain, and behavioural drift detection by AG-022.
4.4. A conforming system MUST version-control the assurance case using the same change control mechanisms required by AG-007, including approval workflows, audit trails, and rollback capability.
4.5. A conforming system MUST conduct a full assurance case review at least annually, even if no invalidating events have occurred, to confirm that all claims remain valid, all evidence remains current, and all arguments remain sound.
4.6. A conforming system MUST designate an accountable owner for each assurance case who is responsible for initiating reviews, approving updates, and attesting to the assurance case's validity.
4.7. A conforming system SHOULD implement automated monitoring for invalidating events — model updates, infrastructure changes, regulatory publications, drift alerts — that triggers assurance case review workflows without requiring manual detection.
4.8. A conforming system SHOULD maintain a dependency map linking each assurance case claim to the specific evidence artefacts, governance controls, infrastructure components, and regulatory provisions that support it, enabling impact analysis when any dependency changes.
4.9. A conforming system SHOULD generate an assurance case status dashboard showing: the last review date, the next scheduled review date, the number of claims with current evidence, the number of claims with stale evidence, and any outstanding invalidating events awaiting review.
4.10. A conforming system MAY implement automated evidence refresh — periodic re-execution of tests, benchmarks, and evaluations that provide the evidence supporting assurance case claims — to keep evidence current without manual intervention.
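The mandatory requirements above (4.1–4.3) imply a machine-readable structure: claims linked to evidence, evidence carrying explicit validity conditions, and a trigger that schedules review within 30 days of an invalidating event. The sketch below is illustrative only — the class names, fields, and event-type strings are assumptions, not a schema mandated by this dimension — but it shows how those pieces fit together:

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Illustrative model of requirements 4.1-4.3. All names are
# assumptions, not a schema mandated by AG-076.

@dataclass
class Evidence:
    evidence_id: str
    evidence_type: str           # e.g. "load_test", "benchmark" (4.2)
    collected_on: date           # date collected (4.2)
    collection_conditions: str   # conditions under which it was collected (4.2)
    invalidated_by: set          # event types that invalidate this evidence (4.2)
    stale: bool = False

@dataclass
class Claim:
    claim_id: str
    statement: str
    evidence: list               # Evidence records supporting the claim (4.1)
    argument: str                # links the evidence to the claim (4.1)

@dataclass
class AssuranceCase:
    agent_id: str
    claims: list
    review_due: date = None      # earliest outstanding review deadline

def record_invalidating_event(case, event_type, occurred_on):
    """Mark invalidated evidence as stale and schedule a review within
    30 calendar days of the event (requirement 4.3)."""
    affected = []
    for claim in case.claims:
        for ev in claim.evidence:
            if event_type in ev.invalidated_by:
                ev.stale = True
                if claim.claim_id not in affected:
                    affected.append(claim.claim_id)
    if affected:
        deadline = occurred_on + timedelta(days=30)
        if case.review_due is None or deadline < case.review_due:
            case.review_due = deadline
    return affected

# The Scenario B latency claim, with its invalidating events made explicit:
latency_ev = Evidence("EV-12", "load_test", date(2024, 1, 10),
                      "baseline infrastructure, no governance gateway",
                      {"infrastructure_change", "model_update"})
case = AssuranceCase("agent-7", [Claim("C-3", "p99 latency below 200 ms",
                                       [latency_ev],
                                       "load test results bound p99 latency")])
affected = record_invalidating_event(case, "infrastructure_change",
                                     date(2024, 6, 1))
```

An infrastructure change here flags the claim as stale and sets `case.review_due` 30 days out; a production implementation would persist these records and route the triggered review through the AG-007 approval workflow.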
Assurance Case Maintenance Governance addresses the gap between initial assurance and ongoing assurance. Most organisations that deploy AI agents with governance controls create some form of assurance documentation at deployment time — a risk assessment, a compliance checklist, a safety analysis. But the value of this documentation degrades from the moment it is created. Every change to the agent, its environment, its governance controls, or the regulatory landscape potentially invalidates one or more claims in the assurance case. Without active maintenance, the assurance case becomes a historical record of the deployment state rather than a current attestation of the operational state.
The meta-governance nature of AG-076 is deliberate. This dimension does not govern the agent's behaviour directly — it governs the governance artefact that provides confidence in the agent's behaviour. It is the governance of governance evidence. This meta-level is necessary because an unmaintained assurance case is worse than no assurance case at all. No assurance case creates honest uncertainty. A stale assurance case creates false confidence, which leads to decisions based on assumptions that no longer hold.
The structured assurance case methodology — claims, evidence, arguments — is well-established in safety engineering (Goal Structuring Notation, Claims-Arguments-Evidence notation). AG-076 applies this methodology to AI agent governance and adds the critical requirement of ongoing maintenance. The traditional assurance case in safety engineering is typically created at design time and updated at major lifecycle milestones. AI agents change more frequently than traditional safety-critical systems — model updates, prompt changes, infrastructure migrations, and regulatory changes occur on timescales of weeks to months, not years. The assurance case maintenance cadence must match the rate of change.
AG-076 intersects with AG-007 because the assurance case itself is a governance configuration artefact. It intersects with AG-022 because behavioural drift detection events are among the most important invalidating triggers for assurance case claims. It intersects with AG-078 (Benchmark Coverage Governance) because benchmark results are a primary form of evidence in the assurance case.
AG-076 requires organisations to treat the assurance case as a living document embedded in their operational processes, not a compliance artefact filed at deployment time and forgotten.
Recommended patterns:
Anti-patterns to avoid:
Financial Services. Assurance cases for financial agents should align with UK supervisory expectations for model risk management documentation. The PRA's supervisory statement SS1/23 requires firms to maintain documentation demonstrating that AI models are fit for purpose throughout their lifecycle. The assurance case can serve as the central artefact for this requirement. Annual review should align with the firm's model validation cycle. Claims should explicitly address conduct risk, market integrity, and client outcomes.
Healthcare. Medical device-class agents require assurance cases aligned with IEC 62304 (medical device software lifecycle) and ISO 14971 (risk management for medical devices). The assurance case must demonstrate ongoing clinical safety throughout the agent's operational life. Changes to the model or clinical guidelines must trigger reassessment. FDA post-market surveillance requirements map directly to assurance case maintenance obligations.
Critical Infrastructure. Agents operating in safety-critical environments should maintain assurance cases aligned with IEC 61508 (functional safety) and sector-specific standards. Safety integrity level (SIL) claims must be re-validated whenever the agent or its operating environment changes. The assurance case should integrate with the plant's safety management system and be subject to independent safety assessment.
Basic Implementation — The organisation maintains a documented assurance case for each production agent, structured as claims, evidence, and arguments. The assurance case is stored in a version-controlled document repository. An accountable owner is designated for each assurance case. Annual reviews are scheduled and tracked. Invalidating events trigger manual review. This level meets the minimum mandatory requirements but relies on human detection of invalidating events, which creates gaps when changes are frequent.
Intermediate Implementation — Assurance cases are stored in a structured, machine-readable format. Evidence artefacts have explicit validity metadata. An automated monitoring system detects invalidating events — model updates, infrastructure changes, drift alerts, regulatory publications — and triggers review workflows. A dependency map links claims to evidence and system components. A status dashboard shows the current health of each assurance case. Reviews are completed within 30 days of invalidating events, evidenced by workflow records.
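The dependency map at this level can be as simple as an index from claims to the artefacts they rest on, queried in reverse when something changes. A minimal sketch, with purely illustrative identifiers:

```python
# Illustrative dependency map (requirement 4.8): each claim lists the
# evidence artefacts, controls, components, and regulatory provisions
# it rests on. All identifiers are hypothetical.
dependency_map = {
    "CLAIM-LATENCY":  {"EV-LOADTEST-2024Q1", "INFRA-GATEWAY", "INFRA-REGION"},
    "CLAIM-ACCURACY": {"EV-BENCH-2024Q2", "MODEL-V3"},
    "CLAIM-GDPR":     {"CTRL-REDACTION", "REG-GDPR-ART13"},
}

def claims_affected_by(changed_dependency):
    """Impact analysis: which claims need review when a dependency changes?"""
    return sorted(claim for claim, deps in dependency_map.items()
                  if changed_dependency in deps)

# A cloud-region migration (as in Scenario B) flags the latency claim:
region_impact = claims_affected_by("INFRA-REGION")
```

In Scenario B, a map like this would have flagged the latency claim for review the moment the region migration was recorded, rather than the gap going undetected for 8 months.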
Advanced Implementation — All intermediate capabilities plus: automated evidence refresh re-executes tests and benchmarks on a scheduled basis, keeping evidence current without manual intervention. The assurance case is integrated with the CI/CD pipeline — deployments that invalidate assurance case claims are automatically blocked until the claims are re-validated. Independent third-party review of the assurance case is conducted annually. The organisation can demonstrate a complete, auditable history of every assurance case change, every invalidating event, and the response to each event, for every production agent.
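The CI/CD integration amounts to a gate evaluated before each deployment. The sketch below is a hedged illustration — the change-type tags and claim record fields are assumptions about how a pipeline might represent them, not a prescribed interface:

```python
# Illustrative deployment gate: block a release when it would invalidate
# assurance case claims that have not been re-validated. The change-type
# tags and record fields are assumptions, not a prescribed interface.
def deployment_gate(change_types, claims):
    """Return (allowed, blocking_claim_ids)."""
    blocking = [c["id"] for c in claims
                if c["invalidated_by"] & change_types and not c["revalidated"]]
    return (len(blocking) == 0, blocking)

claims = [
    {"id": "C-PERF", "invalidated_by": {"model_update"}, "revalidated": False},
    {"id": "C-PRIV", "invalidated_by": {"data_flow_change"}, "revalidated": True},
]
allowed, blocking = deployment_gate({"model_update"}, claims)
```

Here a model update is held until the performance claim is re-validated — exactly the control that was missing in Scenario A, where the update shipped while the assurance case still cited the previous model version.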
Required artefacts:
Retention requirements:
Access requirements:
Testing AG-076 compliance requires verifying not only that assurance cases exist but that the maintenance process functions correctly.
Test 8.1: Assurance Case Completeness
Test 8.2: Invalidating Event Response
Test 8.3: Annual Review Execution
Test 8.4: Version Control and Audit Trail
Test 8.5: Evidence Validity Tracking
Test 8.6: Dependency Map Accuracy
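As one illustration, Test 8.5 can be partially automated by asserting that every evidence record carries an explicit validity period and flagging anything past it. The record fields and validity periods below are assumptions for the sketch, not a mandated schema:

```python
from datetime import date, timedelta

# Illustrative check for Test 8.5 (Evidence Validity Tracking):
# flag evidence records whose validity period has elapsed.
# Field names and validity periods are hypothetical.
def find_stale_evidence(evidence, today):
    stale = []
    for ev in evidence:
        expiry = ev["collected_on"] + timedelta(days=ev["valid_for_days"])
        if today > expiry:
            stale.append(ev["id"])
    return stale

evidence = [
    {"id": "EV-LOAD",  "collected_on": date(2023, 4, 1), "valid_for_days": 365},
    {"id": "EV-BENCH", "collected_on": date(2024, 5, 1), "valid_for_days": 180},
]
stale = find_stale_evidence(evidence, date(2024, 6, 1))
```

Run against the Scenario B assurance case, a check of this kind would have surfaced the 14-month-old load test evidence long before the safety incident.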
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System) | Direct requirement |
| EU AI Act | Article 11 (Technical Documentation) | Direct requirement |
| EU AI Act | Article 61 (Post-Market Monitoring) | Direct requirement |
| SOX | Section 404 (Internal Controls Over Financial Reporting) | Supports compliance |
| FCA SYSC | 6.1.1R (Systems and Controls) | Supports compliance |
| NIST AI RMF | GOVERN 1.2, MAP 1.1, MANAGE 4.1 | Supports compliance |
| ISO 42001 | Clause 9.1 (Monitoring, Measurement, Analysis, Evaluation) | Direct requirement |
| DORA | Article 9 (ICT Risk Management Framework) | Supports compliance |
| FDA 21 CFR Part 820 | Design Controls (820.30) | Supports compliance |
Article 9 requires a risk management system that is a "continuous iterative process" throughout the lifecycle of the high-risk AI system. The assurance case is the structured expression of this continuous process — it documents the risks identified, the mitigations applied, and the evidence that mitigations are effective. Article 9(2)(e) specifically requires that risk management measures are "tested with a view to identifying the most appropriate risk management measures." Ongoing assurance case maintenance ensures that test evidence remains current and that new risks are incorporated as they are identified.
Article 11 requires technical documentation to be kept up to date. The assurance case is a core component of technical documentation for AI governance. AG-076's maintenance requirements — triggered reviews, annual reviews, evidence validity tracking — directly implement the "kept up to date" obligation.
Article 61 requires providers to establish a post-market monitoring system. The assurance case maintenance process is a primary mechanism for post-market monitoring of governance effectiveness — it continuously evaluates whether the claims made about the agent at deployment remain valid during operation.
For AI agents in financial operations, the assurance case provides the documented basis for management's assertion that controls are effective. Section 404 requires ongoing assessment, not just initial establishment. The annual review and invalidating event response requirements of AG-076 support the continuous assessment obligation.
The FCA expects firms to demonstrate that AI systems are monitored for ongoing effectiveness, and the PRA's SS1/23 addresses model risk management throughout the model lifecycle. The assurance case provides the structured evidence base that the firm can present to supervisors to demonstrate ongoing governance effectiveness.
GOVERN 1.2 addresses processes for the ongoing management of AI risks. MAP 1.1 addresses the context and intended use documentation. MANAGE 4.1 addresses the regular monitoring and review of AI risk management. AG-076 supports all three by requiring structured, maintained assurance documentation with defined review cadences.
Clause 9.1 requires organisations to determine what needs to be monitored and measured, the methods used, when monitoring and measuring shall be performed, and when the results shall be analysed and evaluated. The assurance case with its evidence validity tracking and review cadence directly implements this clause for AI governance.
For AI agents classified as medical devices or components thereof, the assurance case maintenance process supports the design review and design validation requirements throughout the product lifecycle. Changes that affect the device's safety or effectiveness require re-validation, which maps directly to the invalidating event trigger mechanism.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Organisation-wide — affecting the credibility of the entire governance programme and regulatory standing |
Consequence chain: When assurance cases are not maintained, the organisation's confidence in its governance posture becomes unfounded. The immediate technical failure is that claims about agent safety, performance, and compliance are unsupported by current evidence. This creates two categories of risk. First, actual governance failures may go undetected because the assurance case that should have flagged them is stale — the organisation believes controls are effective because the assurance case says so, but the evidence is outdated. Second, when a regulator, auditor, or incident investigation requests the assurance case, the organisation produces a document that demonstrably does not reflect current reality. This destroys confidence in the entire governance programme, not just the specific agent. A regulator who discovers a stale assurance case will reasonably question whether any of the organisation's governance artefacts are current. The consequence extends to personal liability — senior managers who attest to governance effectiveness based on unmaintained assurance cases face personal regulatory exposure. The organisational consequence is a forced remediation programme, potential operational restrictions on all agent deployments, and reputational damage that affects the organisation's ability to deploy AI systems.
Cross-references: AG-007 (Governance Configuration Control) — assurance cases are governance configurations requiring version control. AG-022 (Behavioural Drift Detection) — drift events are primary invalidating triggers for assurance case claims. AG-078 (Benchmark Coverage Governance) — benchmark results are key evidence artefacts within the assurance case.