Decision Summary Provenance Governance requires that every human-readable summary of an AI agent's decision — whether produced for consumers, regulators, operators, or auditors — maintains a verifiable, tamper-evident link to the underlying evidence, reasoning chain, and decision record from which it was derived. Summaries are the primary artefact through which stakeholders understand, challenge, and oversee AI decisions, yet summaries are inherently lossy — they reduce complex multi-factor reasoning into digestible narratives. Without provenance controls, summaries can diverge from the actual decision logic through abstraction errors, post-hoc rationalisation, hallucinated justifications, or deliberate manipulation. This dimension mandates that every claim in a decision summary is traceable to a specific element in the underlying decision journal, that the derivation path from evidence to summary is recorded and auditable, and that any divergence between the summary and the underlying record is detectable through automated verification.
Scenario A — Summary Cites Non-Existent Evidence: A lending agent denies a small-business loan application for £175,000. The decision summary provided to the applicant states: "Your application was declined due to insufficient trading history (18 months versus the required minimum of 24 months)." The applicant challenges the decision, noting that their business has been trading for 37 months. An internal review reveals that the agent's actual decision was based on a high debt-to-income ratio (4.2:1 versus a maximum of 3.5:1) and a sector risk score. The "insufficient trading history" statement in the summary was generated by the explanation module using a template that selected the wrong explanatory factor — the trading history check passed, but the template incorrectly attributed the denial to it. The summary had no provenance link to the decision journal; it was generated by a separate natural-language module that interpreted the denial signal and selected a plausible-sounding reason. The applicant files a formal complaint. The regulator investigates and finds that 12% of denial summaries across the portfolio contain at least one factual claim not supported by the underlying decision record. Remediation requires manual review of 8,400 denial summaries at a cost of £410,000 and triggers a requirement to implement provenance controls.
What went wrong: The summary was generated independently from the decision record. No provenance link connected the summary's claims to the actual decision factors. The explanation module could select any plausible factor, not just the actual contributing factors. Consequence: £410,000 remediation, regulatory enforcement, 8,400 summary reviews, loss of applicant trust, potential discrimination claims if incorrect factors disproportionately affected protected groups.
Scenario B — Post-Hoc Rationalisation Masks Actual Reasoning: An insurance claims agent denies a £92,000 property damage claim. The decision journal records that the denial was driven by an anomaly detection model that flagged the claim as potentially fraudulent (fraud probability 0.73, threshold 0.65). The decision summary provided to the claimant states: "Your claim has been declined because the damage described is inconsistent with the policy coverage for the reported incident type." This is a post-hoc rationalisation — the actual reason (fraud suspicion) was replaced with a more neutral-sounding policy coverage explanation. The rationalisation was implemented deliberately by the product team, who decided that fraud-flagged claims should receive a generic policy-coverage explanation to avoid tipping off potential fraudsters. However, the summary is factually inaccurate: the damage is consistent with the policy coverage, and the actual reason for denial is fraud suspicion. The claimant hires a loss adjuster who demonstrates that the claim is covered. The insurer cannot explain the denial without revealing the fraud model, but the summary they provided is verifiably false. The claimant's solicitor obtains the decision journal through a subject access request and demonstrates the discrepancy between the summary and the actual reason. The insurer faces a £340,000 claim for bad-faith denial plus regulatory investigation for providing misleading reasons.
What went wrong: The summary was deliberately decoupled from the decision record. The organisation replaced the actual reason with a fabricated one. While the motivation (not tipping off fraudsters) had some operational logic, the result was a provably false summary. A provenance control would not have permitted this — a provenance check would have flagged that the summary's stated reason (policy coverage inconsistency) did not correspond to any factor in the decision journal. Consequence: £340,000 bad-faith claim, regulatory investigation, systemic review of all fraud-flagged denial summaries.
Scenario C — Abstraction Error Inverts Causal Direction: A benefits eligibility agent determines that a citizen qualifies for a reduced housing benefit rate. The decision record shows: the citizen's income increased from £22,000 to £28,000 (crossing the £26,500 threshold for the full rate), causing a reduction from full benefit (£450/month) to partial benefit (£280/month). The summary states: "Your housing benefit has been reduced because your income is below the threshold for full support." The summary inverted the causal direction — income went above the threshold, not below it. The abstraction logic that converted numerical comparisons into natural language mishandled the direction of the inequality. The citizen, reading the summary, believes that earning more money would restore full benefit — the opposite of reality. They take on additional work, their income rises further, and their benefit is reduced again. They file a complaint arguing that the original explanation was misleading and caused them financial harm. An investigation finds that the abstraction error affected 2,300 summaries over 6 months. The local authority must reissue corrected summaries and faces potential liability for decisions made by citizens relying on incorrect explanations. Total remediation cost: £186,000 including reissuance, complaint handling, and financial redress for citizens who demonstrably acted on the incorrect explanation.
What went wrong: The summary was generated by an abstraction layer that converted decision record entries into natural language, but no verification confirmed that the natural-language summary faithfully represented the underlying numerical reasoning. A provenance check that validated the directional relationship between the summary claim ("income is below the threshold") and the decision record entry ("income = £28,000 > threshold = £26,500") would have caught the inversion immediately. Consequence: £186,000 remediation, 2,300 corrected summaries, citizen financial harm from acting on incorrect explanations.
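The directional check described above can be sketched in a few lines. This is an illustrative sketch only — the `RecordEntry` structure and `verify_direction` function are hypothetical names, assuming the decision journal exposes each numerical comparison as a value/threshold pair:

```python
# Sketch of a directional-relationship check. RecordEntry and
# verify_direction are illustrative names, not from any standard library;
# the record format is an assumption for this example.
from dataclasses import dataclass

@dataclass
class RecordEntry:
    factor: str        # e.g. "income"
    value: float       # observed value from the decision journal
    threshold: float   # threshold applied by the decision logic

def verify_direction(entry: RecordEntry, claimed_relation: str) -> bool:
    """Check that the summary's claimed relation ('above'/'below')
    matches the numerical comparison recorded in the decision journal."""
    actual = "above" if entry.value > entry.threshold else "below"
    return actual == claimed_relation

# Scenario C: the record shows income = 28,000 > threshold = 26,500,
# but the summary claimed the income was "below" the threshold.
entry = RecordEntry(factor="income", value=28_000, threshold=26_500)
print(verify_direction(entry, "below"))  # False -> summary fails verification
```

A check of this shape, run before delivery, turns the Scenario C inversion from a six-month latent defect into an immediate verification failure.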
Scope: This dimension applies to every AI agent deployment that produces decision summaries — human-readable descriptions of the agent's decision, reasoning, or action — for any audience. A decision summary is any artefact that communicates the agent's decision in natural language or structured human-readable format, including: denial letters, approval notifications, explanation responses to affected individuals, regulatory disclosures, operator dashboards showing decision reasons, audit reports referencing agent decisions, and any communication that attributes a decision to specific reasons or factors. The scope includes summaries generated automatically by the agent, summaries generated by a separate explanation module that interprets the agent's decision, and summaries composed by human operators using information derived from the agent's decision record. If any human-readable representation of the decision exists, this dimension governs its provenance linkage to the underlying evidence and logic. The scope does not govern the decision itself (that is governed by AG-036 and AG-415) — it governs the fidelity of the summary to the decision record.
4.1. A conforming system MUST maintain a verifiable provenance link between every factual claim in a decision summary and its source element in the underlying decision journal (per AG-415), such that no claim in the summary exists without a corresponding, retrievable source in the record.
4.2. A conforming system MUST record the derivation path for each decision summary — the sequence of transformations (selection, abstraction, templating, natural-language generation) that converted the decision record into the summary — in sufficient detail to allow an auditor to understand why the summary says what it says.
4.3. A conforming system MUST implement automated verification that compares each decision summary against its source decision record, detecting: claims not supported by any element in the record (unsupported claims), claims that contradict elements in the record (contradictory claims), and material omissions where a significant decision factor is absent from the summary without justification.
4.4. A conforming system MUST prevent the delivery of any decision summary that fails automated provenance verification, routing failed summaries to a human review queue for correction before delivery.
4.5. A conforming system MUST ensure that the provenance linkage is tamper-evident per AG-006 — modifications to either the summary or the decision record after initial creation are detectable, and any modification that breaks the provenance chain triggers an alert.
4.6. A conforming system MUST retain provenance metadata (links, derivation paths, verification results) for at least as long as the decision summary and decision record themselves are retained, ensuring that provenance can be verified retroactively.
4.7. A conforming system MUST ensure that when a decision summary is produced for a legally significant context (adverse action notice, regulatory disclosure, rights-affecting decision), the provenance verification includes a completeness check confirming that all legally required information elements are present and accurately sourced.
4.8. A conforming system SHOULD implement provenance-aware summary generation — rather than generating a summary and then checking it against the record, the summary generation process should be constrained to draw only from verified decision record elements, making unsupported claims structurally impossible.
4.9. A conforming system SHOULD version decision summaries, retaining all versions with timestamps and derivation metadata, so that if a summary is corrected or updated, the original version and the correction are both preserved.
4.10. A conforming system MAY implement real-time provenance dashboards that display summary-to-record fidelity metrics across the agent portfolio, enabling trend detection and early warning of systematic abstraction errors.
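Requirements 4.1 and 4.3 can be illustrated with a minimal sketch. The `Claim` structure, `verify_summary` function, and journal schema below are hypothetical, assuming each claim carries an identifier of the journal element it is derived from and each journal element records whether it contributed to the outcome:

```python
# Illustrative sketch of 4.1/4.3: every summary claim carries a link to a
# source element in the decision journal, and automated verification
# classifies claims as supported, unsupported, or contradictory. All names
# here are assumptions for this example.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Claim:
    text: str
    source_id: Optional[str]  # id of the decision-journal element, if linked

def verify_summary(claims, journal: dict) -> dict:
    """Classify each claim against the decision journal."""
    report = {"supported": [], "unsupported": [], "contradictory": []}
    for claim in claims:
        source = journal.get(claim.source_id) if claim.source_id else None
        if source is None:
            report["unsupported"].append(claim.text)    # no source element (4.3)
        elif not source["contributed_to_outcome"]:
            report["contradictory"].append(claim.text)  # factor did not drive decision
        else:
            report["supported"].append(claim.text)
    return report

# Scenario A: the summary cites trading history, but that check passed;
# the journal shows the debt-to-income ratio drove the denial.
journal = {
    "dti": {"contributed_to_outcome": True},
    "trading_history": {"contributed_to_outcome": False},
}
claims = [Claim("Declined: insufficient trading history", "trading_history")]
print(verify_summary(claims, journal))
```

Under requirement 4.4, a non-empty `unsupported` or `contradictory` list would block delivery and route the summary to the human review queue.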
Decision summaries are the human interface to AI decision-making. In most deployments, no stakeholder — consumer, regulator, operator, or auditor — interacts directly with the decision record. They interact with summaries: natural-language explanations, denial letters, dashboard displays, audit excerpts. The summary is not merely a communication convenience; it is the legally operative artefact. When a consumer challenges a decision, they challenge the summary. When a regulator assesses compliance, they assess the summary against the decision. When an auditor verifies controls, they trace from the summary to the evidence. If the summary is unfaithful to the decision record, every downstream activity — challenge, assessment, verification — is corrupted.
Three categories of summary infidelity pose governance risks. The first is unsupported claims: the summary states a reason that does not appear in the decision record (Scenario A). This can occur through template selection errors, hallucination in natural-language generation, or legacy template text that references factors no longer used by the model. The second is contradictory claims: the summary states something that is contradicted by the decision record (Scenario C). This typically occurs through abstraction errors — incorrect directional language, misattributed comparisons, or inverted inequalities. The third is deliberate decoupling: the summary is intentionally disconnected from the decision record to present a more palatable or strategically advantageous explanation (Scenario B). All three categories produce the same outcome: stakeholders make decisions based on false information about why the agent acted as it did.
The legal and regulatory consequences of summary infidelity are severe. Under the EU AI Act, Article 86 establishes the right to explanation of individual decisions — if the explanation is unfaithful to the actual decision, the right has not been satisfied regardless of how well-written the explanation is. Under consumer protection law, a demonstrably false reason for an adverse action may constitute deceptive practice. Under administrative law, a government decision accompanied by incorrect reasons is vulnerable to judicial review on the grounds of irrationality or error of law. Under data protection law, a subject access request may reveal the discrepancy between the summary and the actual processing logic, creating evidence of non-compliance with transparency obligations.
The provenance requirement is not merely a documentation exercise. It is a structural control that constrains what summaries can say. A summary with a provenance requirement cannot claim factors that do not exist in the decision record, because the provenance verification will reject unsupported claims. A summary with provenance verification cannot invert causal directions without detection, because the directional relationship between the claim and its source will be checked. A summary with tamper-evident provenance cannot be silently altered after delivery to match a revised narrative, because modifications are detectable. Provenance is the mechanism that makes summaries trustworthy.
The relationship between AG-450 and AG-415 (Decision Journal Completeness) is essential. AG-415 ensures that the decision record is complete — all inputs, reasoning steps, and outputs are captured. AG-450 ensures that the summary faithfully represents that record. Neither alone is sufficient: a complete decision record with an unfaithful summary (AG-415 met, AG-450 not met) produces well-documented decisions that are poorly explained. A faithful summary of an incomplete record (AG-450 met, AG-415 not met) produces accurate summaries of inadequate documentation. Both dimensions must be satisfied together to produce complete records that are accurately communicated.
Decision Summary Provenance Governance requires that the summary generation pipeline is structurally linked to the decision record, with verification ensuring that every summary claim has a source and that no source-claim relationship is broken or fabricated. The core architectural principle is that summaries are derived views of the decision record, not independent compositions about the decision.
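The "derived view" principle is what requirement 4.8 calls provenance-aware generation: the renderer can only emit text for elements that exist in the record and actually drove the decision. A minimal sketch, assuming a hypothetical template registry and record-element schema:

```python
# Sketch of provenance-aware generation (4.8): the summary is rendered only
# from verified decision-record elements, making unsupported claims
# structurally impossible. TEMPLATES and the element schema are assumptions.
TEMPLATES = {
    "dti_exceeded": "declined because your debt-to-income ratio "
                    "({value}:1) exceeds the maximum of {threshold}:1",
}

def render_summary(record_elements) -> str:
    """Render sentences only for elements that contributed to the outcome
    and have a registered template; anything else cannot appear."""
    parts = []
    for el in record_elements:
        if el["contributed_to_outcome"] and el["id"] in TEMPLATES:
            parts.append(TEMPLATES[el["id"]].format(**el))
    return "Your application was " + "; ".join(parts) + "."

elements = [
    {"id": "dti_exceeded", "contributed_to_outcome": True,
     "value": 4.2, "threshold": 3.5},
    # A factor that passed cannot leak into the summary:
    {"id": "trading_history", "contributed_to_outcome": False},
]
print(render_summary(elements))
```

Because the renderer has no path from a non-contributing factor to output text, the Scenario A failure mode (the explanation module "selecting a plausible-sounding reason") is eliminated by construction rather than caught after the fact.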
Recommended patterns:
Anti-patterns to avoid:
Financial Services. Adverse action notices in lending (required under the Equal Credit Opportunity Act in the US, FCA MCOB in the UK, and equivalent regulations elsewhere) are among the highest-stakes decision summaries. A lending denial notice that cites incorrect factors can constitute unfair lending practice, trigger disparate impact claims if the cited factors correlate with protected characteristics differently than the actual factors, and undermine the regulatory purpose of adverse action notices (enabling consumers to improve their creditworthiness). Financial firms should implement claim-level provenance annotation for all adverse action notices and automated verification before issuance.
Public Sector. Government decision summaries carry particular weight because they may be the sole basis for judicial review. A benefits decision that states incorrect reasons can be overturned on judicial review as an "error of law" — the decision-maker took into account irrelevant considerations or failed to take into account relevant ones (even if the actual decision was sound, the stated reasons were wrong). Public-sector organisations should implement mandatory provenance verification for all rights-affecting decision summaries.
Insurance. Claims denial summaries must accurately reflect the actual denial reason. Under insurance contract law in most jurisdictions, an insurer that denies a claim for a stated reason and is later shown to have denied it for a different actual reason may face bad-faith claims (Scenario B). Provenance controls are essential for aligning denial summaries with actual decision factors.
Healthcare. Clinical decision summaries (e.g., treatment recommendations, diagnostic outputs) must accurately reflect the clinical evidence and reasoning chain. A summary that cites evidence not present in the clinical record, or that misrepresents the confidence level of a finding, can lead to inappropriate clinical decisions by the reviewing clinician.
Basic Implementation — Decision summaries are generated from templates with slots bound to decision record elements. Each summary carries a reference to its source decision journal entry per AG-415. Automated verification checks that template slot values match the referenced record elements. Summaries failing verification are routed to a human review queue. Provenance metadata (template used, slot bindings, verification result) is retained alongside each summary. This level meets the minimum mandatory requirements (4.1 through 4.7) and prevents the most common provenance failures.
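The slot-binding verification at this level can be sketched as follows; the binding-metadata format is an assumption for illustration:

```python
# Illustrative slot-binding check for the basic implementation level:
# each template slot records which journal element it was bound to and
# what value was rendered, and verification confirms the two still match.
def verify_slot_bindings(bindings: dict, journal: dict) -> list:
    """Return the slots whose rendered value no longer matches the
    decision-journal element they claim to be bound to."""
    mismatches = []
    for slot, binding in bindings.items():
        recorded = journal[binding["element_id"]]["value"]
        if binding["rendered_value"] != recorded:
            mismatches.append(slot)
    return mismatches

journal = {"income": {"value": 28_000}, "threshold": {"value": 26_500}}
bindings = {
    "income":    {"element_id": "income",    "rendered_value": 28_000},
    "threshold": {"element_id": "threshold", "rendered_value": 24_000},  # stale
}
print(verify_slot_bindings(bindings, journal))  # ['threshold'] -> route to review
```

Any non-empty mismatch list would route the summary to the human review queue per requirement 4.4 before delivery.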
Intermediate Implementation — All basic capabilities plus: claim-level provenance annotation links each factual claim in the summary to its specific source element in the decision record. Post-generation verification detects unsupported claims, contradictory claims, and material omissions — not just template slot mismatches. The provenance chain is tamper-evident per AG-006. Derivation path logging records the complete transformation pipeline from record to summary. Summaries are versioned, with all versions retained. Verification coverage metrics track the percentage of summaries that pass automated verification without human intervention.
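The versioning capability (requirement 4.9) can be sketched as an append-only history; the `SummaryHistory` class and its fields are hypothetical names for illustration:

```python
# Sketch of summary versioning (4.9): corrections append a new version
# rather than overwriting, so the original and the correction are both
# preserved with timestamps and derivation metadata.
import datetime

class SummaryHistory:
    def __init__(self):
        self.versions = []  # append-only; nothing is ever deleted

    def publish(self, text: str, derivation: str) -> None:
        self.versions.append({
            "version": len(self.versions) + 1,
            "text": text,
            "derivation": derivation,
            "timestamp": datetime.datetime.now(
                datetime.timezone.utc
            ).isoformat(),
        })

history = SummaryHistory()
history.publish("Reduced: income below threshold.", "template v3")  # original (erroneous)
history.publish("Reduced: income above threshold.", "template v4")  # correction
print(len(history.versions))  # 2 -> both versions retained
```

Retaining the erroneous original alongside the correction is what allows an auditor to reconstruct exactly what each affected individual was told and when — essential for the redress exercises described in Scenario C.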
Advanced Implementation — All intermediate capabilities plus: provenance-aware summary generation constrains the generation process to draw only from verified record elements, making unsupported claims structurally impossible. Real-time provenance dashboards display fidelity metrics across the portfolio, with trend detection and alerting for systematic abstraction errors. Natural-language generation, if used, is paired with semantic verification that checks not just factual accuracy but directional and causal accuracy of natural-language renderings. Independent provenance audit is conducted annually, sampling summaries and verifying end-to-end traceability from claim to source. Cross-agent provenance is maintained for multi-agent decisions where the summary must reflect reasoning contributions from multiple agents per AG-398.
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Claim-to-Source Traceability
Test 8.2: Unsupported Claim Detection
Test 8.3: Contradictory Claim Detection
Test 8.4: Tamper-Evidence Chain Integrity
Test 8.5: Derivation Path Auditability
Test 8.6: Failed Summary Routing and Human Review
Test 8.7: Legally Significant Summary Completeness Verification
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 13 (Transparency and Provision of Information) | Direct requirement |
| EU AI Act | Article 86 (Right to Explanation of Individual Decision-Making) | Direct requirement |
| SOX | Section 302/404 (Internal Controls and Certification) | Supports compliance |
| FCA SYSC | 6.1.1R (Systems and Controls) | Supports compliance |
| NIST AI RMF | MEASURE 2.5, MANAGE 4.2 | Supports compliance |
| ISO 42001 | Clause 8.4 (AI System Operation and Monitoring) | Supports compliance |
| DORA | Article 10 (Detection) | Supports compliance |
Article 13 requires that high-risk AI systems provide information sufficient to interpret the system's output. Article 86 gives affected persons the right to "clear and meaningful explanations" of individual decisions. Both provisions implicitly require that explanations are faithful to the actual decision logic — an explanation that misrepresents the decision reasoning does not enable interpretation (Article 13) and is not meaningful (Article 86). AG-450 ensures that decision summaries — the primary vehicle for satisfying both articles — are provenance-linked to the underlying decision record, preventing the delivery of explanations that are technically clear but factually disconnected from what actually happened. The requirement for automated verification directly supports Article 13's operational transparency by ensuring that the summary generation pipeline itself is reliable, not just the individual summaries.
Management certification under SOX Section 302 and auditor attestation under Section 404 require that controls over financial reporting are effective. When AI agents make financial decisions, the summaries of those decisions are part of the audit trail. If summaries do not faithfully represent the actual decision logic, the audit trail is corrupted — auditors may certify controls as effective based on summaries that do not reflect actual system behaviour. AG-450's provenance controls ensure that the audit trail from decision summary to decision record is verifiable, supporting the integrity of SOX attestations.
The FCA requires firms to maintain adequate systems and controls. A system that generates decision summaries disconnected from actual decision logic is not an adequate control — it is a communication system with no integrity assurance. The FCA's Senior Managers and Certification Regime (SM&CR) also requires that senior managers can accurately describe the decisions made under their oversight. If decision summaries are unfaithful, senior managers cannot fulfil this responsibility. AG-450 provides the structural control that ensures summary integrity.
MEASURE 2.5 addresses the accuracy and reliability of AI system outputs and their documentation. MANAGE 4.2 addresses the management of AI system risks through ongoing monitoring. Decision summaries are a form of output documentation, and their fidelity to the actual decision process is a reliability concern. AG-450's automated verification directly supports MEASURE 2.5 by ensuring that documentation (summaries) accurately reflects system outputs (decisions). The provenance monitoring capability supports MANAGE 4.2 by detecting systematic summary-record divergence as an ongoing risk.
DORA Article 10 requires financial entities to detect anomalous activities, including ICT-related incidents. Systematic divergence between decision summaries and decision records — whether caused by abstraction errors, template failures, or deliberate manipulation — is an ICT-related anomaly that should be detected by monitoring systems. AG-450's automated verification and provenance monitoring directly support DORA's detection requirements by identifying summary infidelity as a detectable and monitorable condition.
ISO 42001 requires that AI systems are operated and monitored in accordance with the AI management system. Decision summary generation is an operational process that must be monitored for quality and reliability. AG-450 provides the monitoring framework — provenance verification, derivation path logging, and fidelity metrics — that enables ISO 42001-compliant monitoring of the summary generation process.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Every decision summary the agent produces — potentially thousands per day — with disproportionate impact on legally significant summaries (adverse action notices, regulatory disclosures, rights-affecting decisions) where summary infidelity carries direct legal liability |
Consequence chain: Summary provenance failure begins as a technical gap — summaries are generated without structural linkage to the decision record — and manifests as factual infidelity in the summaries themselves. The first-order consequence is that stakeholders receive explanations that do not accurately represent the agent's actual reasoning. For consumers, this means exercising rights (challenge, appeal, complaint) based on incorrect information about why a decision was made (Scenario A: applicant told "insufficient trading history" when the actual reason was debt-to-income ratio). For regulators, this means conducting oversight based on a false picture of how the agent operates. For operators, this means diagnosing system behaviour based on summaries that may not reflect actual system state. The second-order consequence is legal liability: provably false explanations constitute deceptive practice under consumer protection law, bad-faith denial under insurance law (Scenario B: £340,000 claim), and error of law under administrative law (Scenario C: judicial review vulnerability). The third-order consequence is systemic: when summary infidelity is discovered, the credibility of all summaries is destroyed — not just the specific summaries found to be inaccurate. Regulators, courts, and consumers can no longer trust any explanation the organisation provides, requiring retrospective verification of the entire summary corpus (Scenario A: 8,400 summaries reviewed at £410,000). The fourth-order consequence is governance failure: if summaries cannot be trusted to represent actual decisions, the entire explanation infrastructure — including AG-449 audience-specific explanations, AG-452 counterfactual explanations, and AG-453 adverse action notices — is undermined, because all of these depend on the fidelity of the underlying summary to the decision record.
Cross-references: AG-006 (Tamper-Evident Record Integrity), AG-415 (Decision Journal Completeness Governance), AG-449 (Audience-Specific Explanation Governance), AG-452 (Counterfactual Explanation Governance), AG-453 (Adverse Action Notice Governance), AG-416 (Evidentiary Chain-of-Custody Governance), AG-036 (Reasoning Integrity Governance), AG-398 (Cross-Agent Blame Attribution Governance).