Decision Summary Provenance Governance requires that every human-readable summary of an AI agent's decision — whether produced for consumers, regulators, operators, or auditors — maintains a verifiable, tamper-evident link to the underlying evidence, reasoning chain, and decision record from which it was derived. Summaries are the primary artefact through which stakeholders understand, challenge, and oversee AI decisions, yet summaries are inherently lossy — they reduce complex multi-factor reasoning into digestible narratives. Without provenance controls, summaries can diverge from the actual decision logic through abstraction errors, post-hoc rationalisation, hallucinated justifications, or deliberate manipulation. This dimension mandates that every claim in a decision summary is traceable to a specific element in the underlying decision journal, that the derivation path from evidence to summary is recorded and auditable, and that any divergence between the summary and the underlying record is detectable through automated verification.
Scenario A — Summary Cites Non-Existent Evidence: A lending agent denies a small-business loan application for £175,000. The decision summary provided to the applicant states: "Your application was declined due to insufficient trading history (18 months versus the required minimum of 24 months)." The applicant challenges the decision, noting that their business has been trading for 37 months. An internal review reveals that the agent's actual decision was based on a high debt-to-income ratio (4.2:1 versus a maximum of 3.5:1) and a sector risk score. The "insufficient trading history" statement in the summary was generated by the explanation module using a template that selected the wrong explanatory factor — the trading history check passed, but the template incorrectly attributed the denial to it. The summary had no provenance link to the decision journal; it was generated by a separate natural-language module that interpreted the denial signal and selected a plausible-sounding reason. The applicant files a formal complaint. The regulator investigates and finds that 12% of denial summaries across the portfolio contain at least one factual claim not supported by the underlying decision record. Remediation requires manual review of 8,400 denial summaries at a cost of £410,000 and triggers a requirement to implement provenance controls.
What went wrong: The summary was generated independently from the decision record. No provenance link connected the summary's claims to the actual decision factors. The explanation module could select any plausible factor, not just the actual contributing factors. Consequence: £410,000 remediation, regulatory enforcement, 8,400 summary reviews, loss of applicant trust, potential discrimination claims if incorrect factors disproportionately affected protected groups.
Scenario B — Post-Hoc Rationalisation Masks Actual Reasoning: An insurance claims agent denies a £92,000 property damage claim. The decision journal records that the denial was driven by an anomaly detection model that flagged the claim as potentially fraudulent (fraud probability 0.73, threshold 0.65). The decision summary provided to the claimant states: "Your claim has been declined because the damage described is inconsistent with the policy coverage for the reported incident type." This is a post-hoc rationalisation — the actual reason (fraud suspicion) was replaced with a more neutral-sounding policy coverage explanation. The rationalisation was implemented deliberately by the product team, who decided that fraud-flagged claims should receive a generic policy-coverage explanation to avoid tipping off potential fraudsters. However, the summary is factually inaccurate: the damage is consistent with the policy coverage, and the actual reason for denial is fraud suspicion. The claimant hires a loss adjuster who demonstrates that the claim is covered. The insurer cannot explain the denial without revealing the fraud model, but the summary they provided is verifiably false. The claimant's solicitor obtains the decision journal through a subject access request and demonstrates the discrepancy between the summary and the actual reason. The insurer faces a £340,000 claim for bad-faith denial plus regulatory investigation for providing misleading reasons.
What went wrong: The summary was deliberately decoupled from the decision record. The organisation replaced the actual reason with a fabricated one. While the motivation (not tipping off fraudsters) had some operational logic, the result was a provably false summary. A provenance control would not have permitted this — a provenance check would have flagged that the summary's stated reason (policy coverage inconsistency) did not correspond to any factor in the decision journal. Consequence: £340,000 bad-faith claim, regulatory investigation, systemic review of all fraud-flagged denial summaries.
Scenario C — Abstraction Error Inverts Causal Direction: A benefits eligibility agent determines that a citizen qualifies for a reduced housing benefit rate. The decision record shows: the citizen's income increased from £22,000 to £28,000 (crossing the £26,500 threshold for the full rate), causing a reduction from full benefit (£450/month) to partial benefit (£280/month). The summary states: "Your housing benefit has been reduced because your income is below the threshold for full support." The summary inverted the causal direction — income went above the threshold, not below it. The abstraction logic that converted numerical comparisons into natural language mishandled the direction of the inequality. The citizen, reading the summary, believes that earning more money would restore full benefit — the opposite of reality. They take on additional work, their income rises further, and their benefit is reduced again. They file a complaint arguing that the original explanation was misleading and caused them financial harm. An investigation finds that the abstraction error affected 2,300 summaries over 6 months. The local authority must reissue corrected summaries and faces potential liability for decisions made by citizens relying on incorrect explanations. Total remediation cost: £186,000 including reissuance, complaint handling, and financial redress for citizens who demonstrably acted on the incorrect explanation.
What went wrong: The summary was generated by an abstraction layer that converted decision record entries into natural language, but no verification confirmed that the natural-language summary faithfully represented the underlying numerical reasoning. A provenance check that validated the directional relationship between the summary claim ("income is below the threshold") and the decision record entry ("income = £28,000 > threshold = £26,500") would have caught the inversion immediately. Consequence: £186,000 remediation, 2,300 corrected summaries, citizen financial harm from acting on incorrect explanations.
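The directional check described above can be sketched in a few lines. This is an illustrative sketch only — the `RecordEntry` structure and `verify_direction` function are hypothetical names, assuming the decision journal exposes each numerical comparison as a value/threshold pair:

```python
# Sketch of a directional-relationship check. RecordEntry and
# verify_direction are illustrative names, not from any standard library;
# the record format is an assumption for this example.
from dataclasses import dataclass

@dataclass
class RecordEntry:
    factor: str        # e.g. "income"
    value: float       # observed value from the decision journal
    threshold: float   # threshold applied by the decision logic

def verify_direction(entry: RecordEntry, claimed_relation: str) -> bool:
    """Check that the summary's claimed relation ('above'/'below')
    matches the numerical comparison recorded in the decision journal."""
    actual = "above" if entry.value > entry.threshold else "below"
    return actual == claimed_relation

# Scenario C: the record shows income = 28,000 > threshold = 26,500,
# but the summary claimed the income was "below" the threshold.
entry = RecordEntry(factor="income", value=28_000, threshold=26_500)
print(verify_direction(entry, "below"))  # False -> summary fails verification
```

A check of this shape, run before delivery, turns the Scenario C inversion from a six-month latent defect into an immediate verification failure.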
Scope: This dimension applies to every AI agent deployment that produces decision summaries — human-readable descriptions of the agent's decision, reasoning, or action — for any audience. A decision summary is any artefact that communicates the agent's decision in natural language or structured human-readable format, including: denial letters, approval notifications, explanation responses to affected individuals, regulatory disclosures, operator dashboards showing decision reasons, audit reports referencing agent decisions, and any communication that attributes a decision to specific reasons or factors. The scope includes summaries generated automatically by the agent, summaries generated by a separate explanation module that interprets the agent's decision, and summaries composed by human operators using information derived from the agent's decision record. If any human-readable representation of the decision exists, this dimension governs its provenance linkage to the underlying evidence and logic. The scope does not govern the decision itself (that is governed by AG-036 and AG-415) — it governs the fidelity of the summary to the decision record.
4.1. A conforming system MUST maintain a verifiable provenance link between every factual claim in a decision summary and its source element in the underlying decision journal (per AG-415), such that no claim in the summary exists without a corresponding, retrievable source in the record.
4.2. A conforming system MUST record the derivation path for each decision summary — the sequence of transformations (selection, abstraction, templating, natural-language generation) that converted the decision record into the summary — in sufficient detail to allow an auditor to understand why the summary says what it says.
4.3. A conforming system MUST implement automated verification that compares each decision summary against its source decision record, detecting: claims not supported by any element in the record (unsupported claims), claims that contradict elements in the record (contradictory claims), and material omissions where a significant decision factor is absent from the summary without justification.
4.4. A conforming system MUST prevent the delivery of any decision summary that fails automated provenance verification, routing failed summaries to a human review queue for correction before delivery.
4.5. A conforming system MUST ensure that the provenance linkage is tamper-evident per AG-006 — modifications to either the summary or the decision record after initial creation are detectable, and any modification that breaks the provenance chain triggers an alert.
4.6. A conforming system MUST retain provenance metadata (links, derivation paths, verification results) for at least as long as the decision summary and decision record themselves are retained, ensuring that provenance can be verified retroactively.
4.7. A conforming system MUST ensure that when a decision summary is produced for a legally significant context (adverse action notice, regulatory disclosure, rights-affecting decision), the provenance verification includes a completeness check confirming that all legally required information elements are present and accurately sourced.
4.8. A conforming system SHOULD implement provenance-aware summary generation — rather than generating a summary and then checking it against the record, the summary generation process should be constrained to draw only from verified decision record elements, making unsupported claims structurally impossible.
4.9. A conforming system SHOULD version decision summaries, retaining all versions with timestamps and derivation metadata, so that if a summary is corrected or updated, the original version and the correction are both preserved.
4.10. A conforming system MAY implement real-time provenance dashboards that display summary-to-record fidelity metrics across the agent portfolio, enabling trend detection and early warning of systematic abstraction errors.
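Requirements 4.1 and 4.3 can be illustrated with a minimal sketch. The `Claim` structure, `verify_summary` function, and journal schema below are hypothetical, assuming each claim carries an identifier of the journal element it is derived from and each journal element records whether it contributed to the outcome:

```python
# Illustrative sketch of 4.1/4.3: every summary claim carries a link to a
# source element in the decision journal, and automated verification
# classifies claims as supported, unsupported, or contradictory. All names
# here are assumptions for this example.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Claim:
    text: str
    source_id: Optional[str]  # id of the decision-journal element, if linked

def verify_summary(claims, journal: dict) -> dict:
    """Classify each claim against the decision journal."""
    report = {"supported": [], "unsupported": [], "contradictory": []}
    for claim in claims:
        source = journal.get(claim.source_id) if claim.source_id else None
        if source is None:
            report["unsupported"].append(claim.text)    # no source element (4.3)
        elif not source["contributed_to_outcome"]:
            report["contradictory"].append(claim.text)  # factor did not drive decision
        else:
            report["supported"].append(claim.text)
    return report

# Scenario A: the summary cites trading history, but that check passed;
# the journal shows the debt-to-income ratio drove the denial.
journal = {
    "dti": {"contributed_to_outcome": True},
    "trading_history": {"contributed_to_outcome": False},
}
claims = [Claim("Declined: insufficient trading history", "trading_history")]
print(verify_summary(claims, journal))
```

Under requirement 4.4, a non-empty `unsupported` or `contradictory` list would block delivery and route the summary to the human review queue.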
Decision summaries are the human interface to AI decision-making. In most deployments, no stakeholder — consumer, regulator, operator, or auditor — interacts directly with the decision record. They interact with summaries: natural-language explanations, denial letters, dashboard displays, audit excerpts. The summary is not merely a communication convenience; it is the legally operative artefact. When a consumer challenges a decision, they challenge the summary. When a regulator assesses compliance, they assess the summary against the decision. When an auditor verifies controls, they trace from the summary to the evidence. If the summary is unfaithful to the decision record, every downstream activity — challenge, assessment, verification — is corrupted.
Three categories of summary infidelity pose governance risks. The first is unsupported claims: the summary states a reason that does not appear in the decision record (Scenario A). This can occur through template selection errors, hallucination in natural-language generation, or legacy template text that references factors no longer used by the model. The second is contradictory claims: the summary states something that is contradicted by the decision record (Scenario C). This typically occurs through abstraction errors — incorrect directional language, misattributed comparisons, or inverted inequalities. The third is deliberate decoupling: the summary is intentionally disconnected from the decision record to present a more palatable or strategically advantageous explanation (Scenario B). All three categories produce the same outcome: stakeholders make decisions based on false information about why the agent acted as it did.
The legal and regulatory consequences of summary infidelity are severe. Under the EU AI Act, Article 86 establishes the right to explanation of individual decisions — if the explanation is unfaithful to the actual decision, the right has not been satisfied regardless of how well-written the explanation is. Under consumer protection law, a demonstrably false reason for an adverse action may constitute deceptive practice. Under administrative law, a government decision accompanied by incorrect reasons is vulnerable to judicial review on the grounds of irrationality or error of law. Under data protection law, a subject access request may reveal the discrepancy between the summary and the actual processing logic, creating evidence of non-compliance with transparency obligations.
The provenance requirement is not merely a documentation exercise. It is a structural control that constrains what summaries can say. A summary with a provenance requirement cannot claim factors that do not exist in the decision record, because the provenance verification will reject unsupported claims. A summary with provenance verification cannot invert causal directions without detection, because the directional relationship between the claim and its source will be checked. A summary with tamper-evident provenance cannot be silently altered after delivery to match a revised narrative, because modifications are detectable. Provenance is the mechanism that makes summaries trustworthy.
The relationship between AG-450 and AG-415 (Decision Journal Completeness) is essential. AG-415 ensures that the decision record is complete — all inputs, reasoning steps, and outputs are captured. AG-450 ensures that the summary faithfully represents that record. Neither alone is sufficient: a complete decision record with an unfaithful summary (AG-415 met, AG-450 not met) produces well-documented decisions that are poorly explained. A faithful summary of an incomplete record (AG-450 met, AG-415 not met) produces accurate summaries of inadequate documentation. Both dimensions must be satisfied together to produce complete records that are accurately communicated.
Decision Summary Provenance Governance requires that the summary generation pipeline is structurally linked to the decision record, with verification ensuring that every summary claim has a source and that no source-claim relationship is broken or fabricated. The core architectural principle is that summaries are derived views of the decision record, not independent compositions about the decision.
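The "derived view" principle is what requirement 4.8 calls provenance-aware generation: the renderer can only emit text for elements that exist in the record and actually drove the decision. A minimal sketch, assuming a hypothetical template registry and record-element schema:

```python
# Sketch of provenance-aware generation (4.8): the summary is rendered only
# from verified decision-record elements, making unsupported claims
# structurally impossible. TEMPLATES and the element schema are assumptions.
TEMPLATES = {
    "dti_exceeded": "declined because your debt-to-income ratio "
                    "({value}:1) exceeds the maximum of {threshold}:1",
}

def render_summary(record_elements) -> str:
    """Render sentences only for elements that contributed to the outcome
    and have a registered template; anything else cannot appear."""
    parts = []
    for el in record_elements:
        if el["contributed_to_outcome"] and el["id"] in TEMPLATES:
            parts.append(TEMPLATES[el["id"]].format(**el))
    return "Your application was " + "; ".join(parts) + "."

elements = [
    {"id": "dti_exceeded", "contributed_to_outcome": True,
     "value": 4.2, "threshold": 3.5},
    # A factor that passed cannot leak into the summary:
    {"id": "trading_history", "contributed_to_outcome": False},
]
print(render_summary(elements))
```

Because the renderer has no path from a non-contributing factor to output text, the Scenario A failure mode (the explanation module "selecting a plausible-sounding reason") is eliminated by construction rather than caught after the fact.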
Recommended patterns:
Anti-patterns to avoid:
Financial Services. Adverse action notices in lending (required under the Equal Credit Opportunity Act in the US, FCA MCOB in the UK, and equivalent regulations elsewhere) are among the highest-stakes decision summaries. A lending denial notice that cites incorrect factors can constitute unfair lending practice, trigger disparate impact claims if the cited factors correlate with protected characteristics differently than the actual factors, and undermine the regulatory purpose of adverse action notices (enabling consumers to improve their creditworthiness). Financial firms should implement claim-level provenance annotation for all adverse action notices and automated verification before issuance.
Public Sector. Government decision summaries carry particular weight because they may be the sole basis for judicial review. A benefits decision that states incorrect reasons can be overturned on judicial review as an "error of law" — the decision-maker took into account irrelevant considerations or failed to take into account relevant ones (even if the actual decision was sound, the stated reasons were wrong). Public-sector organisations should implement mandatory provenance verification for all rights-affecting decision summaries.
Insurance. Claims denial summaries must accurately reflect the actual denial reason. Under insurance contract law in most jurisdictions, an insurer that denies a claim for a stated reason and is later shown to have denied it for a different actual reason may face bad-faith claims (Scenario B). Provenance controls are essential for aligning denial summaries with actual decision factors.
Healthcare. Clinical decision summaries (e.g., treatment recommendations, diagnostic outputs) must accurately reflect the clinical evidence and reasoning chain. A summary that cites evidence not present in the clinical record, or that misrepresents the confidence level of a finding, can lead to inappropriate clinical decisions by the reviewing clinician.
Basic Implementation — Decision summaries are generated from templates with slots bound to decision record elements. Each summary carries a reference to its source decision journal entry per AG-415. Automated verification checks that template slot values match the referenced record elements. Summaries failing verification are routed to a human review queue. Provenance metadata (template used, slot bindings, verification result) is retained alongside each summary. This level meets the minimum mandatory requirements (4.1 through 4.7) and prevents the most common provenance failures.
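The slot-binding verification at this level can be sketched as follows; the binding-metadata format is an assumption for illustration:

```python
# Illustrative slot-binding check for the basic implementation level:
# each template slot records which journal element it was bound to and
# what value was rendered, and verification confirms the two still match.
def verify_slot_bindings(bindings: dict, journal: dict) -> list:
    """Return the slots whose rendered value no longer matches the
    decision-journal element they claim to be bound to."""
    mismatches = []
    for slot, binding in bindings.items():
        recorded = journal[binding["element_id"]]["value"]
        if binding["rendered_value"] != recorded:
            mismatches.append(slot)
    return mismatches

journal = {"income": {"value": 28_000}, "threshold": {"value": 26_500}}
bindings = {
    "income":    {"element_id": "income",    "rendered_value": 28_000},
    "threshold": {"element_id": "threshold", "rendered_value": 24_000},  # stale
}
print(verify_slot_bindings(bindings, journal))  # ['threshold'] -> route to review
```

Any non-empty mismatch list would route the summary to the human review queue per requirement 4.4 before delivery.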
Intermediate Implementation — All basic capabilities plus: claim-level provenance annotation links each factual claim in the summary to its specific source element in the decision record. Post-generation verification detects unsupported claims, contradictory claims, and material omissions — not just template slot mismatches. The provenance chain is tamper-evident per AG-006. Derivation path logging records the complete transformation pipeline from record to summary. Summaries are versioned, with all versions retained. Verification coverage metrics track the percentage of summaries that pass automated verification without human intervention.
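The versioning capability (requirement 4.9) can be sketched as an append-only history; the `SummaryHistory` class and its fields are hypothetical names for illustration:

```python
# Sketch of summary versioning (4.9): corrections append a new version
# rather than overwriting, so the original and the correction are both
# preserved with timestamps and derivation metadata.
import datetime

class SummaryHistory:
    def __init__(self):
        self.versions = []  # append-only; nothing is ever deleted

    def publish(self, text: str, derivation: str) -> None:
        self.versions.append({
            "version": len(self.versions) + 1,
            "text": text,
            "derivation": derivation,
            "timestamp": datetime.datetime.now(
                datetime.timezone.utc
            ).isoformat(),
        })

history = SummaryHistory()
history.publish("Reduced: income below threshold.", "template v3")  # original (erroneous)
history.publish("Reduced: income above threshold.", "template v4")  # correction
print(len(history.versions))  # 2 -> both versions retained
```

Retaining the erroneous original alongside the correction is what allows an auditor to reconstruct exactly what each affected individual was told and when — essential for the redress exercises described in Scenario C.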
Advanced Implementation — All intermediate capabilities plus: provenance-aware summary generation constrains the generation process to draw only from verified record elements, making unsupported claims structurally impossible. Real-time provenance dashboards display fidelity metrics across the portfolio, with trend detection and alerting for systematic abstraction errors. Natural-language generation, if used, is paired with semantic verification that checks not just factual accuracy but directional and causal accuracy of natural-language renderings. Independent provenance audit is conducted annually, sampling summaries and verifying end-to-end traceability from claim to source. Cross-agent provenance is maintained for multi-agent decisions where the summary must reflect reasoning contributions from multiple agents per AG-398.
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Claim-to-Source Traceability
Test 8.2: Unsupported Claim Detection
Test 8.3: Contradictory Claim Detection
Test 8.4: Tamper-Evidence Chain Integrity
Test 8.5: Derivation Path Auditability
Test 8.6: Failed Summary Routing and Human Review
Test 8.7: Legally Significant Summary Completeness Verification
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 13 (Transparency and Provision of Information) | Direct requirement |
| EU AI Act | Article 86 (Right to Explanation of Individual Decision-Making) | Direct requirement |
| SOX | Section 302/404 (Internal Controls and Certification) | Supports compliance |
| FCA SYSC | 6.1.1R (Systems and Controls) | Supports compliance |
| NIST AI RMF | MEASURE 2.5, MANAGE 4.2 | Supports compliance |
| ISO 42001 | Clause 8.4 (AI System Operation and Monitoring) | Supports compliance |
| DORA | Article 10 (Detection) | Supports compliance |
Article 13 requires that high-risk AI systems provide information sufficient to interpret the system's output. Article 86 gives affected persons the right to "clear and meaningful explanations" of individual decisions. Both provisions implicitly require that explanations are faithful to the actual decision logic — an explanation that misrepresents the decision reasoning does not enable interpretation (Article 13) and is not meaningful (Article 86). AG-450 ensures that decision summaries — the primary vehicle for satisfying both articles — are provenance-linked to the underlying decision record, preventing the delivery of explanations that are technically clear but factually disconnected from what actually happened. The requirement for automated verification directly supports Article 13's operational transparency by ensuring that the summary generation pipeline itself is reliable, not just the individual summaries.
Management certification under SOX Section 302 and auditor attestation under Section 404 require that controls over financial reporting are effective. When AI agents make financial decisions, the summaries of those decisions are part of the audit trail. If summaries do not faithfully represent the actual decision logic, the audit trail is corrupted — auditors may certify controls as effective based on summaries that do not reflect actual system behaviour. AG-450's provenance controls ensure that the audit trail from decision summary to decision record is verifiable, supporting the integrity of SOX attestations.
The FCA requires firms to maintain adequate systems and controls. A system that generates decision summaries disconnected from actual decision logic is not an adequate control — it is a communication system with no integrity assurance. The FCA's Senior Managers and Certification Regime (SM&CR) also requires that senior managers can accurately describe the decisions made under their oversight. If decision summaries are unfaithful, senior managers cannot fulfil this responsibility. AG-450 provides the structural control that ensures summary integrity.
MEASURE 2.5 addresses the accuracy and reliability of AI system outputs and their documentation. MANAGE 4.2 addresses the management of AI system risks through ongoing monitoring. Decision summaries are a form of output documentation, and their fidelity to the actual decision process is a reliability concern. AG-450's automated verification directly supports MEASURE 2.5 by ensuring that documentation (summaries) accurately reflects system outputs (decisions). The provenance monitoring capability supports MANAGE 4.2 by detecting systematic summary-record divergence as an ongoing risk.
DORA Article 10 requires financial entities to detect anomalous activities, including ICT-related incidents. Systematic divergence between decision summaries and decision records — whether caused by abstraction errors, template failures, or deliberate manipulation — is an ICT-related anomaly that should be detected by monitoring systems. AG-450's automated verification and provenance monitoring directly support DORA's detection requirements by identifying summary infidelity as a detectable and monitorable condition.
ISO 42001 requires that AI systems are operated and monitored in accordance with the AI management system. Decision summary generation is an operational process that must be monitored for quality and reliability. AG-450 provides the monitoring framework — provenance verification, derivation path logging, and fidelity metrics — that enables ISO 42001-compliant monitoring of the summary generation process.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Every decision summary the agent produces — potentially thousands per day — with disproportionate impact on legally significant summaries (adverse action notices, regulatory disclosures, rights-affecting decisions) where summary infidelity carries direct legal liability |
Consequence chain: Summary provenance failure begins as a technical gap — summaries are generated without structural linkage to the decision record — and manifests as factual infidelity in the summaries themselves. The first-order consequence is that stakeholders receive explanations that do not accurately represent the agent's actual reasoning. For consumers, this means exercising rights (challenge, appeal, complaint) based on incorrect information about why a decision was made (Scenario A: applicant told "insufficient trading history" when the actual reason was debt-to-income ratio). For regulators, this means conducting oversight based on a false picture of how the agent operates. For operators, this means diagnosing system behaviour based on summaries that may not reflect actual system state. The second-order consequence is legal liability: provably false explanations constitute deceptive practice under consumer protection law, bad-faith denial under insurance law (Scenario B: £340,000 claim), and error of law under administrative law (Scenario C: judicial review vulnerability). The third-order consequence is systemic: when summary infidelity is discovered, the credibility of all summaries is destroyed — not just the specific summaries found to be inaccurate. Regulators, courts, and consumers can no longer trust any explanation the organisation provides, requiring retrospective verification of the entire summary corpus (Scenario A: 8,400 summaries reviewed at £410,000). The fourth-order consequence is governance failure: if summaries cannot be trusted to represent actual decisions, the entire explanation infrastructure — including AG-449 audience-specific explanations, AG-452 counterfactual explanations, and AG-453 adverse action notices — is undermined, because all of these depend on the fidelity of the underlying summary to the decision record.
Cross-references: AG-006 (Tamper-Evident Record Integrity), AG-415 (Decision Journal Completeness Governance), AG-449 (Audience-Specific Explanation Governance), AG-452 (Counterfactual Explanation Governance), AG-453 (Adverse Action Notice Governance), AG-416 (Evidentiary Chain-of-Custody Governance), AG-036 (Reasoning Integrity Governance), AG-398 (Cross-Agent Blame Attribution Governance).