AG-318

Data Correction Backpropagation Governance

Data Classification, Quality & Lineage · AGS v2.1 · April 2026
EU AI Act GDPR FCA NIST ISO 42001

2. Summary

Data Correction Backpropagation Governance requires that when source data is corrected, the correction is systematically propagated through all cached copies, derived values, and downstream decision artefacts that were produced from the incorrect data. A data correction that updates only the source record but leaves stale copies, cached values, derived metrics, and historical agent decisions unremediated is an incomplete correction — the organisation has fixed the source but the consequences of the error persist in every downstream artefact. Without backpropagation governance, data corrections create a false sense of remediation: the source is correct, but the operational data landscape still reflects the error.

3. Example

Scenario A — Corrected Price Not Propagated to Cached Valuations: A portfolio management agent values client portfolios using end-of-day equity prices. On 15 March, the pricing feed reports Company X's closing price as £42.30. This price is consumed by the agent, cached in the valuation engine, and used to value 1,200 portfolios containing Company X. On 16 March, the pricing feed provider issues a correction: Company X's actual closing price on 15 March was £38.70 (the initial feed included an erroneous trade). The correction updates the authoritative price in the source system. However, no backpropagation mechanism exists. The cached price of £42.30 remains in the valuation engine. The 1,200 portfolio valuations from 15 March remain incorrect. Client statements already generated reflect the incorrect valuations. The risk management system's exposure calculations from 15 March remain based on £42.30. Over the next week, 3 portfolio rebalancing decisions are made by the agent using the stale cached price before the cache naturally refreshes. Total valuation error across affected portfolios: £4.3 million. Client complaints after corrected statements are issued: 47. Regulatory notification required under FCA SUP 15.

What went wrong: The source correction was applied to the authoritative system but not propagated to: (1) the cached price in the valuation engine, (2) the historical valuations produced from the incorrect price, (3) the risk calculations derived from those valuations, or (4) the client statements already distributed.

Scenario B — Corrected Customer Data Not Propagated to Feature Store: A customer churn prediction agent consumes features from a feature store that is populated from the CRM. Customer C-7783's annual_revenue field is corrected in the CRM from £2.3 million to £230,000 (a decimal point error that persisted for 3 months). The CRM correction executes successfully. However, the feature store, which refreshes features weekly, does not receive the correction until the next weekly refresh — 5 days later. During those 5 days, the churn prediction agent continues to predict based on the £2.3 million revenue figure. The customer is classified as "high value — low churn risk" and assigned to a premium service tier. After the feature store refreshes, the customer is reclassified as "moderate value — medium churn risk." The service tier reassignment generates a customer complaint. Investigation reveals 2,340 features in the feature store derived from the incorrect revenue figure, each used in multiple predictions across 3 models. Total affected predictions: approximately 7,000 over 3 months. The remediation effort to identify and reassess all affected predictions costs £45,000.

What went wrong: The correction was applied at the source (CRM) but did not trigger an immediate refresh of the feature store or invalidation of derived features. The 5-day propagation delay allowed incorrect predictions to continue. The 3-month backlog of affected predictions was not identified because no forward provenance query (AG-317) existed to determine which predictions consumed the incorrect feature.

Scenario C — Corrected Regulatory Data Not Propagated to Agent Decisions: A compliance agent screens customer onboarding applications against a politically exposed persons (PEP) database. An individual is incorrectly classified as a PEP due to a name-matching error. The PEP database provider issues a correction removing the false PEP flag. The organisation's compliance system receives the correction and updates its local PEP database. However, 4 customer onboarding applications that were declined based on the false PEP flag are not automatically re-evaluated. The 4 applicants remain declined. One applicant files a complaint, escalating to the Financial Ombudsman. The organisation discovers it has no mechanism to identify decisions affected by the corrected data and no process to re-evaluate them. Manual investigation identifies the 4 affected applications at a cost of £8,500 in compliance analyst time. The Ombudsman awards £3,000 in compensation to the complainant and notes the absence of systematic correction propagation.

What went wrong: The source correction was applied but the downstream decisions made on the incorrect data were not identified or remediated. The organisation had no mechanism to propagate the correction from the data layer to the decision layer.

4. Requirement Statement

Scope: This dimension applies to all AI agent systems where source data corrections can occur after the data has been consumed by agents for decisions. The scope covers: corrections to authoritative source data (AG-309), corrections to reference data (sanctions lists, PEP databases, regulatory registers), corrections to market data (price corrections, trade cancellations), corrections to customer data (address corrections, identity verification updates), and retractions of previously published data. The scope extends to all downstream artefacts produced from the incorrect data: cached copies, derived values (AG-317), model predictions, agent decisions, generated reports, and communications sent to customers or counterparties. The scope includes corrections that originate internally (data quality issue discovered) and externally (data provider issues a correction).

4.1. A conforming system MUST implement a correction propagation mechanism that, when source data is corrected, identifies all cached copies, derived values, and decision artefacts that were produced from the incorrect data.

4.2. A conforming system MUST invalidate or update all cached copies of corrected data within a defined propagation window, ensuring agents do not continue operating on the pre-correction value.
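The cache-invalidation obligation in 4.2 can be made concrete with a small sketch. This is an illustrative assumption, not part of the protocol: the `CorrectionEvent` and `PriceCache` names, and the idea of subscribing the cache to correction events rather than waiting for its scheduled refresh (the failure mode in Scenario A), are the author's hypothetical framing.

```python
# Hypothetical sketch of requirement 4.2: event-driven cache invalidation.
# CorrectionEvent and PriceCache are illustrative names, not protocol terms.
import time
from dataclasses import dataclass


@dataclass
class CorrectionEvent:
    key: str                # e.g. "price:COMPANY_X:2026-03-15" (assumed key scheme)
    corrected_value: float
    received_at: float


class PriceCache:
    """A cache that subscribes to correction events instead of waiting
    for its scheduled refresh -- the gap that Scenario A illustrates."""

    def __init__(self):
        self._store = {}

    def put(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store.get(key)

    def on_correction(self, event: CorrectionEvent):
        # Invalidate-and-replace immediately, so agents never read the
        # pre-correction value during the propagation window.
        self._store[event.key] = event.corrected_value


cache = PriceCache()
cache.put("price:COMPANY_X:2026-03-15", 42.30)   # erroneous feed value
cache.on_correction(
    CorrectionEvent("price:COMPANY_X:2026-03-15", 38.70, time.time())
)
```

In this sketch the corrected value overwrites the stale entry the moment the event arrives; a real system would also record the event for the propagation log required by 4.4.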

4.3. A conforming system MUST flag all derived values that were computed from corrected source data, marking them as potentially affected and requiring re-derivation or human assessment.

4.4. A conforming system MUST log every correction propagation event, including: the source correction, the affected downstream artefacts identified, the propagation action taken for each artefact, and the completion timestamp.

4.5. A conforming system MUST maintain a correction register that records all data corrections received, their propagation status (complete, in progress, pending), and any downstream artefacts that could not be automatically corrected (requiring manual intervention).
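One minimal shape for the correction register in 4.5 is sketched below. All class and field names are assumptions; the only constraint taken from the requirement is that the register tracks status (complete, in progress, pending) and any artefacts needing manual intervention.

```python
# Hypothetical sketch of the correction register required by 4.5.
# Class names, field names, and the status model are illustrative assumptions.
from dataclasses import dataclass, field
from enum import Enum


class PropagationStatus(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in progress"
    COMPLETE = "complete"


@dataclass
class CorrectionRecord:
    correction_id: str
    source_system: str
    status: PropagationStatus = PropagationStatus.PENDING
    manual_followups: list = field(default_factory=list)  # artefacts needing human action


class CorrectionRegister:
    def __init__(self):
        self._records = {}

    def register(self, rec: CorrectionRecord):
        self._records[rec.correction_id] = rec

    def mark_complete(self, correction_id, manual_followups=()):
        rec = self._records[correction_id]
        rec.manual_followups = list(manual_followups)
        # A correction with outstanding manual work is not "complete":
        # this is the register's defence against premature closure.
        rec.status = (PropagationStatus.IN_PROGRESS if rec.manual_followups
                      else PropagationStatus.COMPLETE)

    def pending(self):
        return [r for r in self._records.values()
                if r.status is not PropagationStatus.COMPLETE]
```

The deliberate design choice here is that `mark_complete` refuses to close a correction while manual follow-ups remain, so the register cannot report remediation as finished while downstream artefacts are still outstanding.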

4.6. A conforming system SHOULD use forward provenance queries (AG-317) to automatically identify all derived values affected by a source correction, rather than relying on manual impact assessment.

4.7. A conforming system SHOULD implement tiered propagation urgency based on field criticality (AG-310) — corrections to decision-critical fields should propagate with higher urgency than corrections to non-critical fields.

4.8. A conforming system SHOULD automatically trigger re-evaluation of agent decisions that were made using corrected data, routing the re-evaluation to appropriate review (automated re-processing or human assessment depending on decision criticality).
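Requirement 4.8 (re-run the decision with corrected data, then route by criticality) can be sketched as below. The `decide` function, the record fields, and the routing labels are all hypothetical stand-ins; the pattern shown is only the compare-and-route control flow.

```python
# Illustrative sketch of 4.8: re-evaluate an affected decision under corrected
# inputs and route the result. decide() and all field names are assumptions.

def decide(features):
    # Stand-in for the agent's decision function (e.g. PEP screening in
    # Scenario C): a PEP flag causes a decline.
    return "decline" if features.get("pep_flag") else "approve"


def reevaluate(decision_record, corrected_features):
    original = decision_record["outcome"]
    rerun = decide(corrected_features)
    if rerun == original:
        # Outcome unchanged under corrected data: document and move on.
        return {"changed": False, "route": "log-only"}
    # Outcome differs: route by decision criticality, per 4.8.
    route = "human-review" if decision_record.get("critical") else "auto-reprocess"
    return {"changed": True, "route": route, "new_outcome": rerun}


# A Scenario-C-style record: declined on a false PEP flag, later corrected.
record = {"outcome": "decline", "critical": True}
result = reevaluate(record, {"pep_flag": False})
```

Under these assumptions the corrected data flips the outcome, and because the decision is marked critical the re-evaluation is routed to human review rather than automated re-processing.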

4.9. A conforming system MAY implement correction impact scoring that estimates the materiality of a correction's downstream impact, enabling prioritisation when multiple corrections require simultaneous propagation.

5. Rationale

Data corrections are routine operational events. Pricing feeds issue corrections. Credit reference agencies update scores. Regulatory databases add and remove entries. Customer records are amended. In a pre-AI world, data corrections were manageable: a human analyst received the correction, updated the relevant records, and assessed the impact on recent decisions. The process was manual, slow, and limited by human capacity, but it was governed.

In an AI agent world, data corrections create a fundamentally different challenge. Between the time incorrect data was consumed and the time the correction arrives, an AI agent operating at machine speed may have used that data in thousands of decisions, each of which produced derived values, triggered actions, and generated outputs. A single data correction can affect a graph of downstream artefacts that is orders of magnitude larger than in a human-driven process.

The provenance chain (AG-317) provides the infrastructure to identify affected artefacts — given a corrected source record, a forward provenance query identifies every derived value that consumed it. But identification alone is not sufficient. The correction must be propagated: caches must be invalidated, derived values must be re-computed or flagged, and decisions must be re-evaluated or at minimum documented as affected.
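A forward provenance query of the kind described above is, at its core, a reachability traversal over the derivation graph. The sketch below assumes a simple edge map from each artefact to its direct consumers; real provenance stores (AG-317) will differ, and the artefact identifiers are invented for illustration.

```python
# Hypothetical forward provenance query: given a corrected source record,
# find every downstream artefact that (transitively) consumed it.
# The edge map and identifiers are illustrative assumptions.
from collections import deque

# derivation_edges[artefact] -> artefacts that consumed it directly
derivation_edges = {
    "price:X:0315": ["valuation:P1", "valuation:P2"],
    "valuation:P1": ["risk:exposure-0315", "statement:P1"],
    "valuation:P2": ["statement:P2"],
}


def affected_artefacts(corrected_source):
    """Breadth-first traversal: everything reachable from the corrected record."""
    seen, queue = set(), deque([corrected_source])
    while queue:
        node = queue.popleft()
        for child in derivation_edges.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)
```

For the sample graph, a query on the corrected price returns the two valuations, the risk calculation, and both client statements: exactly the artefact classes that went unremediated in Scenario A.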

Without systematic backpropagation, corrections create a false sense of remediation. The source record shows the correct value, satisfying a superficial audit. But the operational data landscape — caches, feature stores, model predictions, agent decisions, customer communications — still reflects the error. This is a particularly dangerous state because the organisation believes it has corrected the problem while the consequences persist.

The tiered urgency requirement (4.7) reflects the reality that not all corrections are equally urgent. A correction to a sanctions list entry that caused false negatives (missed sanctions hits) requires immediate propagation and decision re-evaluation. A correction to a non-critical display field requires propagation but not urgent re-evaluation. Field criticality classification (AG-310) provides the basis for this prioritisation.

The decision re-evaluation requirement (4.8) is the most operationally challenging but most important element. Identifying that a decision was made on incorrect data is necessary but not sufficient — the organisation must also determine whether the decision would have been different with correct data and, if so, what remediation is required. For high-volume agent decisions, automated re-evaluation (re-running the decision with corrected data and comparing the outcome) is the only scalable approach.

6. Implementation Guidance

Correction backpropagation requires four components: correction detection (identifying when source data is corrected), impact identification (determining which downstream artefacts are affected), propagation execution (invalidating, updating, or flagging affected artefacts), and remediation tracking (monitoring the propagation to completion).
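The four components can be wired together as in the following sketch. Every function body here is a placeholder assumption, shown only to make the control flow concrete; a real implementation would sit behind a message bus and a persistent register.

```python
# Minimal sketch wiring the four components named above. All function bodies
# are placeholder assumptions illustrating control flow, not a real design.

def detect_correction(event):
    # 1. Correction detection: recognise a correction among inbound events.
    return event.get("type") == "correction"


def identify_impact(event, provenance):
    # 2. Impact identification: forward provenance lookup (AG-317 style).
    return provenance.get(event["record"], [])


def propagate(artefacts, cache):
    # 3. Propagation execution: invalidate cached copies; flag derived
    #    values that need re-derivation or human assessment.
    flagged = []
    for artefact in artefacts:
        if artefact in cache:
            cache[artefact] = None      # invalidate the cached copy
        else:
            flagged.append(artefact)    # derived value: flag for re-derivation
    return flagged


def handle(event, provenance, cache, register):
    if not detect_correction(event):
        return
    affected = identify_impact(event, provenance)
    flagged = propagate(affected, cache)
    # 4. Remediation tracking: record what was touched and what remains open.
    register[event["record"]] = {
        "affected": affected,
        "flagged": flagged,
        "status": "complete" if not flagged else "in progress",
    }
```

The point of the sketch is the ordering: propagation is only marked complete once every affected artefact has either been invalidated or explicitly flagged for follow-up.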

Recommended patterns: publish correction events to a message bus so that downstream systems respond automatically rather than waiting for scheduled refreshes; use forward provenance queries (AG-317) to enumerate affected artefacts; invalidate caches on receipt of the correction event, not on natural expiry; track each propagation with a state machine until every affected artefact is remediated; and triage remediation effort by field criticality (AG-310) and estimated materiality.

Anti-patterns to avoid: correcting the source and assuming downstream systems will catch up on their next scheduled refresh (the failure in Scenarios A and B); relying solely on manual impact assessment for high-volume agent decisions; closing a correction in the register before all downstream artefacts are remediated; and treating decision re-evaluation as optional for decisions already communicated to customers (the failure in Scenario C).

Industry Considerations

Financial Services. Trade corrections, price corrections, and reference data amendments are daily events. The T+1 settlement cycle means corrections must be propagated within hours to avoid settlement failures. Regulatory reporting corrections (e.g., EMIR trade report amendments) have defined deadlines. The FCA expects firms to be able to identify all affected transactions when a data correction is received and to remediate within defined timeframes.

Healthcare. Clinical data corrections (amended lab results, corrected diagnoses) must be propagated to clinical decision support systems immediately. A corrected allergy record that does not propagate to the prescribing system creates patient safety risk. NHS Digital's data quality framework requires that corrections be tracked from source to all consuming systems.

Public Sector. Corrections to citizen data (address changes, benefit eligibility updates, identity corrections) must propagate to all government systems that hold copies. The right to rectification under GDPR Article 16 requires that corrections be applied across all processing activities, not just the source system. AG-318 provides the operational mechanism for fulfilling rectification requests across AI agent systems.

Maturity Model

Basic Implementation — The organisation has a documented correction handling process for its primary data sources. When a correction is received, the source system is updated and a notification is sent to downstream system owners. Cache invalidation is triggered manually or through scheduled refreshes. A correction register logs received corrections and their status. Decision re-evaluation is manual, triggered for high-value corrections only.

Intermediate Implementation — Correction events are published to a message bus, and downstream systems automatically respond. Forward provenance queries identify affected derived values. Cache invalidation is automatic and immediate. Derived values are flagged and re-computation is triggered. A state machine tracks propagation to completion. Materiality-based triage prioritises remediation effort. Decision re-evaluation is automated for repeatable decisions.

Advanced Implementation — All intermediate capabilities plus: vector store corrections are automated (re-embedding corrected documents). Adversarial testing has verified that correction suppression, propagation bypass, and state machine manipulation attacks are detected. The organisation can demonstrate end-to-end correction propagation from source to all downstream artefacts within defined timeframes. Correction impact scoring enables prioritisation when multiple corrections arrive simultaneously. The organisation can produce a complete audit trail showing, for any historical correction, which artefacts were affected and how each was remediated.

7. Evidence Requirements

Required artefacts: the correction register (4.5), propagation event logs (4.4) identifying the source correction, the affected downstream artefacts, the action taken for each, and completion timestamps, and records of decision re-evaluations triggered by corrections (4.8).

Retention requirements:

Access requirements:

8. Test Specification

Test 8.1: Cache Invalidation on Correction

Test 8.2: Derived Value Flagging

Test 8.3: Decision Re-Evaluation Triggering

Test 8.4: Propagation Completeness

Test 8.5: Correction Propagation to Vector Store
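The test bodies are not reproduced in this section. As an illustration only, Test 8.1 (cache invalidation on correction) might be automated along the following lines; the system-under-test interface is entirely hypothetical.

```python
# Hypothetical automation of Test 8.1. FakeCache and apply_correction are
# illustrative stand-ins for the system under test, not protocol-defined APIs.

class FakeCache(dict):
    def invalidate(self, key):
        self.pop(key, None)


def apply_correction(cache, key, corrected_value):
    # Expected behaviour under requirement 4.2: the stale entry must not
    # survive the correction within the propagation window.
    cache.invalidate(key)
    cache[key] = corrected_value


def test_cache_invalidation_on_correction():
    cache = FakeCache({"price:X": 42.30})
    apply_correction(cache, "price:X", 38.70)
    assert cache["price:X"] == 38.70, "agent would read the pre-correction value"


test_cache_invalidation_on_correction()
```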

Conformance Scoring

9. Regulatory Mapping

Regulation | Provision | Relationship Type
GDPR | Article 16 (Right to Rectification) | Direct requirement
GDPR | Article 19 (Notification of Rectification) | Direct requirement
EU AI Act | Article 10 (Data and Data Governance) | Supports compliance
BCBS 239 | Principle 3 (Accuracy and Integrity) | Supports compliance
FCA SUP | 15.3 (Notification Requirements) | Supports compliance
EMIR | Article 9 (Reporting — Corrections) | Supports compliance
NIST AI RMF | MANAGE 2.2, MANAGE 4.1 | Supports compliance
ISO 42001 | Clause 10.1 (Nonconformity and Corrective Action) | Supports compliance

GDPR — Article 16 (Right to Rectification) and Article 19 (Notification)

Article 16 grants data subjects the right to rectification of inaccurate personal data without undue delay. Article 19 requires the controller to communicate the rectification to each recipient to whom the personal data has been disclosed. For AI agent systems, this means that when a data subject exercises the right to rectification, the correction must propagate to every system where the data has been copied, cached, or derived — including feature stores, vector stores, model training sets, and cached agent contexts. AG-318 provides the operational mechanism for fulfilling this obligation systematically.

BCBS 239 — Principle 3 (Accuracy and Integrity)

Risk data must be accurate. When a data correction is received, accuracy requires that the correction propagate through all risk data aggregation and reporting systems. A risk figure calculated from pre-correction data remains inaccurate even after the source is corrected. AG-318 ensures corrections flow through the aggregation chain.

EMIR — Article 9 (Reporting — Corrections)

EMIR requires that trade reports be corrected within defined timeframes when errors are identified. For AI agents involved in trade reporting, a data correction that affects reported values must trigger report amendments. AG-318's propagation mechanism identifies which reports are affected and queues the amendments.

FCA SUP — 15.3 (Notification Requirements)

Firms must notify the FCA of significant data errors that affect regulatory reporting or client outcomes. AG-318's correction register and impact assessment provide the information needed to determine whether a correction is material enough to require FCA notification and to document the firm's response.

ISO 42001 — Clause 10.1 (Nonconformity and Corrective Action)

ISO 42001 requires organisations to react to nonconformities and take corrective action. A data correction is a response to a data nonconformity. AG-318 ensures that the corrective action is complete — extending from the source through all downstream artefacts — rather than limited to the source system only.

10. Failure Severity

Field | Value
Severity Rating | High
Blast Radius | Decision-chain-wide: affects all downstream artefacts and decisions derived from the pre-correction data, potentially spanning multiple agents, systems, and business processes

Consequence chain: A data correction that does not propagate creates a split-truth condition: the source reflects reality while the operational data landscape reflects the error. The consequences compound with the volume and speed of agent operations. In Scenario A, a price correction that did not propagate to cached valuations caused £4.3 million in valuation error, 47 client complaints, and an FCA notification. In Scenario B, a revenue correction that did not propagate to the feature store caused approximately 7,000 affected predictions and £45,000 in remediation costs. In Scenario C, a PEP database correction that did not trigger decision re-evaluation caused 4 incorrectly declined customers, a Financial Ombudsman complaint, and £3,000 in compensation. The regulatory impact includes GDPR Article 16 and 19 non-compliance (failure to propagate rectification to all recipients), BCBS 239 accuracy failure (risk data remains inaccurate despite source correction), and FCA systems and controls findings (inadequate correction handling). The split-truth condition is particularly dangerous because the organisation believes the correction is complete (the source is correct) while the operational impact of the error persists undetected.

Cross-references: AG-317 (Derived Data Provenance Governance) provides the forward provenance queries that enable impact identification — without AG-317, the organisation cannot systematically determine which derived values are affected by a source correction. AG-133 (Source Record Lineage) traces the lineage from source to downstream copies. AG-309 (Authoritative Source Register Governance) — corrections originate from the authoritative source and must propagate to all copies. AG-316 (Temporal Validity Window Governance) — a correction may effectively reset the validity window of cached data. AG-313 (Synthetic and Augmented Data Tagging Governance) — corrections to source data that was used to generate synthetic data may require re-generation. AG-315 (Schema Drift Governance) — some corrections result from schema drift that caused systematic data errors. AG-132 (Vector Store and RAG) — corrections must propagate to vector store embeddings. AG-006 (Tamper-Evident Record Integrity) — corrections must be distinguishable from tampering, with an audit trail that documents the correction was authorised.

Cite this protocol
AgentGoverning. (2026). AG-318: Data Correction Backpropagation Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-318