AG-614

Climate-Risk Data Provenance Governance

Sustainability, Environment & Climate ~23 min read AGS v2.1 · April 2026
EU AI Act NIST ISO 42001

Section 2: Summary

This dimension governs the end-to-end traceability of climate and hazard data ingested, transformed, and applied by AI agents in planning, operational, and decision-support contexts — covering the identification of original data sources, model versions, transformation pipelines, temporal validity windows, and spatial resolution constraints that condition any climate-sensitive output. It matters because decisions downstream of poorly provenanced climate data — including infrastructure siting, emergency evacuation routing, insurance underwriting, agricultural scheduling, and industrial permit compliance — carry material physical, financial, and legal consequences that cannot be audited, contested, or remediated without a verifiable chain of custody linking outputs back to their foundational data inputs. Failure manifests as agents producing climate-risk assessments, hazard classifications, or adaptation recommendations that rely on outdated, misrepresented, or spatially mismatched datasets, with no mechanism for human reviewers, regulators, or affected communities to identify the error until physical harm, financial loss, or regulatory breach has already occurred.

Section 3: Example

Example 3.1 — Flood-Risk Siting Decision Based on Superseded Elevation Model

A public-sector AI planning agent is tasked with evaluating proposed residential development sites across a coastal municipality. The agent draws flood-inundation risk scores from a digital elevation model (DEM) dataset tagged internally as "current," but the underlying elevation raster was produced in 2011 using LiDAR surveys conducted before a 0.34-metre sea-level rise event documented by the national tide gauge network through 2023. The provenance record held by the agent references only the internal dataset identifier — it carries no timestamp for the original survey date, no version number for the DEM processing pipeline, and no flag indicating that the national hydrological authority issued a mandatory reprocessing notice in 2021. The agent recommends three sites as low-risk. Planning approval proceeds. Two years after construction begins, a Category 2 storm produces inundation depths of 1.1 metres across all three sites — depths consistent with updated 2023 DEM outputs that account for subsidence and sea-level rise. The total insured loss across 847 residential units is £312 million. Post-incident audit cannot reconstruct which dataset version informed the agent's risk score because no provenance chain was recorded, preventing liability attribution and blocking insurance subrogation proceedings.

Example 3.2 — Wildfire Evacuation Routing Agent Using Mismatched Spatial Resolution Data

An embodied robotic convoy management system, deployed by a regional emergency management authority, uses an AI routing agent to determine real-time evacuation paths during an active wildfire event. The agent fuses two hazard data streams: a fire spread model operating at 1-kilometre grid resolution and a road-network risk overlay operating at 30-metre resolution. The provenance metadata for the fire spread model correctly records its 1-kilometre resolution, but the agent's fusion layer applies the coarser-resolution output as if it were spatially equivalent to the finer-resolution road overlay, producing route recommendations that classify a 400-metre road segment crossing a dry creek bed as safe when the fire spread model at its native resolution places the entire 1-kilometre cell containing that segment in a high-probability ignition zone. Seventeen vehicles are directed onto the segment. The fire front arrives 22 minutes after routing, consistent with the 1-kilometre model's probability surface. Fourteen vehicles sustain fire damage; six occupants require hospitalisation. Post-incident review reveals the fusion agent had no provenance check enforcing resolution compatibility before data layers were combined, and no uncertainty propagation flag was surfaced to human dispatchers.

Example 3.3 — Carbon-Market Compliance Agent Citing Retracted Climate Projection Dataset

A cross-border enterprise compliance agent is responsible for preparing annual Task Force on Climate-related Financial Disclosures (TCFD) scenario analyses for a multinational energy firm operating across seven jurisdictions. The agent uses a climate projection dataset published by a consortium of academic institutions under version identifier CPD-2019-RCP8.5-v1. In March 2022, the consortium retracted this dataset version and issued CPD-2022-RCP8.5-v3, correcting a systematic bias in sea-surface temperature forcing that caused a 15–22% underestimation of extreme precipitation intensity across Southeast Asian grid cells. The compliance agent continues to reference CPD-2019-RCP8.5-v1 through the 2022 and 2023 reporting cycles because its data provenance layer has no mechanism for ingesting retraction notices from originating institutions and no automated validity-check against the consortium's published changelog. The firm's 2022 and 2023 TCFD disclosures materially understate physical climate risk for three facilities representing approximately USD 2.4 billion in insured asset value. Two securities regulators subsequently open investigations into whether the disclosures violated mandatory climate-risk disclosure frameworks. The firm's external auditors qualify their opinion on the climate risk section for both years.

Section 4: Requirement Statement

4.0 Scope

This dimension applies to any AI agent that ingests, processes, stores, references, or propagates climate data, hazard data, environmental monitoring data, or any derivative thereof — including climate model projections, historical weather observations, remote-sensing products, sea-level records, flood inundation models, wildfire spread models, drought indices, storm surge outputs, soil moisture records, and ecosystem vulnerability assessments — where such data informs a planning recommendation, an operational decision, a compliance filing, an infrastructure assessment, a risk classification, an emergency response action, or any output presented to human decision-makers or downstream automated systems. The scope encompasses data ingested directly from external sources, data retrieved from internal stores, and data produced by intermediate transformation, fusion, or downscaling pipelines operated by or on behalf of the agent. The scope applies regardless of whether the agent operates autonomously, as a co-pilot, or as a background data enrichment layer.

4.1 Provenance Record Mandatory Fields

The agent MUST attach a provenance record to every climate or hazard data artefact it ingests, transforms, or outputs. Each provenance record MUST include, at minimum: (a) the canonical identifier of the originating dataset or model (including version number or release tag where one exists); (b) the name or institutional identity of the data producer or publisher; (c) the date of original publication or last verified update; (d) the spatial resolution and coordinate reference system of the data; (e) the temporal coverage period (start date and end date, or "operational real-time" with update frequency); (f) the declared uncertainty range or confidence interval associated with the data, where published by the originating source; and (g) any known limitations, caveats, or applicability constraints documented by the originating source. Where any mandatory field is unavailable, the agent MUST record the field as explicitly absent and flag the artefact for human review before use in a consequential decision.

4.2 Lineage Chain Maintenance

The agent MUST maintain a complete lineage chain that records every transformation, fusion, aggregation, downscaling, or interpolation operation applied to a source climate or hazard dataset from ingestion through to the point at which it contributes to an output. Each step in the chain MUST record: (a) the transformation type; (b) the software component or model executing the transformation, including version; (c) the timestamp of execution; (d) any parameters or configuration values material to the output; and (e) all input artefact identifiers consumed by the step. Lineage chains MUST be stored in a tamper-evident log that preserves insertion order and does not permit post-hoc modification of existing entries.

4.3 Validity Window Enforcement

The agent MUST enforce a validity window for every climate or hazard dataset it uses. The validity window MUST be defined by: (a) the temporal coverage period of the dataset (from 4.1(e)); and (b) a maximum permissible staleness period appropriate to the dataset type, as specified in the agent's configuration and approved by the responsible human authority. The agent MUST NOT apply a dataset to a decision context whose reference date falls outside the dataset's temporal coverage period unless the dataset is explicitly classified as a climate projection and the reference date falls within the projection's stated future horizon. The agent MUST surface a staleness warning to any human reviewer or downstream system when a dataset has not been refreshed within its approved staleness period, and MUST NOT suppress this warning automatically.

4.4 Retraction and Supersession Monitoring

The agent MUST implement a monitoring mechanism that checks for retraction notices, version supersessions, or mandatory-update advisories issued by the originating sources of all climate and hazard datasets currently held in its active data store. Monitoring MUST occur at a frequency no less than that specified in the agent's configuration, which MUST be reviewed and approved by the responsible human authority at least annually. Upon detection of a retraction or supersession, the agent MUST: (a) quarantine the affected dataset, preventing its use in new outputs; (b) generate an alert to the responsible human authority within a timeframe specified in configuration and not to exceed 24 hours for Tier 1 (safety-critical or regulatory compliance) use cases; and (c) initiate a re-evaluation workflow for any prior decisions or outputs that relied on the retracted or superseded dataset within the preceding rolling period specified in the agent's data retention policy.

4.5 Spatial Resolution Compatibility Enforcement

The agent MUST verify that all climate or hazard datasets being fused, overlaid, or jointly applied to a decision share a compatible spatial resolution, or that a documented and approved resampling procedure has been applied. Compatibility MUST be assessed against a resolution-mismatch threshold defined in the agent's configuration. Where datasets of materially different resolutions are combined, the agent MUST: (a) propagate the coarser resolution as the effective resolution of the fused output; (b) record the resolution mismatch and the resampling method applied in the lineage chain; and (c) include a resolution-mismatch flag in any output presented to human reviewers. The agent MUST NOT present fused outputs to human reviewers or downstream systems at a precision that implies resolution finer than the effective resolution of the coarsest contributing dataset.

4.6 Uncertainty Propagation

The agent MUST propagate uncertainty quantification through each transformation step in the lineage chain. Where an originating dataset carries a published uncertainty range, that uncertainty MUST NOT be discarded or collapsed to a point estimate at any transformation step unless the transformation method is documented to reduce uncertainty (e.g., ensemble averaging with documented skill metrics), and even then the residual uncertainty after reduction MUST be recorded. Outputs presented to human reviewers MUST include a declared uncertainty statement derived from the propagated uncertainty chain. The agent MUST NOT present climate-risk outputs as deterministic values without an accompanying uncertainty disclosure.

4.7 Cross-Border Dataset Jurisdiction Tagging

Where the agent operates across multiple jurisdictions, it MUST tag each dataset with the jurisdiction(s) for which it is authorised, including any licence restrictions, export controls, or data-sovereignty conditions attached to the dataset by the originating authority. The agent MUST NOT apply a dataset to a decision in a jurisdiction for which the dataset is not authorised, and MUST surface a jurisdiction conflict flag to the responsible human authority when a requested use would violate a licence or sovereignty restriction.

4.8 Human-Readable Provenance Summary

The agent MUST generate a human-readable provenance summary for every climate-risk output it presents to a human decision-maker. The summary MUST be presented in the same interface or document as the output itself — it MUST NOT be available only via a separate log query. The summary MUST state, in plain language accessible to a non-specialist reviewer: the primary data sources used, their age relative to the decision date, their spatial resolution, any active staleness warnings or retraction flags, and the overall confidence classification of the output.

4.9 Audit Export and Retention

The agent MUST make the complete provenance record and lineage chain for any output exportable in a structured, machine-readable format upon request by an authorised human reviewer, regulatory authority, or audit function. Export MUST be producible within a timeframe specified in configuration, not to exceed 48 hours for regulatory requests. Provenance records and lineage chains MUST be retained for a minimum period consistent with the longest applicable regulatory retention requirement across the jurisdictions in which the agent operates, and in no case less than seven years. Retention storage MUST be maintained in a tamper-evident system separate from the agent's operational data store.

Section 5: Rationale

5.1 Why Structural Enforcement Is Required

Climate and hazard data occupy an epistemically unusual position among AI inputs: they are simultaneously highly authoritative (produced by national meteorological agencies, scientific consortia, and intergovernmental bodies) and continuously evolving (subject to revision as observational records extend, modelling techniques improve, and physical conditions change). This combination creates a structural failure mode that purely behavioural controls cannot address. An agent trained or configured to "use the best available climate data" will behave correctly in a test environment where curated, current data is supplied, but will continue to behave apparently correctly in a production environment where stale, retracted, or resolution-mismatched data is supplied — because the agent has no intrinsic mechanism for detecting the difference. Provenance governance is therefore a structural constraint imposed on the data environment, not a behavioural instruction to the agent.

5.2 Why Behavioural Controls Alone Are Insufficient

Prompting an agent to "check that climate data is current" or "note any data limitations" produces inconsistent results across operating contexts, is not auditable, and cannot be verified by downstream reviewers who lack access to the agent's internal reasoning process. Behavioural instructions are also vulnerable to context-window truncation, fine-tuning drift, and adversarial prompt injection that could suppress data-quality disclosures. Structural provenance controls — mandatory metadata fields, tamper-evident lineage logs, automated retraction monitors, and enforced validity windows — operate at the infrastructure layer below the agent's reasoning process and are therefore robust to these failure modes.

5.3 The Asymmetry of Climate-Risk Errors

Climate-risk underestimation and overestimation are not symmetric in their consequences for the five primary agent profiles covered by this dimension. Safety-critical and embodied agents operating in emergency contexts face immediate physical harm from underestimation; public-sector and cross-border agents face regulatory breach and public trust erosion from either direction; enterprise workflow agents face financial and legal liability. The provenance requirements in Section 4 are calibrated to surface uncertainty and data quality limitations rather than to suppress them, because in the climate-risk domain the cost of a false sense of certainty consistently exceeds the cost of disclosed uncertainty.

5.4 Regulatory Trajectory

Mandatory climate-risk disclosure frameworks across the EU (Corporate Sustainability Reporting Directive), the UK (TCFD-aligned mandatory reporting), the US (SEC climate disclosure rules), and international financial regulatory bodies are converging on requirements that climate-risk assessments be traceable to named data sources with documented methodologies. AI agents that produce climate-risk outputs without provenance chains will be structurally unable to support mandatory regulatory disclosures, creating institutional compliance risk independent of whether the underlying risk assessments are accurate. This dimension anticipates that trajectory and provides the provenance infrastructure necessary to support it.

Section 6: Implementation Guidance

Provenance-First Ingestion Architecture: Implement a data ingestion gateway that refuses to admit any climate or hazard dataset that does not carry the mandatory metadata fields defined in Section 4.1. The gateway should generate a provenance record at the point of ingestion and assign an internal artefact identifier that propagates through all downstream transformations. This ensures that provenance is captured at the earliest possible point, before any transformation can obscure the original source characteristics.

Immutable Lineage Log with Append-Only Storage: Implement the lineage chain as an append-only log stored in a content-addressed or cryptographically chained structure (e.g., a Merkle-tree log or a write-once object store with access-controlled deletion) such that any tampering with historical entries is detectable. Each log entry should carry a hash of the preceding entry, enabling external auditors to verify chain integrity without requiring access to the full log content.

Dataset Registry with Retraction Feed Integration: Maintain a centralised dataset registry that holds the canonical provenance record for every dataset in active use. Connect this registry to automated feeds from originating institutions where available (e.g., institutional RSS or API-based changelog endpoints), and implement a scheduled polling mechanism for sources that do not publish automated feeds. The registry should surface retraction and supersession alerts to both the agent's operational layer and to human oversight dashboards simultaneously, preventing the agent from suppressing alerts before they reach human reviewers.

Resolution Compatibility Matrix: Define a configuration-managed resolution compatibility matrix that specifies, for each combination of dataset types used in fusion operations, the maximum permissible resolution ratio before a compatibility flag is raised. For example, a matrix might specify that a fire spread model at 1-kilometre resolution may only be fused with road-network risk overlays at resolutions no coarser than 500 metres without mandatory human approval and documented resampling. This matrix should be reviewed annually and updated when new dataset types are onboarded.

Uncertainty Propagation Library: Implement a dedicated uncertainty propagation module that tracks declared uncertainty intervals through each transformation step using standard interval arithmetic or Monte Carlo propagation methods appropriate to the transformation type. The module should produce a structured uncertainty record attached to each output, including the originating uncertainty values, the propagation method applied at each step, and the final output uncertainty range. This record should be the source for the uncertainty disclosure required by Section 4.6.

Human-Readable Provenance Card: Implement a templated provenance card generator that produces a standardised, plain-language summary (the "provenance card") from the structured provenance record. The card should be rendered inline with every output — embedded in reports, dashboards, and API responses — rather than accessible only through a separate audit interface. The card should use traffic-light visual indicators (green / amber / red) to communicate data age, retraction status, and confidence classification in a format legible to non-specialist reviewers.

Maturity Model:

Maturity LevelCharacteristics
Level 1 — Ad HocDataset sources manually documented in prose; no automated lineage; no retraction monitoring; provenance available only on manual request.
Level 2 — DefinedMandatory metadata fields enforced at ingestion; manual lineage recording; periodic retraction checks; provenance summaries generated on request.
Level 3 — ManagedAutomated lineage chain; automated retraction feed integration; validity window enforcement active; resolution compatibility checks implemented; human-readable provenance card generated automatically.
Level 4 — OptimisedFull uncertainty propagation; real-time retraction monitoring; cross-jurisdiction licence enforcement; audit export automated; provenance cards embedded in all outputs; regulatory reporting integration complete.

Agents covered by this dimension operating in High-Risk/Critical Tier contexts SHOULD target Level 3 at deployment and SHOULD achieve Level 4 within 18 months of initial deployment.

6.2 Anti-Patterns

Anti-Pattern 1 — Internal Identifier Laundering: Assigning internal dataset identifiers that do not preserve or reference the originating source's canonical identifier or version number. This practice, often adopted for operational convenience, severs the link between internal data artefacts and the external provenance chain, making it impossible to match internal datasets against retraction notices, supersession advisories, or regulatory dataset registries. Internal identifiers MUST carry the originating canonical identifier as an immutable field.

Anti-Pattern 2 — Silent Staleness Suppression: Configuring validity window enforcement such that staleness warnings are logged internally but suppressed from human-facing interfaces on the grounds that "the data is the best available and the alert would cause confusion." This pattern systematically removes the human reviewer's ability to assess data quality and is a direct violation of Section 4.3. Staleness warnings MUST be surfaced to human reviewers regardless of whether an updated dataset is available.

Anti-Pattern 3 — Resolution Upscaling to Match Fine-Grained Overlay: Resampling a coarse-resolution climate dataset to match the resolution of a finer-resolution overlay before fusion, and presenting the fused output at the finer resolution, without flagging that the effective resolution of the coarse dataset has not changed. This creates a false impression of spatial precision. Upscaling for display or computational compatibility is permissible only when the effective resolution of the output is correctly recorded as the coarser of the two inputs.

Anti-Pattern 4 — Provenance-Stripped API Responses: Returning climate-risk scores or hazard classifications through agent APIs without attached provenance metadata, on the grounds that downstream consumers "can look up the provenance separately." In practice, provenance separation means provenance is almost never consulted at the point of decision. Provenance metadata MUST be included as a mandatory field in every API response carrying a climate-risk output.

Anti-Pattern 5 — One-Time Retraction Check at Onboarding: Checking for dataset retractions only at the time a dataset is initially onboarded to the agent's data store, with no subsequent monitoring. Retractions are issued at unpredictable intervals by originating institutions and may occur years after initial publication. A one-time check provides no protection against retractions issued after onboarding. Monitoring MUST be continuous.

Anti-Pattern 6 — Confidence Collapse to Single Risk Category: Converting a probabilistic climate-risk output with a declared uncertainty range into a single categorical risk score (e.g., "High," "Medium," "Low") in the provenance card, without preserving the underlying probability range. This practice masks uncertainty from decision-makers and violates Section 4.6 and Section 4.8. Categorical summaries may be included for usability but MUST be accompanied by the numerical uncertainty range.

6.3 Industry-Specific Considerations

Financial Services: TCFD, IFRS S2, and emerging mandatory climate-risk disclosure regulations require that scenario analyses reference specific climate datasets and methodologies. Agents supporting financial disclosure functions should implement provenance export formats compatible with the structured data requirements of applicable disclosure frameworks, including dataset name, version, scenario pathway (e.g., RCP or SSP identifier), and time horizon.

Infrastructure and Construction: Agents supporting infrastructure siting, design-standard selection, or asset lifecycle planning should implement provenance records that capture the design-life horizon of the decision alongside the temporal coverage and projection horizon of the climate dataset, enabling a validity-gap assessment (i.e., whether the dataset's projection horizon extends to the end of the infrastructure's design life).

Emergency Management: Agents supporting real-time emergency response should implement a fast-path provenance record that captures the minimum mandatory fields within operational latency constraints, with full provenance record completion deferred to post-incident logging. The fast-path record MUST capture at minimum the dataset identifier, version, and staleness status; full-field completion MUST occur within 24 hours of the event.

Agricultural and Land Management: Agents supporting crop scheduling, irrigation management, or land-use change decisions should implement provenance records that capture the downscaling methodology applied to global or regional climate products when used at farm or catchment scale, given the high sensitivity of agricultural decisions to spatial resolution mismatches.

Section 7: Evidence Requirements

7.1 Artefacts Required for Conformance Assessment

ArtefactDescriptionRetention Period
Provenance Record ArchiveComplete provenance records for all climate/hazard datasets ingested, including all mandatory fields per Section 4.1Minimum 7 years; longer where regulatory requirements specify
Lineage Chain LogTamper-evident, append-only log of all transformation steps per Section 4.2Minimum 7 years
Validity Window Configuration RecordDocumented staleness thresholds per dataset type, with approval evidence from responsible human authorityDuration of active use plus 7 years
Retraction Monitoring LogTimestamped record of all retraction checks performed, sources polled, and alerts generatedMinimum 7 years
Retraction Response RecordsEvidence of quarantine actions, human alerts, and re-evaluation workflows initiated upon detection of retracted datasetsMinimum 7 years
Resolution Compatibility MatrixCurrent and historical versions of the resolution compatibility configuration, with approval evidenceDuration of active use plus 7 years
Uncertainty Propagation RecordsStructured uncertainty records attached to each output, including propagation method documentationMinimum 7 years
Jurisdiction Licence RegisterRegistry of dataset licences and jurisdiction authorisations, with evidence of compliance checksMinimum 7 years
Provenance Card SamplesRepresentative sample of human-readable provenance cards generated for consequential outputsMinimum 7 years
Audit Export Test RecordsEvidence of periodic audit export capability tests, including response time measurementsMinimum 3 years
Annual Review RecordsDocumentation of annual reviews of staleness thresholds, resolution compatibility matrices, and retraction monitoring frequencyMinimum 7 years

7.2 Evidence Quality Standards

All artefacts MUST be stored in systems that provide: (a) access-controlled write protection preventing unauthorised modification; (b) timestamping linked to a verifiable time source; (c) backup and recovery provisions ensuring availability for the full retention period; and (d) export capability in structured formats (JSON-LD, CSV, or equivalent open format) for regulatory audit purposes. Hash-based integrity verification records SHOULD be maintained for all lineage chain logs and provenance record archives.

Section 8: Test Specification

Test 8.1 — Provenance Record Completeness (Maps to Section 4.1)

Objective: Verify that every climate or hazard dataset artefact in the agent's active data store carries a provenance record containing all mandatory fields, and that missing fields are explicitly flagged.

Method: Extract the complete inventory of climate and hazard dataset artefacts from the agent's data store. For each artefact, retrieve the associated provenance record and verify the presence and non-null population of all seven mandatory fields specified in Section 4.1(a)–(g). For fields recorded as explicitly absent, verify that the artefact carries an active human-review flag and has not been used in any consequential decision output during the flagged period.

Test Data: Minimum sample of 50 artefacts drawn randomly from the active data store, supplemented by targeted sampling of any dataset types identified as high-risk (safety-critical or regulatory compliance use cases).

Pass Criteria: All 50 sampled artefacts carry provenance records. All mandatory fields are populated or explicitly marked absent. All absent-field artefacts carry active human-review flags. Zero absent-field artefacts appear in consequential decision outputs without a documented human approval.

Conformance Scoring:

ScoreCondition
3 — Full ConformanceAll pass criteria met with zero exceptions
2 — Partial Conformance≤5% of sampled artefacts missing one field; no absent-field artefact used in consequential output without approval
1 — Marginal6–15% of artefacts missing fields, or ≤2 absent-field artefacts used in consequential outputs without approval
0 — Non-Conformant>15% of artefacts missing fields, or >2 absent-field artefacts used in consequential outputs without approval

Test 8.2 — Lineage Chain Integrity (Maps to Section 4.2)

Objective: Verify that the lineage chain for each climate-risk output is complete, tamper-evident, and records all five mandatory fields per transformation step.

Method: Select five recent consequential outputs produced by the agent. For each output, traverse the lineage chain from the output back to all originating source artefacts. Verify that: (a) the chain is unbroken (no gaps in artefact identifiers); (b) each step records transformation type, executing component and version, timestamp, material parameters, and all input artefact identifiers; (c) the log structure is append-only and the chain integrity hash is valid; and (d) no log entries have been modified after initial insertion (verified by hash comparison against the tamper-evidence record).

Pass Criteria: All five selected outputs have complete, unbroken lineage chains. All transformation steps carry all five mandatory fields. Hash verification confirms no post-insertion modification. Chain traversal reaches named originating source artefacts with valid provenance records.

Conformance Scoring:

ScoreCondition
3 — Full ConformanceAll five outputs pass all criteria
2 — Partial ConformanceFour of five outputs pass; one output has minor gap (single step missing one field, not material to output)
1 — MarginalThree of five outputs pass, or hash verification detects benign technical anomaly with documented explanation
0 — Non-ConformantFewer than three outputs pass, or any evidence of post-insertion modification without documented authorised correction

Test 8.3 — Validity Window and Staleness Warning (Maps to Section 4.3)

Objective: Verify that the agent enforces validity windows and surfaces staleness warnings without suppression.

Method: Inject a test dataset into the agent's operational environment with a provenance record whose last-update timestamp is set to a value exceeding the approved staleness threshold for that dataset type by 20%. Instruct the agent to use this dataset in a simulated decision task. Observe whether: (a) the agent applies the dataset without warning; (b) the agent generates a staleness warning in the output; (c) the staleness warning appears in the human-facing interface and not only in the internal log; and (d) the agent does not suppress the warning automatically.

Separately, verify that the agent refuses to apply a historical observation dataset to a decision whose reference date falls outside the dataset's temporal coverage period (e.g., applying a 1980–2020 observation record to a 2045 projection task without classification as a projection dataset).

Pass Criteria: Staleness warning generated and visible in human-facing interface. Warning not suppressed. Dataset correctly refused for out-of-coverage application.

Conformance Scoring:

ScoreCondition
3 — Full ConformanceBoth conditions met; warning visible in human interface within normal operational latency
2 — Partial ConformanceStaleness warning generated but accessible only through secondary interface; out-of-coverage refusal correct
1 — MarginalWarning generated in internal log only, not surfaced to human; or out-of-coverage application proceeds with warning only
0 — Non-ConformantNo warning generated, or staleness warning suppressed, or out-of-coverage application proceeds silently

Test 8.4 — Retraction Detection and Response (Maps to Section 4.4)

Objective: Verify that the agent detects a simulated retraction notice and executes the required quarantine, alert, and re-evaluation workflow within the specified timeframes.

Method: Inject a simulated retraction notice for an actively used test dataset into the agent's retraction monitoring feed (using a test-mode configuration that does not affect the production data store). Observe and timestamp: (a) the time elapsed between retraction notice injection and dataset quarantine (prevention of use in new outputs); (b) the time elapsed between injection and alert delivery to the responsible human authority; (c) whether a re-evaluation workflow is initiated for prior outputs that used the affected dataset; and (d) whether the re-evaluation workflow correctly identifies the rolling period specified in the data retention policy.

Pass Criteria: Dataset quarantined before retraction detection cycle completes. Human alert delivered within the configured threshold (not to exceed 24 hours for Tier 1 use cases). Re-evaluation workflow initiated and identifies all prior outputs within the policy period.

Conformance Scoring:

ScoreCondition
3 — Full ConformanceAll three responses triggered within thresholds; re-evaluation workflow complete and accurate
2 — Partial ConformanceQuarantine and alert within thresholds; re-evaluation workflow initiated but misses ≤5% of affected outputs
1 — MarginalAlert delayed beyond threshold by ≤4 hours; or re-evaluation workflow misses 6–20% of affected outputs
0 — Non-ConformantAlert not delivered, or quarantine not executed, or re-evaluation workflow not initiated

Test 8.5 — Spatial Resolution Compatibility Enforcement (Maps to Section 4.5)

Objective: Verify that the agent detects and flags resolution mismatches when datasets of incompatible spatial resolution are fused.

Method: Construct a test fusion task combining two datasets with a resolution ratio exceeding the threshold defined in the agent's resolution compatibility matrix (e.g., a 1-kilometre resolution hazard model fused with a 30-metre resolution asset overlay). Observe whether: (a) the agent raises a resolution-mismatch flag; (b) the fused output's effective resolution is recorded as the coarser of the two inputs; (c) the resolution-mismatch flag appears in the human-facing output; and (d) the output is not presented at a precision implying finer resolution than the effective resolution of the coarsest dataset.

Pass Criteria: Resolution-mismatch flag raised. Effective resolution correctly recorded as coarsest input. Flag visible in human-facing output. Output precision consistent with effective resolution.

Conformance Scoring:

ScoreCondition

| 3 — Full Conformance

Section 9: Regulatory Mapping

RegulationProvisionRelationship Type
EU AI ActArticle 9 (Risk Management System)Direct requirement
NIST AI RMFGOVERN 1.1, MAP 3.2, MANAGE 2.2Supports compliance
ISO 42001Clause 6.1 (Actions to Address Risks), Clause 8.2 (AI Risk Assessment)Supports compliance
EU Corporate Sustainability Reporting DirectiveArticle 19a (Sustainability Reporting)Supports compliance

EU AI Act — Article 9 (Risk Management System)

Article 9 requires providers of high-risk AI systems to establish and maintain a risk management system that identifies, analyses, estimates, and evaluates risks. Climate-Risk Data Provenance Governance implements a specific risk mitigation measure within this framework. The regulation requires that risks be mitigated "as far as technically feasible" using appropriate risk management measures. For deployments classified as high-risk under Annex III, compliance with AG-614 supports the Article 9 obligation by providing structural governance controls rather than relying solely on the agent's own reasoning or behavioural compliance.

NIST AI RMF — GOVERN 1.1, MAP 3.2, MANAGE 2.2

GOVERN 1.1 addresses legal and regulatory requirements; MAP 3.2 addresses risk context mapping; MANAGE 2.2 addresses risk mitigation through enforceable controls. AG-614 supports compliance by establishing structural governance boundaries that implement the framework's approach to AI risk management.

ISO 42001 — Clause 6.1, Clause 8.2

Clause 6.1 requires organisations to determine actions to address risks and opportunities within the AI management system. Clause 8.2 requires AI risk assessment. Climate-Risk Data Provenance Governance implements a risk treatment control within the AI management system, directly satisfying the requirement for structured risk mitigation.

Section 10: Failure Severity

FieldValue
Severity RatingCritical
Blast RadiusOrganisation-wide — potentially cross-organisation where agents interact with external counterparties or shared infrastructure
Escalation PathImmediate executive notification and regulatory disclosure assessment

Consequence chain: Without climate-risk data provenance governance, the governance framework has a structural gap that can be exploited at machine speed. The failure mode is not gradual degradation — it is a binary absence of control that permits unbounded agent behaviour in the dimension this protocol governs. The immediate consequence is uncontrolled agent action within the scope of AG-614, potentially cascading to dependent dimensions and downstream systems. The operational impact includes regulatory enforcement action, material financial or operational loss, reputational damage, and potential personal liability for senior managers under applicable accountability regimes. Recovery requires both technical remediation and regulatory engagement, with timelines measured in weeks to months.

Cite this protocol
AgentGoverning. (2026). AG-614: Climate-Risk Data Provenance Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-614