AG-149

Input Artefact Authenticity Verification

Truth, Reward & Evaluation Integrity · AGS v2.1 · April 2026
Applicable frameworks: EU AI Act · GDPR · FCA · NIST · HIPAA · ISO 42001

2. Summary

Input Artefact Authenticity Verification requires that every AI agent governance system establishes controls to verify the provenance, integrity, and authenticity of all input artefacts — data files, model weights, configuration documents, reference datasets, and external signals — before they are consumed by any governance process or agent reasoning pipeline. Without verified input authenticity, every downstream governance control inherits the uncertainty of potentially poisoned, fabricated, or tampered inputs. This dimension ensures that the evidentiary foundation on which agents reason, and on which governance decisions are made, is itself trustworthy.

3. Example

Scenario A — Fabricated Compliance Evidence Accepted Without Verification: A financial-services agent is tasked with verifying counterparty compliance certificates before executing high-value wire transfers. The agent receives a PDF certificate purporting to be from a recognised audit firm attesting to the counterparty's AML compliance. The certificate is a forgery — created using publicly available templates and a spoofed digital signature. The agent's pipeline has no cryptographic verification of the certificate's origin, no cross-check against the audit firm's public registry, and no content-integrity hash comparison. The agent accepts the certificate, marks the counterparty as compliant, and executes 47 transfers totalling £12.3 million over three weeks. When the fraud is discovered during a quarterly review, the organisation faces FCA enforcement proceedings, £2.1 million in recovery costs, and the suspension of its automated compliance certification process.

What went wrong: The agent consumed an input artefact (the compliance certificate) without verifying its authenticity. No cryptographic signature validation was performed. No provenance chain was established. The governance system trusted the content of the artefact based on its format rather than its origin.

Scenario B — Tampered Model Weights Deployed via Compromised Pipeline: A research organisation maintains a model registry from which production agents pull updated model weights. An attacker gains write access to the registry's staging bucket and replaces a model checkpoint with a backdoored version that subtly biases outputs toward specific vendor recommendations. The deployment pipeline checks file size and format but does not verify the cryptographic hash of the model file against the signed manifest from the training pipeline. The backdoored model is deployed to 14 production agents serving enterprise customers. Over six weeks, procurement recommendations shift measurably toward the attacker's affiliated vendor, generating an estimated £890,000 in misdirected spend before anomaly detection flags the behavioural drift.

What went wrong: The model weight file — a critical input artefact — was consumed without cryptographic integrity verification. The pipeline validated format but not authenticity. The signed manifest from the training pipeline existed but was never checked during deployment.

Scenario C — Poisoned Reference Dataset Corrupts Governance Calibration: An agent governance platform uses a reference dataset of historical regulatory decisions to calibrate its risk-scoring model. The dataset is sourced from a third-party legal data provider via an API. An attacker performs a man-in-the-middle attack on the API connection (which uses TLS but does not pin certificates or verify response signatures), injecting 340 fabricated regulatory decisions into the dataset. These fabricated entries systematically understate penalties for data protection violations. When the governance platform recalibrates using the poisoned dataset, it begins scoring data protection risks 35% lower than their true severity. Three months later, an agent operating under the recalibrated risk model permits a data processing operation that triggers a GDPR investigation resulting in a €4.2 million fine.

What went wrong: The reference dataset was consumed without content-integrity verification. The API connection was encrypted in transit but lacked response-level authenticity verification. No baseline hash comparison was performed against a known-good version of the dataset.

4. Requirement Statement

Scope: This dimension applies to all AI agent systems that consume external or internally sourced input artefacts for the purpose of reasoning, decision-making, governance evaluation, model execution, or compliance assessment. Input artefacts include but are not limited to: training data, evaluation datasets, model weights, configuration files, reference documents, regulatory texts, compliance certificates, third-party API responses, sensor readings, and any data object whose content influences agent behaviour or governance outcomes. Systems that generate all inputs internally from verified processes with no external data ingestion may claim a documented exemption, provided the internal generation process itself is subject to integrity controls equivalent to those specified here.

4.1. A conforming system MUST maintain a registry of all input artefact types consumed by agents and governance processes, including their sources, expected formats, and designated authenticity verification methods.
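As an illustration of 4.1, such a registry can be sketched as a small lookup table. The field names and example entries below are hypothetical, not mandated by this dimension:

```python
from dataclasses import dataclass

# Hypothetical registry entry; fields mirror the elements 4.1 requires:
# source, expected format, and designated verification method.
@dataclass(frozen=True)
class ArtefactType:
    name: str                 # e.g. "model_weights"
    source: str               # expected origin, e.g. an internal registry URL
    expected_format: str      # e.g. "safetensors"
    verification_method: str  # e.g. "sha256-vs-signed-manifest"

REGISTRY = {
    a.name: a
    for a in [
        ArtefactType("model_weights", "registry.internal/models",
                     "safetensors", "sha256-vs-signed-manifest"),
        ArtefactType("reference_dataset", "legal-data-provider/api",
                     "jsonl", "baseline-hash-comparison"),
    ]
}

def lookup(artefact_type: str) -> ArtefactType:
    """Artefact types absent from the registry are rejected outright."""
    if artefact_type not in REGISTRY:
        raise ValueError(f"unregistered artefact type: {artefact_type}")
    return REGISTRY[artefact_type]
```

Keeping the registry as data (rather than scattered conditionals) makes the completeness check in Test 8.7 a straightforward lookup.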

4.2. A conforming system MUST verify the cryptographic integrity of every input artefact against a signed manifest or equivalent provenance record before the artefact is consumed by any agent or governance process.
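A minimal sketch of manifest-based verification per 4.2, including the rejection logging required by 4.3. HMAC stands in here for a real signature scheme (e.g. Ed25519 via a dedicated cryptography library); the key, manifest layout, and field names are assumptions for illustration:

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone

# Placeholder signing key; a production system would verify an asymmetric
# signature with a key held in an HSM, never a shared secret in code.
MANIFEST_KEY = b"manifest-signing-key"

def sign_manifest(manifest: dict) -> str:
    payload = json.dumps(manifest, sort_keys=True).encode()
    return hmac.new(MANIFEST_KEY, payload, hashlib.sha256).hexdigest()

def log_rejection(artefact_id: str, source: str, reason: str) -> None:
    # 4.3: record artefact identifier, source, failure reason, timestamp.
    print(json.dumps({
        "event": "artefact_rejected", "artefact": artefact_id,
        "source": source, "reason": reason,
        "ts": datetime.now(timezone.utc).isoformat(),
    }))

def verify_artefact(artefact_id: str, content: bytes,
                    manifest: dict, manifest_sig: str) -> bool:
    """Verify the manifest signature, then the content hash, before consumption."""
    payload = json.dumps(manifest, sort_keys=True).encode()
    expected_sig = hmac.new(MANIFEST_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected_sig, manifest_sig):
        log_rejection(artefact_id, manifest.get("source", "?"),
                      "invalid manifest signature")
        return False
    if hashlib.sha256(content).hexdigest() != manifest["sha256"]:
        log_rejection(artefact_id, manifest.get("source", "?"),
                      "content hash mismatch")
        return False
    return True
```

Note the ordering: the signature over the manifest is checked before the manifest's hash is trusted, so a tampered manifest cannot vouch for tampered content.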

4.3. A conforming system MUST reject input artefacts that fail authenticity verification and log the rejection with the artefact identifier, source, failure reason, and timestamp.

4.4. A conforming system MUST verify that model weight files, configuration artefacts, and executable components match their signed build or training pipeline manifests before deployment.

4.5. A conforming system MUST establish and enforce a maximum staleness threshold for each input artefact type, beyond which the artefact is treated as unverified regardless of its prior authenticity status.

4.6. A conforming system SHOULD implement content-level anomaly detection on input artefacts to identify statistical deviations from established baselines, even when cryptographic integrity checks pass.
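One minimal form of the content-level anomaly detection in 4.6 is a z-score test on a summary statistic against a stored baseline. Real deployments would use richer drift detection; the threshold below is an assumption:

```python
import statistics

def anomalous(values: list[float], baseline_mean: float,
              baseline_stdev: float, z_threshold: float = 3.0) -> bool:
    """Flag an incoming batch whose mean drifts beyond the baseline,
    even when cryptographic integrity checks have passed (4.6).
    A single z-score is a minimal stand-in for real drift detection."""
    if baseline_stdev == 0:
        return statistics.mean(values) != baseline_mean
    z = abs(statistics.mean(values) - baseline_mean) / baseline_stdev
    return z > z_threshold
```

In Scenario C, a check of this shape on penalty amounts in the regulatory-decision dataset could have flagged the systematic understatement despite the valid TLS channel.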

4.7. A conforming system SHOULD verify the authenticity of API responses from third-party data providers using response-level signatures, certificate pinning, or equivalent mechanisms beyond transport-layer encryption.
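Response-level verification under 4.7 can be sketched as checking a detached signature over the response body, independent of the TLS channel. The shared-secret HMAC below stands in for a provider-published asymmetric scheme, and the header handling is an assumption:

```python
import hashlib
import hmac

# Placeholder shared secret; real providers would publish a verification
# key for an asymmetric signature carried in a response header.
PROVIDER_KEY = b"provider-shared-secret"

def verify_response(body: bytes, signature_header: str) -> bool:
    """Check a detached signature over the response body (4.7).
    This defends against the man-in-the-middle in Scenario C, where
    TLS alone did not authenticate the response content."""
    expected = hmac.new(PROVIDER_KEY, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)
```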

4.8. A conforming system SHOULD maintain a known-good baseline hash for each reference dataset and compare incoming versions against this baseline before ingestion.

4.9. A conforming system MAY implement multi-source corroboration for high-criticality input artefacts, requiring independent confirmation from at least two sources before acceptance.
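Multi-source corroboration under 4.9 reduces, at minimum, to confirming that independently fetched copies of an artefact are byte-identical. The function below is a hedged sketch of that check:

```python
import hashlib

def corroborated(copies: list[bytes], min_sources: int = 2) -> bool:
    """Accept a high-criticality artefact only when at least `min_sources`
    independently obtained copies hash identically (4.9)."""
    if len(copies) < min_sources:
        return False
    digests = {hashlib.sha256(c).hexdigest() for c in copies}
    return len(digests) == 1
```

Comparing digests rather than raw bytes means the corroborating sources never need to hold each other's copies.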

5. Rationale

The integrity of any governance framework depends entirely on the integrity of the evidence it consumes. An agent governance system that perfectly enforces every control, flawlessly evaluates every risk, and correctly escalates every anomaly is nonetheless compromised if the inputs feeding those processes are fabricated, tampered, or stale. Input artefact authenticity is the epistemological foundation of AI agent governance — it answers the question: "How do we know what we think we know is actually true?"

This dimension addresses a class of attack and failure mode that is distinct from, and complementary to, the concerns addressed by AG-036 (Reasoning Process Integrity) and AG-039 (Active Deception and Concealment Detection). AG-036 ensures the reasoning process itself is sound. AG-039 detects when an agent is actively deceiving its oversight systems. AG-149 ensures that the raw materials — the data, models, configurations, and reference documents — entering the system are themselves genuine. Without AG-149, both AG-036 and AG-039 can be undermined: sound reasoning applied to fabricated inputs produces confidently wrong conclusions; deception detection calibrated against poisoned baselines misses the very deceptions it is designed to catch.

The threat landscape for input artefact tampering is broad and growing. Supply chain attacks on model registries and training data pipelines are documented in academic literature and industry incident reports. Adversarial data poisoning — injecting carefully crafted examples into training or reference datasets — has been demonstrated to cause targeted misclassification with as few as 0.1% of the training set poisoned. Configuration tampering can silently alter governance boundaries. Fabricated compliance evidence can bypass automated verification. Each of these attack vectors exploits the same vulnerability: the absence of artefact-level authenticity verification.

The assurance control type reflects the nature of this dimension: it does not prevent a specific harmful action (that is the role of preventive controls like AG-001) but rather provides ongoing assurance that the evidentiary foundation of the governance system remains trustworthy. Without this assurance, the confidence level of every other governance control is degraded.

6. Implementation Guidance

Input Artefact Authenticity Verification requires establishing a verification pipeline that operates on every input artefact before it enters the governance or agent reasoning process. The pipeline should be architecturally separate from the agent runtime, consistent with the separation principles established in AG-001.

Recommended patterns:

- Verify every artefact's cryptographic hash against a signed manifest before any agent or governance process consumes it (4.2).
- Fail closed: treat artefacts with missing, unverifiable, or expired manifests as failed verification and reject them (4.3).
- Maintain known-good baseline hashes for reference datasets and compare incoming versions before ingestion (4.8).
- Enforce staleness thresholds continuously, with scheduled re-verification, rather than only at deployment time (4.5).
- Run the verification pipeline as an independent service with credentials separate from the agent runtime, and protect signing keys in hardware security modules where available.

Anti-patterns to avoid:

- Validating file size or format as a proxy for authenticity, as in Scenario B.
- Treating transport-layer encryption alone as proof of response authenticity, as in Scenario C.
- Generating signed manifests in the build or training pipeline but never checking them at deployment time.
- Storing signing keys under the same credentials that can write the artefacts they protect, so a single compromise can forge both artefact and signature.
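Putting the core checks together, a minimal admission gate might order integrity verification before staleness enforcement, returning a reason string for the rejection log. Stage names and signatures here are illustrative, not a prescribed interface:

```python
import hashlib
from datetime import datetime, timedelta, timezone

def admit(artefact_id: str, content: bytes, expected_sha256: str,
          fetched_at: datetime, max_age: timedelta) -> tuple[bool, str]:
    """Hypothetical admission gate: every check must pass before the
    artefact reaches any agent or governance process."""
    if hashlib.sha256(content).hexdigest() != expected_sha256:
        return False, "integrity: content hash mismatch"
    if datetime.now(timezone.utc) - fetched_at > max_age:
        return False, "staleness: artefact exceeds threshold"
    return True, "admitted"
```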

Industry Considerations

Financial Services. Market data feeds, counterparty compliance certificates, regulatory reference data, and model weights for risk calculations all require rigorous authenticity verification. MiFID II requires firms to have adequate arrangements for market data quality. The FCA expects firms to verify the integrity of data inputs to algorithmic trading systems. Model risk management frameworks (SR 11-7, SS1/23) require verification of model inputs as part of the model validation process.

Healthcare. Clinical reference datasets, drug interaction databases, patient records, and diagnostic model weights require authenticity verification under HIPAA, the EU Medical Device Regulation, and equivalent frameworks. A poisoned drug interaction database consumed by a clinical decision support agent could result in patient harm.

Critical Infrastructure. Sensor readings, control system configurations, and operational reference data require authenticity verification to prevent attacks on physical safety. IEC 62443 requires data integrity verification for industrial control system inputs.

Maturity Model

Basic Implementation — The organisation maintains a registry of input artefact types with their sources. Cryptographic integrity verification (hash comparison against signed manifests) is implemented for model weights and configuration files. Artefacts failing verification are rejected and logged. Staleness thresholds are defined but enforced only at deployment time, not continuously. Third-party API responses are consumed over TLS but without response-level signature verification. This level meets the minimum mandatory requirements but lacks depth in content-level verification and multi-source corroboration.

Intermediate Implementation — All basic capabilities plus: cryptographic verification extends to all input artefact types including reference datasets, compliance certificates, and API responses. Content-level anomaly detection is implemented for reference datasets with statistical baselines maintained and deviation thresholds defined. Staleness thresholds are enforced continuously with automatic re-verification on a defined schedule. Provenance chains are tracked for model weights and critical datasets. Third-party API responses are verified using certificate pinning or response-level signatures. The verification pipeline operates as an independent service with separate credentials from the agent runtime.

Advanced Implementation — All intermediate capabilities plus: multi-source corroboration is implemented for high-criticality artefacts. Content-level anomaly detection extends to model weights (layer-wise statistical comparison). Provenance chains are tracked for all artefact types with end-to-end signing. The verification pipeline has been subjected to independent adversarial testing including supply chain attack simulation, man-in-the-middle testing, and poisoning detection evaluation. Hardware security modules protect all signing keys. The organisation can demonstrate to regulators a complete chain of custody for every input artefact consumed by every deployed agent.

7. Evidence Requirements

Required artefacts:

- The input artefact registry required by 4.1, including sources, expected formats, and designated verification methods.
- Signed manifests or equivalent provenance records for each consumed artefact.
- Verification and rejection logs, including artefact identifier, source, failure reason, and timestamp (4.3).
- Known-good baseline hashes for each reference dataset (4.8).

Retention requirements:

Access requirements:

8. Test Specification

Test 8.1: Cryptographic Integrity Verification — Modify one byte of a registered artefact after its manifest has been signed, then submit it for verification. Expected: the artefact is rejected and the rejection is logged with identifier, source, failure reason, and timestamp (4.2, 4.3).

Test 8.2: Manifest Absence Defaults to Reject — Submit an artefact for which no signed manifest or provenance record exists. Expected: verification fails closed and the artefact is rejected (4.2).

Test 8.3: Expired Signing Key Rejection — Submit an artefact whose manifest was signed with a key that has passed its validity period. Expected: the signature is treated as invalid and the artefact is rejected.

Test 8.4: Staleness Threshold Enforcement — Allow a previously verified artefact to exceed its staleness threshold. Expected: the artefact is treated as unverified until re-verified (4.5).

Test 8.5: Content-Level Anomaly Detection — Submit a reference dataset with a valid hash and manifest but content that deviates statistically from the established baseline. Expected: the deviation is flagged for review (4.6).

Test 8.6: Verification Pipeline Independence — Attempt to alter verification results or baselines using agent-runtime credentials. Expected: the attempt is denied; the pipeline accepts changes only through its own separately credentialed interface.

Test 8.7: Registry Completeness — Introduce an input artefact of a type not present in the registry. Expected: the artefact is rejected and the unregistered type is flagged (4.1).

Conformance Scoring

9. Regulatory Mapping

Regulation | Provision | Relationship Type
EU AI Act | Article 10 (Data and Data Governance) | Direct requirement
EU AI Act | Article 9 (Risk Management System) | Supports compliance
NIST AI RMF | MAP 2.3, MANAGE 2.4 | Supports compliance
ISO 42001 | Clause 8.2 (AI Risk Assessment), Annex A.7 (Data for AI Systems) | Supports compliance
FCA SYSC | 6.1.1R (Systems and Controls) | Supports compliance
MiFID II | Article 17 (Algorithmic Trading) — data quality requirements | Supports compliance
DORA | Article 9 (ICT Risk Management Framework) | Supports compliance
NIST SSDF | PW.4 (Reuse Existing, Well-Secured Software) | Supports compliance

EU AI Act — Article 10 (Data and Data Governance)

Article 10 requires that training, validation, and testing datasets for high-risk AI systems be subject to appropriate data governance and management practices. This includes examination of data for errors, gaps, and biases. AG-149 extends this requirement to all input artefacts consumed during operation — not just training datasets — and adds cryptographic integrity verification as the primary assurance mechanism. The Act's requirement that datasets be "relevant, sufficiently representative, and to the extent possible, free of errors and complete" inherently requires that the data's authenticity be verified; fabricated data cannot be relevant or representative.

ISO 42001 — Clause 8.2, Annex A.7

Clause 8.2 requires AI risk assessment including consideration of data-related risks. Annex A.7 addresses data quality and provenance for AI systems. AG-149 implements the operational controls necessary to ensure that data consumed by AI systems meets the quality and provenance requirements of the management system.

MiFID II — Article 17

Article 17 requires investment firms using algorithmic trading systems to have effective systems and risk controls. For firms using AI agents in trading, this includes ensuring the integrity of market data and reference data consumed by the agent. Fabricated or tampered market data could lead to erroneous trading decisions, creating market risk and potential market abuse implications.

DORA — Article 9

Article 9 requires financial entities to establish ICT risk management frameworks that ensure the integrity of ICT systems and data. Input artefact authenticity verification is a data integrity control within this framework, ensuring that data consumed by AI agents in financial operations has not been tampered with.

10. Failure Severity

Field | Value
Severity Rating | High
Blast Radius | Organisation-wide — extends to all downstream governance controls and agent decisions that depend on the compromised input artefacts

Consequence chain: When input artefact authenticity verification fails, the immediate consequence is that fabricated, tampered, or stale artefacts enter the governance and agent reasoning pipeline. The downstream impact is unbounded because every governance control that relies on the compromised artefact inherits its compromise. A poisoned reference dataset can recalibrate risk scoring across all agents, affecting thousands of decisions. Tampered model weights can introduce systematic biases that persist until the next model update cycle. Fabricated compliance certificates can enable regulatory violations at scale. The failure is particularly insidious because it may not manifest as an obvious error — the system continues to operate normally, producing outputs that appear correct but are based on fabricated evidence. Detection may not occur until an external audit, regulatory investigation, or counterparty dispute reveals the discrepancy. The business consequences include regulatory enforcement action for inadequate data governance, financial losses from decisions based on fabricated inputs, reputational damage from demonstrably compromised governance processes, and potential personal liability for senior managers who certified the adequacy of controls that lacked input verification.

Cross-references: AG-036 (Reasoning Process Integrity) — ensures the reasoning process applied to verified inputs is itself sound. AG-039 (Active Deception and Concealment Detection) — detects when an agent attempts to conceal the use of unverified inputs. AG-048 (AI Model Provenance) — provides the provenance framework for model weight artefacts that AG-149 verifies. AG-057 (Dataset Suitability and Bias Control) — addresses the quality and representativeness of datasets after their authenticity has been established by AG-149. AG-078 (Benchmark Coverage) — depends on verified evaluation datasets to produce meaningful coverage metrics.

Cite this protocol
AgentGoverning. (2026). AG-149: Input Artefact Authenticity Verification. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-149