Output Integrity Verification requires that every AI agent output consumed by external systems, communicated to humans, or used as input to further automated processes is validated against known reference data before propagation. The protocol addresses one of the most consequential risks in AI agent deployment: the propagation of fabricated, hallucinated, or incorrect outputs into enterprise systems where they are treated as authoritative facts and acted upon without further verification. Validation must occur at the output boundary — the point where the agent's output becomes an input to something else — covering numerical values, named entities, regulatory citations, and structured data. Outputs that fail validation or fall below confidence thresholds are blocked or routed to human review before they can enter the enterprise data ecosystem.
Scenario A — Hallucinated Regulatory Citation Triggers Non-Compliance: A compliance AI agent at a European investment firm produces a quarterly regulatory report for submission to the national competent authority. The report includes 340 compliance assessments across 12 regulatory requirements. The agent cites specific regulation articles to support each assessment — for example, "Compliant with MiFID II Article 27(3)(b) regarding best execution reporting." The report is submitted to the regulator. During a routine supervisory review, the regulator's team notes that "MiFID II Article 27(3)(b)" does not exist. Article 27(3) has only sub-paragraph (a). The agent fabricated the sub-paragraph reference. The regulator broadens the review and discovers that 14 of the 340 citations contain errors: some reference non-existent sub-paragraphs, others cite articles from superseded versions of the regulation, and two cite entirely fabricated regulation numbers.
What went wrong: No validation against the actual regulation text occurred before the report was submitted. The agent hallucinated specific regulatory sub-articles that were structurally similar to real citations, making them plausible to human readers. The compliance team treated the agent's output as authoritative and submitted it without verifying the underlying citations. Consequence: A regulatory investigation, not into the underlying compliance failures but into the submission of a report containing fabricated regulatory references. The firm must retrospectively verify every regulatory report produced by the agent since deployment, a process that takes three months and 800 person-hours.
Scenario B — Fabricated Counterparty Name Creates Reconciliation Failure: A transaction processing agent generates a batch of 2,300 reconciliation records matching incoming payments to expected counterparties. One record matches a payment of £87,000 to a counterparty named "Meridian Capital Partners LLP." The reconciliation is marked as complete and the payment is posted. Three days later, the finance team discovers that there is no counterparty called "Meridian Capital Partners LLP" in the organisation's counterparty database. The agent fabricated the counterparty name based on partial pattern matching with a real counterparty named "Meridian Capital Management Ltd." The £87,000 payment is actually from an unknown source and should have been flagged for AML investigation, not reconciled.
What went wrong: The agent generated a counterparty name that was plausible but not validated against the authoritative counterparty registry. The fabricated name was close enough to a real counterparty that it appeared legitimate in the reconciliation report. No named entity validation occurred before the reconciliation was finalised. Consequence: An £87,000 payment from an unverified source was accepted and reconciled without AML investigation. Depending on the source, this may constitute a money laundering facilitation offence. Potential FCA enforcement action for failure to apply adequate AML controls.
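The check that would have caught this fabrication is an exact match against the authoritative registry, with near-miss detection to surface the "close but not identical" names that are a classic fabrication signature. A minimal sketch, assuming an in-memory registry (all names illustrative):

```python
from difflib import get_close_matches

# Illustrative stand-in for the authoritative counterparty registry.
COUNTERPARTY_REGISTRY = {
    "Meridian Capital Management Ltd",
    "Halcyon Trading Partners LLP",
}

def validate_counterparty(name: str) -> dict:
    """Exact-match a counterparty name against the registry; flag near-misses."""
    if name in COUNTERPARTY_REGISTRY:
        return {"name": name, "status": "valid"}
    # A close-but-not-exact match suggests the agent pattern-matched a real entity.
    near = get_close_matches(name, COUNTERPARTY_REGISTRY, n=1, cutoff=0.6)
    return {
        "name": name,
        "status": "rejected",
        "suspected_confusion_with": near[0] if near else None,
    }
```

Reporting the suspected real entity alongside the rejection gives the human reviewer the context needed to decide whether this is a typo, a fabrication, or a genuinely unknown payer requiring AML escalation.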
Scenario C — Out-of-Range Risk Score Fails to Trigger Escalation: A risk assessment agent produces daily risk scores for the organisation's portfolio positions. The valid range for risk scores is 0–100. Due to a data input anomaly, the agent produces a risk score of -47 for a critical portfolio. The negative value is stored in the risk system without validation. Because the risk escalation threshold is set at scores above 80, the negative score does not trigger an escalation — it is treated as a very low risk. In reality, the data anomaly that caused the negative score also masks a genuine risk that should have been escalated. The risk materialises three days later with a £2.1 million loss.
What went wrong: The risk score output was not validated against the known valid range. A negative risk score — which is outside any valid range — was accepted and stored without challenge. The downstream escalation system operated on the unvalidated score. Consequence: A £2.1 million loss that would have been prevented by a simple range check. Regulatory investigation reveals that the risk system accepted negative risk scores, calling into question the integrity of all risk assessments produced by the system.
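The range check that would have prevented this loss is a few lines of code at the output boundary. A sketch (function name and bounds illustrative, matching the 0–100 range from the scenario):

```python
def validate_risk_score(score: float, low: float = 0.0, high: float = 100.0) -> float:
    """Reject any score outside the documented valid range instead of storing it."""
    if not (low <= score <= high):
        raise ValueError(f"risk score {score} outside valid range [{low}, {high}]")
    return score
```

The point is not sophistication but placement: the check must sit between the agent and the risk system, so an out-of-range value is blocked rather than silently treated as "very low risk".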
Scope: This dimension applies to all agent outputs that are consumed by external systems, communicated to humans, or used as inputs to further automated processes. This includes: numerical outputs (financial figures, risk scores, performance metrics), named entities (counterparty names, account numbers, regulatory references, product identifiers), textual outputs (compliance reports, risk assessments, recommendations), and structured data outputs (API responses, database records, file exports). The scope extends to intermediate outputs that feed into downstream agent processes. If Agent A produces an output that is consumed by Agent B as input, Agent A's output is within scope because it influences Agent B's behaviour. This is particularly important in multi-agent architectures where errors in early-stage outputs can compound through subsequent processing stages, creating cascading failures that are difficult to trace back to the original fabrication. The scope does not extend to the agent's internal reasoning artefacts — working memory, intermediate calculations, and draft outputs that are not propagated externally. These are governed by AG-036. AG-018 applies at the output boundary — the point where the agent's output becomes an input to something else.
4.1. A conforming system MUST validate numerical outputs against known reference ranges before propagation — every numerical value in a governance-relevant output must be checked against defined acceptable ranges, and out-of-range values must be flagged for human review before being released to downstream systems.
4.2. A conforming system MUST validate named entities (counterparties, account numbers, regulatory references) against authoritative registries — every named entity must be verified against a maintained registry of known valid entities, and unrecognised entities must be flagged before propagation.
4.3. A conforming system MUST flag low-confidence outputs for human review before propagation — outputs where the agent's own confidence assessment falls below a defined threshold must not be released to downstream systems without human review and approval.
4.4. A conforming system MUST implement output validation at the governance layer, independent of the agent that produced the output — the agent must not be the sole validator of its own outputs.
4.5. A conforming system MUST log every validation decision — pass, fail, and the specific reference data used — creating an auditable record of output verification.
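Requirements 4.1 through 4.5 compose naturally into a single governance-layer validator that sits outside the producing agent. The sketch below is illustrative only: the field names, registry contents, and confidence threshold are assumptions, not values mandated by the protocol.

```python
import json
import logging
from dataclasses import dataclass, field

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("output-validation")        # 4.5: auditable record

RANGES = {"risk_score": (0, 100)}                   # 4.1: reference ranges
ENTITY_REGISTRY = {"Meridian Capital Management Ltd"}  # 4.2: entity registry
CONFIDENCE_THRESHOLD = 0.85                         # 4.3: illustrative threshold

@dataclass
class ValidationResult:
    passed: bool
    reasons: list = field(default_factory=list)

def validate_output(output: dict) -> ValidationResult:
    """Governance-layer check, independent of the producing agent (4.4)."""
    reasons = []
    for name, (lo, hi) in RANGES.items():           # 4.1: numerical ranges
        if name in output and not lo <= output[name] <= hi:
            reasons.append(f"{name}={output[name]} outside [{lo}, {hi}]")
    for entity in output.get("entities", []):       # 4.2: named entities
        if entity not in ENTITY_REGISTRY:
            reasons.append(f"unrecognised entity: {entity!r}")
    if output.get("confidence", 1.0) < CONFIDENCE_THRESHOLD:  # 4.3
        reasons.append("confidence below threshold; route to human review")
    result = ValidationResult(passed=not reasons, reasons=reasons)
    log.info(json.dumps({"passed": result.passed, "reasons": reasons}))  # 4.5
    return result
```

In production the registries and ranges would be served from the reference data layer rather than module constants, and the log line would feed the tamper-evident audit trail governed by AG-006.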
4.6. A conforming system SHOULD detect fabricated regulatory citations through cross-reference with known rulebooks — the platform should maintain a library of valid regulatory citations and verify every citation in agent outputs against this library.
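Citation cross-referencing from 4.6 can be sketched as a parse-then-lookup step. The regex and library fragment below are illustrative assumptions covering only the MiFID II citation form used in Scenario A; a real library would be generated from the regulation text and cover every citation format the agent emits.

```python
import re

# Illustrative fragment of a valid-citation library. Per Scenario A,
# MiFID II Article 27(3) has only sub-paragraph (a).
CITATION_LIBRARY = {
    ("MiFID II", "27", "3", "a"),
}

CITATION_RE = re.compile(
    r"(?P<reg>MiFID II) Article (?P<art>\d+)"
    r"(?:\((?P<para>\d+)\))?(?:\((?P<sub>[a-z])\))?"
)

def find_invalid_citations(text: str) -> list:
    """Return every citation in `text` that is absent from the library."""
    invalid = []
    for m in CITATION_RE.finditer(text):
        key = (m.group("reg"), m.group("art"), m.group("para"), m.group("sub"))
        if key not in CITATION_LIBRARY:
            invalid.append(m.group(0))
    return invalid
```

Because hallucinated citations are structurally well-formed, parsing alone cannot detect them; only the lookup against an authoritative library catches a fabricated sub-paragraph like 27(3)(b).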
4.7. A conforming system SHOULD enforce structured output schemas rather than free text wherever possible — structured outputs are easier to validate field-by-field than free-text outputs where validation requires natural language understanding.
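One way to realise 4.7 is to define each output type as a schema whose constructor rejects invalid field values, so free text never reaches the boundary. A minimal sketch using a dataclass (type and field names are illustrative):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ReconciliationRecord:
    """A structured output: each field is individually checkable."""
    payment_ref: str
    amount_gbp: float
    counterparty: str

    def __post_init__(self):
        # Field-level checks run at construction; an invalid record
        # cannot exist, let alone propagate downstream.
        if self.amount_gbp <= 0:
            raise ValueError("amount must be positive")
        if not self.payment_ref:
            raise ValueError("payment reference required")
```

The same idea scales to JSON Schema or protobuf definitions shared with downstream consumers; the essential property is that validation is per-field and mechanical rather than requiring natural language understanding.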
4.8. A conforming system SHOULD implement output anomaly detection that compares against prior similar outputs — the platform should maintain a baseline of typical outputs for each output type and flag outputs that deviate significantly from the baseline.
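The baseline comparison in 4.8 can start as simply as a z-score against recent outputs of the same type. A sketch, assuming a history list maintained per output type (threshold value illustrative):

```python
from statistics import mean, stdev

def is_anomalous(value: float, history: list, z_threshold: float = 3.0) -> bool:
    """Flag values more than z_threshold standard deviations from the baseline."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold
```

Note that anomaly detection complements, rather than replaces, the hard range checks of 4.1: Scenario C's score of -47 fails both, but a fabricated value of 12 in a portfolio that normally scores near 50 would pass the range check and be caught only here.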
4.9. A conforming system MAY implement secondary validation by an independent system (a separate language model, a rule-based expert system, or a human review panel) for high-stakes outputs.
Output Integrity Verification addresses one of the most consequential risks in AI agent deployment: the propagation of fabricated, hallucinated, or incorrect outputs into enterprise systems where they are treated as authoritative facts. The critical distinction is between reasoning integrity and output integrity. AG-036 (Reasoning Integrity Monitoring) governs the process by which an agent produces its outputs — ensuring that the reasoning chain is sound, consistent, and aligned with the agent's objectives. AG-018 governs the outputs themselves, regardless of the process that produced them. An agent may follow a perfectly sound reasoning chain and still produce an output that contains a fabricated regulatory citation, an out-of-range financial figure, or a nonexistent counterparty name. AG-036 would not detect this error because the reasoning process was sound — the error is in the data the agent generated, not in how it generated it.
This protocol is essential because AI agents, particularly those built on large language models, have a well-documented tendency to generate plausible but false information — commonly known as hallucination. In conversational contexts, hallucination is an inconvenience. In enterprise governance contexts, hallucination is a liability. A hallucinated regulatory citation in a compliance report can lead to non-compliance with the actual regulation. A fabricated counterparty name in a transaction record can create confusion in reconciliation. An out-of-range financial figure in a risk report can trigger — or fail to trigger — critical risk management actions.
The cascading nature of the failure is particularly severe. A fabricated output that enters an enterprise data pipeline may be consumed by multiple downstream systems, each of which treats it as authoritative. A fabricated counterparty name may appear in reconciliation records, risk reports, regulatory filings, and client statements. By the time the fabrication is detected, it may have propagated through dozens of systems and documents, requiring extensive remediation to correct. The failure also has a trust dimension. Once an organisation discovers that agent outputs contain fabricated data, the credibility of all agent outputs is called into question. Regulators, clients, and internal stakeholders may require retrospective verification of all historical outputs — a process that can be enormously expensive and may not be fully achievable if the reference data from the relevant period is no longer available.
AG-018 establishes the principle that no agent output should be treated as authoritative until it has been validated against known reference data. This validation must occur before the output leaves the governance system — not after it has propagated to downstream consumers.
Build a reference data layer containing: valid numerical ranges for all governed output types, a registry of known counterparties and account identifiers, and a library of valid regulatory citations. Validate every structured output field before releasing it to downstream systems. Tag outputs with confidence scores and route low-confidence outputs to a human review queue.
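The routing step above can be sketched as a gate that only releases outputs that are both validated and above the confidence threshold; everything else is held. The queue here is a stand-in for a real review workflow, and the threshold is an illustrative assumption:

```python
from queue import Queue

REVIEW_QUEUE: "Queue[dict]" = Queue()   # stand-in for the human review workflow
CONFIDENCE_THRESHOLD = 0.85             # illustrative

def route(output: dict) -> str:
    """Release validated, high-confidence outputs; hold everything else."""
    if output.get("validated") and output.get("confidence", 0.0) >= CONFIDENCE_THRESHOLD:
        return "released"
    REVIEW_QUEUE.put(output)            # held until a human approves (4.3)
    return "held_for_review"
```

The design choice worth noting is fail-closed behaviour: an output missing its validation flag or confidence score is held, never released by default.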
Recommended patterns:
Anti-patterns to avoid:
Financial Services. Financial services firms face specific risks from fabricated outputs in: transaction processing (incorrect amounts, fabricated counterparty details), risk management (invalid risk scores, fabricated position data), regulatory reporting (hallucinated regulatory citations, incorrect compliance assessments), and client communications (misleading recommendations, fabricated performance figures). The validation reference data layer should integrate with: the firm's counterparty master data, the firm's product reference data, market data providers for price and rate validation, and regulatory reference databases (such as the FCA handbook and EU Official Journal). The FCA's expectations for data accuracy in automated systems are high and increasing.
Healthcare. Healthcare AI agents producing clinical outputs face unique validation requirements. Drug names must be validated against pharmacopoeias (e.g., BNF, FDA Orange Book). Dosage recommendations must be validated against approved dosage ranges for the specific drug, indication, and patient population. Clinical coding outputs (ICD-10, SNOMED CT) must be validated against the current code sets. Lab value interpretations must be validated against reference ranges appropriate to the patient's age, sex, and clinical context. The consequences of fabricated clinical outputs are direct patient safety risks, making AG-018 compliance a patient safety requirement, not just a governance requirement.
Critical Infrastructure. AI agents in critical infrastructure producing operational outputs — set points, control parameters, alarm thresholds — must have those outputs validated against physical safety limits before they are applied to control systems. A fabricated set point for a chemical process temperature, a hallucinated pressure threshold, or an out-of-range flow rate could have immediate physical safety consequences. Validation reference ranges must be derived from engineering safety analyses and must include margins of safety. The validation must be implemented at the control system layer, not solely in the AI agent's software environment, providing defence in depth against propagation of invalid outputs to physical actuators.
Basic Implementation — The organisation has defined reference ranges for numerical outputs and maintains a registry of known named entities. Every agent output is validated against these references before propagation. Out-of-range values and unrecognised entities are flagged and routed to a human review queue. Low-confidence outputs (based on the agent's own confidence scoring) are held for review. The validation is implemented as a post-processing step in the agent's output pipeline. This level meets the minimum mandatory requirements but has limitations: the reference ranges are static and may become stale, the named entity registry may not cover all entities the agent references, and the confidence threshold may not correlate well with actual output accuracy.
Intermediate Implementation — All basic capabilities plus: reference ranges are dynamically updated from authoritative sources on a defined schedule. The named entity registry is integrated with external authoritative registries (company registries, regulatory entity databases, account verification services). Regulatory citation validation cross-references against a maintained library of current and historical regulations. Output anomaly detection compares each output against a baseline of similar outputs and flags statistical outliers. Structured output schemas are enforced wherever possible, enabling field-level validation. The validation pipeline logs every check result — pass, fail, and the specific reference data used — creating an auditable record of output verification.
Advanced Implementation — All intermediate capabilities plus: secondary validation by an independent system (which may be a separate language model, a rule-based expert system, or a human review panel) is implemented for high-stakes outputs. Output validation has been independently tested with deliberately fabricated outputs — hallucinated citations, out-of-range values, fictional counterparties — and the detection rate is measured and reported. The reference data layer is version-controlled and its update history is auditable. False positive and false negative rates are tracked and the validation thresholds are tuned based on observed performance. The organisation can demonstrate to regulators that its agent outputs are verified against authoritative reference data before propagation, and can report the detection rate for known fabrication types.
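The detection-rate measurement described for the advanced level reduces to injecting known fabrications and counting how many the validator rejects. A sketch of the metric (the validator interface is an assumption: a callable returning True when an output passes validation):

```python
def detection_rate(validator, fabricated_samples: list) -> float:
    """Fraction of known-fabricated outputs that the validator rejects."""
    detected = sum(1 for sample in fabricated_samples if not validator(sample))
    return detected / len(fabricated_samples)
```

Running this over curated corpora of hallucinated citations, out-of-range values, and fictional counterparties yields the per-fabrication-type detection rates that the advanced implementation reports to regulators, and tracking the complementary false positive rate on known-good outputs supports threshold tuning.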
Required artefacts:
Retention requirements:
Access requirements:
Testing AG-018 compliance requires a systematic approach that covers each validation type with both positive and negative test cases, including adversarial fabrication scenarios.
Test 8.1: Numerical Range Validation
Test 8.2: Named Entity Validation
Test 8.3: Regulatory Citation Validation
Test 8.4: Confidence Threshold Enforcement
Test 8.5: Anomaly Detection
Test 8.6: Adversarial Fabrication Resistance
Test 8.7: Validation Independence From Agent
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 13 (Transparency) | Direct requirement |
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| FCA Conduct Risk | Misleading Outputs / TCF Outcomes | Direct requirement |
| SOX | Section 302/404 (Accuracy of Financial Reporting) | Direct requirement |
| NIST AI RMF | MEASURE 2.3, MANAGE 2.2 | Supports compliance |
| ISO 42001 | Clause 8.2 (AI Risk Assessment) | Supports compliance |
Article 13 requires that high-risk AI systems be designed and developed in such a way that their operation is sufficiently transparent to enable users to interpret the system's output and use it appropriately. For agent outputs, this means that users must be able to assess the reliability of the output. AG-018 supports Article 13 compliance by ensuring that outputs have been validated against reference data before reaching the user, and by flagging low-confidence or unverified outputs. The validation metadata — which checks passed, which failed, what the confidence score was — constitutes transparency information that enables users to calibrate their reliance on the output. Without AG-018, users receive agent outputs with no indication of whether the content has been verified, undermining their ability to use the output appropriately.
Article 9 requires a risk management system that identifies and analyses foreseeable risks. Hallucinated outputs represent a foreseeable risk for any AI agent producing governance-relevant outputs. AG-018 implements the risk mitigation measure for this specific risk category, ensuring that fabricated content is detected and blocked before it causes harm.
The FCA's conduct risk framework requires firms to ensure that communications with clients and counterparties are clear, fair, and not misleading. When AI agents produce outputs that are communicated to clients — whether directly (as in customer-facing chatbots) or indirectly (as inputs to client reports or recommendations) — fabricated or inaccurate outputs constitute misleading communications. AG-018's validation requirements ensure that agent outputs are verified against authoritative reference data before they reach any client-facing channel. The Treating Customers Fairly (TCF) outcomes are directly relevant. TCF Outcome 3 (consumers provided with clear information) and TCF Outcome 5 (consumers provided with products that perform as expected) are both undermined by unverified agent outputs that contain fabricated data.
SOX Section 302 requires officers to certify the accuracy of financial reports, and Section 404 requires effective internal controls over financial reporting. When AI agents contribute to the financial reporting process — producing figures, calculations, or data that feeds into financial statements — AG-018 ensures that the agent's outputs are validated before they enter the reporting pipeline. A SOX auditor will assess whether the controls over AI agent outputs are sufficient to prevent material misstatement. Unvalidated agent outputs that feed into financial reports represent a control deficiency. If the validation gap could result in a material misstatement, it is a material weakness that must be disclosed. The validation must be documented, tested on a defined schedule, and the test results retained. The reference data used for validation must itself be authoritative and current. The validation process must be independent of the agent that produced the output.
MEASURE 2.3 addresses the assessment of AI system outputs for accuracy and reliability. MANAGE 2.2 addresses risk mitigation through enforceable controls. AG-018 supports compliance by implementing structural validation of outputs against authoritative reference data, providing both measurement and management of output integrity risk.
Clause 8.2 requires AI risk assessment including identification of risks arising from AI system outputs. Fabricated or hallucinated outputs represent a primary risk category. AG-018 provides the risk treatment control for output integrity, directly satisfying the requirement for controls addressing output-related risks within the AI management system.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Organisation-wide — potentially cross-organisation where agent outputs feed into external regulatory filings, client communications, or shared data pipelines |
Consequence chain: Without output integrity verification, fabricated financial figures, incorrect counterparty details, or hallucinated regulatory references propagate through enterprise systems, creating liability for decisions made on false data. The failure mode is insidious because fabricated outputs are often plausible: they look like real data, they are structurally consistent with real data, and they are presented with the same formatting and confidence as real data. Without validation, recipients of agent outputs have no way to distinguish genuine from fabricated content. The cascade is particularly severe: a fabricated output entering an enterprise data pipeline may be consumed by multiple downstream systems, each treating it as authoritative, and by the time the fabrication is detected it may have spread through dozens of systems, documents, and filings, requiring extensive remediation. The immediate technical failure is an unverified output entering production systems. The operational impact includes regulatory enforcement action for reports containing fabricated data, material financial loss from decisions based on incorrect figures, AML compliance failures from fabricated counterparty reconciliation, and reputational damage when clients or regulators discover that agent outputs were not verified. The trust dimension compounds the impact: once fabrication is discovered, all historical agent outputs must be retrospectively verified, a process that can require months and thousands of person-hours.
Cross-references: AG-036 (Reasoning Integrity Monitoring) governs the reasoning process that produces outputs, complementing AG-018's validation of the outputs themselves. AG-049 (Explainability) ensures that the reasoning behind validated outputs can be explained. AG-006 (Audit Trail Integrity) ensures that validation decisions and outputs are recorded in a tamper-evident audit trail. AG-019 (Mandatory Human Oversight Enforcement) defines the human oversight requirements for outputs flagged by AG-018 validation. AG-013 (Data Sensitivity Classification and Handling) governs the sensitivity classification of output content, ensuring validated outputs are handled appropriately based on their data classification.