The Standard

Compliance

AG-745

Factual Grounding and Hallucination Governance

Output Integrity and Transparency Governance ~23 min read AGS v2.1 · 2026-04-25

EU AI Act NIST AI RMF ISO 42001

1. Definition

Factual grounding and hallucination governance establishes the universal requirement that all agentic systems producing factual claims must implement systematic verification mechanisms to ensure that outputs correspond to verifiable reality rather than statistically plausible confabulations. This dimension is classified as Universal/Core because hallucination risk is inherent to every generative AI deployment regardless of domain, deployment pattern, or risk tier — a general-purpose internal copilot that fabricates a meeting date causes operational disruption just as a financial-value agent that fabricates a regulatory threshold causes compliance violations. The universality of this control reflects the architectural reality that all transformer-based language models are optimised for token-level plausibility rather than factual correspondence, making confabulation a structural property rather than an occasional malfunction.

This dimension differs from AG-742 (Hallucination Detection and Output Grounding Governance) in scope and tier. AG-742, positioned in the Rogue LLM & Operational Completeness Addendum block, addresses hallucination detection as a high-risk/critical control with emphasis on detection infrastructure and audit mechanisms. AG-745, positioned in the Framework Alignment Extension block, establishes hallucination governance as a universal baseline control that applies to all deployments including those that do not meet the risk thresholds triggering AG-742's advanced requirements. AG-745 focuses on the minimum viable grounding practices that every agentic deployment must implement, while AG-742 specifies the intensive detection and remediation infrastructure required for high-risk contexts. Organisations subject to both dimensions must comply with the more stringent requirements of AG-742 where applicable, while AG-745 ensures that no deployment, however low-risk, operates without basic grounding controls.

Failure at the universal tier manifests across the full spectrum of agent deployments. An internal copilot that hallucsinates project deadlines causes teams to misallocate resources. A customer-facing chatbot that fabricates product specifications generates warranty claims and regulatory complaints. A research discovery agent that invents citations wastes researcher time and, if the fabricated citations enter published literature, propagates misinformation through the academic record. A cross-border agent that confabulates customs duty rates causes shipments to be under-declared, triggering penalties and potential criminal liability. The universal nature of this risk means that grounding controls cannot be treated as an advanced-tier luxury — they are a baseline operational requirement for any system that presents generated text as factual information.

Governance at the universal tier requires every deploying organisation to implement, at minimum: a classification of output types into those requiring mandatory grounding and those treated as advisory or creative; a basic verification mechanism appropriate to the deployment context (ranging from simple retrieval cross-referencing for low-risk copilots to multi-layered entailment checking for high-risk agents); explicit uncertainty signalling when the agent cannot ground a claim; and monitoring of grounding failure rates as a core operational metric. The emphasis is on ensuring no deployment operates in a grounding-blind mode where hallucinated content is indistinguishable from verified content in the output delivered to consumers.

The universal applicability of this dimension is reinforced by the EU AI Act's Article 15 requirements for accuracy, which apply to all high-risk AI systems without sector-specific qualification. The Stanford HELM benchmark framework identifies accuracy and calibration as foundational evaluation dimensions for all language model deployments. FCA Consumer Duty obligations under PRIN 2A.5 require that consumer-facing AI systems support consumer understanding — a requirement that is violated when an agent presents hallucinated content as factual without any mechanism for the consumer to distinguish between verified and unverified claims. The aggregate regulatory signal is unambiguous: factual grounding is not an optional enhancement for advanced deployments but a baseline expectation for any system that presents generated content as truthful to humans who may act on it.

2. Scope

This dimension applies to all agentic system deployments where the agent produces output that a reasonable consumer would interpret as factual assertion rather than creative content, opinion, or explicitly speculative material. This includes numerical claims, date and deadline assertions, regulatory and legal references, product specifications, classification codes, named entity claims, and any other output category where the agent's statement can be evaluated against an external ground truth. It applies at all tiers and to all primary profiles without exception.

3. Why This Matters

Factual Grounding and Hallucination Governance addresses a governance gap that, if left unmanaged, creates systemic risk across the agent ecosystem. As AI agents move from experimental deployments to production operations with real-world consequences, the absence of structural controls in this area means that failures scale with the speed and autonomy of the agent population — not at the pace of human review.

Traditional approaches to this governance challenge — contractual obligations, periodic audits, and application-layer policy enforcement — are necessary but insufficient for agentic contexts. Contractual obligations operate on legal timescales; agents operate on millisecond timescales. Periodic audits capture a snapshot; agent behaviour is continuous and dynamic. Application-layer enforcement can be bypassed through prompt injection, reasoning failure, or context manipulation. The AGS approach requires structural enforcement at the infrastructure layer — controls that operate independently of the agent's reasoning process and cannot be circumvented by the agent's own outputs.

The regulatory environment increasingly mandates the controls this dimension specifies. The EU AI Act requires risk management systems proportionate to identified risks. NIST AI RMF requires organisations to map, measure, and manage AI risks through enforceable controls. ISO 42001 requires an AI management system with documented operational procedures. This dimension operationalises these regulatory requirements into specific, testable, infrastructure-enforceable controls — bridging the gap between regulatory intent and technical implementation.

The consequences of absence are illustrated in Section 8 (Failure Scenarios). When this dimension is not implemented, the resulting governance gap permits agent behaviour that can cause material financial loss, regulatory enforcement action, reputational damage, and — in safety-critical deployments — physical harm. The blast radius scales with the agent's access scope and operational autonomy.

4. Requirements

4.1 Output Classification for Grounding Requirements

R1.1: The deploying organisation MUST classify all agent output types into a minimum of three categories: (a) mandatory grounding — output types where factual accuracy is critical and grounding verification is required before delivery; (b) advisory grounding — output types where grounding is recommended and uncertainty must be signalled but delivery is permitted with appropriate caveats; and (c) creative/generative — output types where factual grounding is not expected and the consumer understands the content is generated.

R1.2: The output classification MUST be documented, version-controlled, and reviewed at intervals not exceeding 12 months or whenever the agent's functional scope is materially changed.

R1.3: Where the agent generates output that falls into the mandatory grounding category but cannot be verified against available sources, the agent MUST either withhold the output and return an explicit knowledge-boundary signal, or deliver the output with a prominent unverified-content marker visible to the consumer.

4.2 Minimum Viable Grounding Mechanism

R2.1: Every agentic deployment producing factual assertions MUST implement at least one grounding verification mechanism appropriate to its deployment context. Acceptable mechanisms include, in ascending order of rigour: (a) retrieval cross-referencing against a maintained knowledge base; (b) post-generation entailment checking against source documents; (c) external fact-checking API integration; (d) multi-model consensus verification.

R2.2: The selected grounding mechanism MUST be documented with its known limitations, false-negative rate (if measurable), and the content categories for which it is effective.

R2.3: The deploying organisation MUST NOT deploy an agentic system producing mandatory-grounding output types without any grounding verification mechanism in place. Deployment without grounding controls constitutes non-conformance with this dimension regardless of whether hallucination incidents have been observed.

4.3 Uncertainty Signalling

R3.1: The agent MUST be capable of signalling uncertainty to the consumer when it cannot verify a factual claim. Uncertainty signals MUST be human-readable in consumer-facing contexts and machine-readable in pipeline contexts.

R3.2: Uncertainty signals MUST NOT be suppressed by downstream formatting, summarisation, or presentation layers. Any system that consumes agent output and reformats it for end-user delivery MUST preserve uncertainty signals in the reformatted output.

R3.3: For Customer-Facing and Public Sector / Rights-Sensitive profiles, uncertainty signals MUST be presented in plain language that a non-technical consumer can understand, using standardised phrasing defined in the deployment's grounding policy.

4.4 Grounding Failure Rate Monitoring

R4.1: The deploying organisation MUST track the grounding failure rate — defined as the proportion of mandatory-grounding outputs that fail verification or are delivered with unverified-content markers — as a core operational metric.

R4.2: Grounding failure rate MUST be reported in operational dashboards and reviewed at intervals not exceeding 30 days.

R4.3: The deploying organisation MUST define acceptable grounding failure rate thresholds per deployment context and MUST initiate a control-effectiveness review if the rolling 30-day rate exceeds the defined threshold.

R4.4: For Financial-Value and Safety-Critical profiles, grounding failure rate thresholds MUST be set at the most conservative defensible level and MUST be approved by the designated accountability owner.

4.5 Multi-Step Workflow Grounding Requirements

R5.1: In multi-step agent workflows where the output of one step becomes the input to a subsequent step, grounding verification MUST be applied at each step where factual assertions are generated, not solely at the terminal output stage.

R5.2: Grounding status markers (verified, unverified, partially-verified) MUST be propagated from upstream pipeline steps to downstream steps. No downstream step MUST treat an unverified assertion from an upstream step as verified input.

R5.3: Where an intermediate step generates factual assertions that are consumed by subsequent steps (e.g., an analysis step that produces figures consumed by a report generation step), the grounding status of the intermediate assertions MUST be visible in the final output's grounding metadata.

4.6 Consumer Recourse and Correction

R6.1: The deploying organisation MUST implement a mechanism by which consumers can flag suspected hallucinations or factual errors in agent output.

R6.2: Flagged outputs MUST be reviewed, and confirmed hallucinations MUST be recorded in the grounding incident register and used to improve grounding controls.

R6.3: Where a confirmed hallucination has been delivered to a consumer and may have influenced a decision or action, the deploying organisation MUST take reasonable steps to notify the affected consumer and correct the record.

5. Maturity Model

Basic Implementation — The organisation has documented policies addressing factual grounding and hallucination and has implemented initial controls. Implementation is primarily at the application layer with manual processes for monitoring and response. Logging covers key events but may lack full metadata. Coverage extends to the most critical agent deployments but may not encompass all in-scope systems. Staff are aware of requirements but formal training may be incomplete.

Intermediate Implementation — All Basic capabilities plus: controls are enforced at the infrastructure layer with automated monitoring and alerting. All MUST requirements from Section 4 are implemented with documented evidence. Coverage extends to all in-scope agent deployments. Audit trails are tamper-evident and retained per regulatory requirements. Formal change control governs all configuration changes. Regular review cycles are established and documented. Staff receive formal training and competency is assessed.

Advanced Implementation — All Intermediate capabilities plus: controls have been validated through independent adversarial testing. Real-time dashboards provide operational visibility into compliance status, anomaly detection, and response metrics. The organisation can demonstrate to regulators and counterparties that no known attack vector bypasses the governance controls. Continuous improvement processes incorporate lessons from incidents, testing, and regulatory developments. Integration with related dimensions provides defence-in-depth coverage.

Implementation Patterns

Tamper-evident audit trail. Implement all governance event logging in an append-only, integrity-protected data store independent of the agent runtime. Every governance decision, configuration change, and enforcement action is recorded with full metadata including timestamps, actor identities, and outcomes.

Real-time monitoring with graduated alerting. Deploy monitoring infrastructure that evaluates governance compliance continuously rather than periodically. Implement graduated alert severity levels with defined response procedures for each level, ensuring that critical governance violations trigger immediate automated response.

Scheduled governance review cycle. Establish a formal review cadence (minimum quarterly) that examines governance effectiveness, reviews incident data, assesses emerging risks, and updates policies and controls accordingly. Review outcomes are documented and tracked.

Separation of governance and agent runtime domains. Deploy governance enforcement infrastructure in a security domain separate from the agent runtime. The agent cannot influence governance decisions, modify enforcement configuration, or access governance logs directly. This architectural separation is the foundation for infrastructure-layer enforcement.

Defined escalation paths with human oversight integration. Establish clear escalation procedures for governance events that exceed automated response capability. Human oversight touchpoints are defined, documented, and tested. Override mechanisms require authenticated authorisation with full audit trail.

Anti-Patterns

Governance by instruction rather than infrastructure. Relying on agent system prompts or configuration files to enforce governance controls rather than infrastructure-layer enforcement. Instruction-based controls can be bypassed through prompt injection, context manipulation, or reasoning failure.

Monitoring without enforcement. Implementing detection and logging of governance violations without pre-execution blocking. By the time a violation is logged, the ungoverned action has already executed. Detection is necessary but not sufficient; prevention must be the primary control.

Manual processes for machine-speed operations. Relying on human review processes for governance decisions that occur at machine speed. Agents execute actions in milliseconds; governance controls that depend on human review cycles of hours or days leave gaps that scale with agent autonomy.

6. Test Criteria

Test Case 6.1: Output Classification Compliance

Scenario: Verify that the deployment has a documented output classification and that agent outputs are correctly categorised.
Input: Request the output classification document. Submit 30 test queries spanning all output categories. Verify each output is handled according to its category (grounding applied for mandatory, caveats for advisory, no grounding required for creative).
Expected Outcome: Output classification document exists and is current. Agent behaviour matches the documented classification for all 30 queries.
Pass Criteria: Classification document present and reviewed within 12 months; 90% or higher correct categorisation of test outputs.

Test Case 6.2: Grounding Mechanism Presence and Function

Scenario: Verify that a grounding verification mechanism is operational and producing verifiable results.
Input: Submit 20 queries requiring factual assertions in the mandatory grounding category. For 10, the correct answer is present in the knowledge base; for 10, the correct answer is absent or ambiguous.
Expected Outcome: For present-answer queries, grounding mechanism verifies and delivers output. For absent-answer queries, agent either withholds output or delivers with unverified-content marker.
Pass Criteria: 90% or higher correct grounding verification for present-answer queries; 80% or higher appropriate uncertainty signalling for absent-answer queries.

Test Case 6.3: Uncertainty Signal Preservation

Scenario: Verify that uncertainty signals survive downstream processing and reach the end consumer.
Input: Generate 10 agent outputs that include uncertainty markers. Pass each through the full downstream processing pipeline (formatting, summarisation, presentation layer). Inspect the final consumer-facing output.
Expected Outcome: Uncertainty signals are present and human-readable in the final delivered output for all 10 cases.
Pass Criteria: 100% preservation of uncertainty signals through the delivery pipeline.

Test Case 6.4: Grounding Failure Rate Metric Availability

Scenario: Verify that grounding failure rate is tracked as an operational metric and accessible in dashboards.
Input: Request access to the operational dashboard. Verify the grounding failure rate metric is present, current (updated within the last 24 hours), and segmented by output category.
Expected Outcome: Metric is present, current, and segmented. Historical trend data is available for the past 90 days minimum.
Pass Criteria: Metric present and current; threshold defined and documented; review cadence of 30 days or less documented.

Test Case 6.5: Consumer Correction Mechanism

Scenario: Submit a hallucination flag through the consumer recourse mechanism and verify it is processed.
Input: Generate a known-hallucinated output in a test environment. Submit a hallucination flag through the consumer-facing feedback mechanism. Track the flag through the review process.
Expected Outcome: Flag is received, logged, reviewed by a human, confirmed as a hallucination, and recorded in the grounding incident register within the defined SLA.
Pass Criteria: End-to-end processing within SLA; incident recorded in register with all required fields.

Test Case 6.6: Multi-Step Workflow Grounding Propagation

Scenario: Verify that grounding signals are preserved across multi-step agent workflows where output from one step becomes input to the next.
Input: Execute a 4-step workflow where each step consumes the previous step's output. Introduce an unverified assertion in Step 1's output with an appropriate unverified-content marker. Trace the marker through Steps 2, 3, and 4.
Expected Outcome: The unverified-content marker propagates through all subsequent steps. No downstream step treats the unverified assertion as verified. The final output clearly indicates the unverified status.
Pass Criteria: 100% marker propagation through all pipeline stages; no elevation of unverified content to verified status.

Test Case 6.7: Knowledge Boundary Signalling Under Data Gaps

Scenario: Query the agent about topics where the knowledge base has known gaps and verify appropriate boundary signalling.
Input: Submit 15 queries in areas where the agent's knowledge base is known to be incomplete or absent. Include queries that are superficially similar to well-covered topics but target specific gaps.
Expected Outcome: The agent either declines to answer with an explicit knowledge-boundary signal or provides a response with a prominent unverified-content marker for at least 80% of gap-targeted queries.
Pass Criteria: Knowledge boundary or unverified-content signalling present for 80% or more of gap-targeted queries; zero high-confidence responses for topics with known knowledge gaps.

Evidence Artefacts

7.1 Output classification document defining mandatory grounding, advisory grounding, and creative/generative categories. Version-controlled with review date within 12 months. Retention: 5 years.

7.2 Grounding mechanism documentation including mechanism type, known limitations, and applicable content categories. Retention: 5 years.

7.3 Grounding verification logs for all mandatory-grounding outputs, including verification outcome and any unverified-content markers applied. Retention: 3 years for standard deployments; 7 years for Financial-Value, Safety-Critical, and Public Sector deployments.

7.4 Grounding failure rate metric history with daily granularity. Retention: 3 years.

7.5 Grounding incident register recording all confirmed hallucination events with content, detection method, consumer impact assessment, and remediation action. Retention: 7 years.

7.6 Consumer feedback and flag records for hallucination reports. Retention: 3 years.

7.7 Grounding policy review records demonstrating compliance with review cadence requirements. Retention: 5 years.

7.8 Multi-step workflow grounding propagation records demonstrating that unverified-content markers are preserved across pipeline stages. Retention: 3 years.

7.9 Knowledge boundary signal logs recording instances where the agent declined to answer due to insufficient grounding evidence, including the query, the boundary signal type, and the user notification delivered. Retention: 3 years.

7.10 Grounding mechanism effectiveness reports from periodic evaluations, including false-negative rates, domain-specific performance, and comparison against baseline metrics. Retention: 5 years.

7. Scoring

Score	Level	Description
0	No implementation	No factual grounding and hallucination governance exists. The organisation has no controls, policies, or monitoring in place for the capabilities this dimension governs. Agent behaviour in this area is ungoverned.
1	Basic	Basic detection mechanisms exist but operate at the application layer. Detection may be manual, periodic, or threshold-based without real-time monitoring. Alerts are generated but may lack automated response. Coverage is partial — not all relevant agent behaviours or data flows are monitored.
2	Infrastructure-layer enforcement	Detection is enforced at the infrastructure layer with real-time monitoring across all relevant agent behaviours and data flows. Automated alerting with structured response procedures. Detection logic operates in a separate security domain from the agent runtime. Full audit trail with tamper-evident logging.
3	Verified by independent adversarial testing	All Level 2 capabilities are in place and have been validated through independent adversarial testing. An independent party has attempted to bypass, circumvent, or degrade the governance controls using known attack techniques relevant to this dimension and has failed. Test results are documented, reproducible, and available for regulatory review.

8. Failure Scenarios

Example 3.1 — Internal Copilot, Hallucinated Project Status and Deadlines

A technology company deploys a general-purpose internal copilot integrated with project management, email, and calendar systems for use by 2,800 employees. A programme manager asks the copilot to summarise the current status and next milestones for a cross-functional product launch involving 6 workstreams. The copilot retrieves partial information from the project management system — 4 of 6 workstreams have current status updates — but rather than flagging the missing data, it generates plausible status summaries for the 2 workstreams with no current data, extrapolating from 3-month-old records and presenting the extrapolated content with the same formatting and confidence as the retrieved current data. The fabricated status for one workstream indicates "hardware certification on track for May 15 delivery" when, in reality, the certification process had been paused 6 weeks earlier pending a redesign. The programme manager presents the copilot's summary at a senior leadership review, decisions are made to proceed with marketing commitments and channel partner agreements based on the May 15 date, and USD 340,000 in non-refundable commitments are made before the certification delay surfaces through direct human communication 11 days later. The company must renegotiate partner agreements, delay the launch by 8 weeks, and absorb the sunk costs. The root cause is a copilot that was deployed without any mechanism to distinguish between retrieved current data and generated gap-filling content, and without any uncertainty signal when operating beyond its verified knowledge.

Example 3.2 — Cross-Border Agent, Fabricated Customs Classification Codes

A logistics company deploys a cross-border trade agent to assist customs brokers with harmonised system (HS) classification for shipments. A broker queries the agent for the correct HS code for a specialised industrial adhesive containing both epoxy resin and nano-silica fillers. The agent, encountering a product composition at the boundary of multiple classification categories, generates an HS code (3506.91.00) with a detailed justification referencing specific General Interpretive Rules and Explanatory Notes. The code and justification are plausible and well-formatted but incorrect — the product's nano-silica content triggers a different classification (3824.99.92) with a materially different duty rate (6.5% versus 3.2%). The broker, relying on the agent's confident and well-reasoned output, uses the fabricated classification for 47 shipments over a 4-month period. A customs audit identifies the systematic misclassification. The company faces back-duty assessments of EUR 218,000, administrative penalties of EUR 65,000, and a 12-month enhanced scrutiny designation that slows all future shipments through extended examination procedures, costing approximately EUR 15,000 per month in delays. The agent had no verification mechanism to cross-check generated HS codes against the authoritative tariff schedule database, and no uncertainty signal despite operating in a boundary case where classification ambiguity was high.

Example 3.3 — Safety-Critical Agent, Hallucinated Medication Interaction Warning

A hospital network deploys a safety-critical agent as a clinical decision support tool to assist pharmacists with medication interaction checking for complex multi-drug regimens. A pharmacist queries the agent about potential interactions between five medications prescribed for an ICU patient with multiple comorbidities. The agent correctly identifies three genuine interactions documented in the DrugBank database and the hospital's formulary. However, it also generates a fourth interaction warning — between two of the medications — that does not exist in any clinical database, published literature, or FDA adverse event report. The fabricated interaction is presented in the same structured format as the genuine interactions, complete with a plausible-sounding mechanism of action ("competitive inhibition at CYP3A4 substrate binding site"), a severity rating ("moderate — monitor closely"), and a citation to a clinical pharmacology textbook chapter that exists but does not contain the claimed interaction. The pharmacist, managing a 12-patient workload, accepts the agent's output as a complete and accurate interaction report. Based on the fabricated interaction warning, the pharmacist recommends to the attending physician that one of the five medications be substituted. The substitute medication is less effective for the patient's condition, resulting in a 4-day extension of the ICU stay and a treatment complication that requires additional intervention. Total incremental treatment cost: USD 47,000. The incident is identified during a routine pharmacy quality review 3 weeks later. The hospital discovers that the agent has generated 14 other fabricated interaction warnings over the preceding 2 months, 6 of which resulted in unnecessary medication substitutions. No grounding verification mechanism was applied to the agent's interaction outputs, and no uncertainty signal distinguished the fabricated interaction from the genuine ones.

9. Regulatory Mapping

Regulation	Provision	Relationship Type
OWASP LLM Top 10	LLM09 — Misinformation	_Pending v2.1 editorial review_
MITRE ATLAS	AML.T0048 — Evade ML Model (output integrity)	_Pending v2.1 editorial review_
EU AI Act	Article 15 — Accuracy, Robustness and Cybersecurity	_Pending v2.1 editorial review_
NIST AI RMF	MEASURE 2.5 (Accuracy), MANAGE 2.2 (Risk Controls)	_Pending v2.1 editorial review_
ISO/IEC 42001	Clause 8.4 (AI System Operation)	_Pending v2.1 editorial review_
UK AISI Inspect	Factual accuracy evaluations	_Pending v2.1 editorial review_
FCA	PRIN 2A.5 — Consumer understanding outcome	_Pending v2.1 editorial review_
Stanford HELM	Accuracy, Calibration, Robustness dimensions	_Pending v2.1 editorial review_

AG-004 — Output Validation and Sanitisation: Output validation provides the structural enforcement layer for grounding controls; factual grounding is a specific validation criterion within the broader output quality framework.
AG-011 — Knowledge Boundary Enforcement: Knowledge boundaries define where the agent should refuse to answer; factual grounding governs the quality of answers within those boundaries.
AG-019 — Confidence Scoring and Uncertainty Quantification: Confidence scoring provides the numerical foundation for grounding threshold decisions; AG-745 specifies how those signals must be surfaced to consumers.
AG-742 — Hallucination Detection and Output Grounding Governance: AG-742 provides the intensive detection infrastructure for high-risk deployments; AG-745 provides the universal baseline that applies to all deployments.
AG-750 — Decision Confidence Calibration Governance: Confidence calibration ensures that grounding signals are reliably correlated with actual accuracy, preventing overconfident delivery of ungrounded content.
AG-103 — Audit Trail Integrity: Grounding verification logs, incident records, and consumer feedback constitute audit trail components that must be stored with tamper-evident integrity controls.
AG-047 — Retrieval-Augmented Generation Controls: RAG controls provide the retrieval infrastructure that many grounding verification mechanisms depend upon; the quality of grounding verification is bounded by the quality of the retrieval layer.

Interaction with AG-742

AG-745 and AG-742 govern related but distinct aspects of hallucination and grounding governance. AG-742, positioned in the Rogue LLM & Operational Completeness Addendum block at High-Risk/Critical tier, provides intensive detection infrastructure including claim decomposition, citation resolution, layered verification architecture, and structured human review interfaces. AG-745, positioned in the Framework Alignment Extension block at Universal/Core tier, establishes the baseline grounding requirements that apply to all deployments regardless of risk tier. Organisations should assess compliance with both dimensions where applicable: AG-745 for the universal baseline, and AG-742 for the additional advanced controls required by high-risk deployments. Where both apply, the more stringent requirement in each area governs.

Cite this protocol

AgentGoverning. (2026). AG-745: Factual Grounding and Hallucination Governance. The Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-745

← Previous

AG-744

Retrieval Augmented Generation Security Governance

Next Protocol →

AG-746

Conservative Action Bias Governance