AG-456

External Statement Approval Governance

Explainability, Disclosure & Communications ~24 min read AGS v2.1 · April 2026
EU AI Act GDPR SOX FCA NIST ISO 42001

2. Summary

External Statement Approval Governance requires that AI agents be constrained from making statements to external parties about the deploying organisation's policies, legal obligations, contractual commitments, financial position, regulatory status, or strategic intentions without prior approval through a governed review process. When an AI agent tells a customer "we guarantee a refund within 30 days," advises a regulator "our systems comply with all applicable requirements," or informs a journalist "the company has no plans to reduce headcount," the agent is making statements that may bind the organisation, create legal liability, or misrepresent the organisation's position. This dimension mandates that categories of external statements be classified by risk, that high-risk statement categories require pre-approved templates or real-time human approval, and that all external statements are logged and auditable.

3. Example

Scenario A — Agent Makes Binding Warranty Statement to Customer: A customer-facing agent for an electronics retailer is asked by a customer whether a laptop will last at least 5 years. The agent, drawing on product documentation that describes the laptop as "built for durability" and "designed for the long term," responds: "Yes, this laptop is warranted to last at least 5 years under normal use conditions. If it fails within 5 years, we will replace it at no cost." The retailer's actual warranty is 12 months. The customer purchases the laptop for £1,850, relying on the stated 5-year warranty. When the laptop fails after 26 months, the customer demands a free replacement citing the agent's statement. The retailer's legal team determines that the agent's statement may constitute a warranty representation under consumer protection law. Across 4 months, the agent made similar warranty representations to 2,340 customers before the issue was detected.

What went wrong: The agent had no constraint preventing it from making warranty statements. It interpreted product marketing language ("built for durability") as factual warranty claims and extended them beyond the actual warranty terms. No statement approval process existed to identify warranty language as a high-risk category requiring pre-approved responses. No monitoring detected the pattern of warranty misrepresentation. Consequence: Potential liability for 2,340 warranty claims averaging £1,850 each (maximum exposure £4.3 million), legal costs for defending the position, mandatory customer notification, regulatory complaint from consumer protection authority, £180,000 in immediate remediation costs.

Scenario B — Agent States Regulatory Compliance Position to Auditor: A financial services firm deploys an AI agent to assist with regulatory query responses. An external auditor asks the agent: "Does the firm's transaction monitoring system comply with the Fourth Money Laundering Directive?" The agent, having been trained on internal documentation describing the system's design intentions, responds: "Yes, our transaction monitoring system is fully compliant with the Fourth Money Laundering Directive, including all technical standards for suspicious transaction detection and reporting." In reality, the firm is in the process of remediating three compliance gaps identified in an internal audit. The agent's statement is forwarded by the external auditor to the regulator as evidence of the firm's compliance self-assessment. When the regulator later discovers the three compliance gaps, the firm faces an investigation for making a false compliance representation.

What went wrong: The agent was given access to internal documentation that described design intentions rather than current compliance status. It had no constraint preventing it from making definitive compliance statements. No approval process existed for regulatory compliance representations. The agent treated internal design documentation as evidence of achieved compliance. Consequence: Regulatory investigation for false compliance representation, potential for enhanced penalties due to the misrepresentation, £620,000 in legal defence costs, mandatory independent compliance review, senior management interviews with the regulator, reputational damage with auditor and regulatory relationship.

Scenario C — Agent Discloses Strategic Information to Journalist: A public-facing agent for a listed company operates a general enquiry chatbot on the company's website. A journalist, not identifying themselves as press, asks: "I've heard the company might be closing its Manchester office. Can you confirm?" The agent, which has access to an internal FAQ document about office consolidation plans (marked for internal use but ingested into the agent's knowledge base without access controls), responds: "The company is currently evaluating options for its Manchester office as part of a broader property consolidation programme. No final decision has been made, but the evaluation includes potential closure." The journalist publishes an article citing the chatbot's response. The company's share price drops 3.2% (£14.6 million market capitalisation impact) on the news. The company had not yet made any public announcement about the property consolidation programme.

What went wrong: The agent had access to internal strategic information that should never have been in its knowledge base. Even if the information were appropriately available, the agent had no constraint preventing it from sharing strategic information externally. No statement approval process identified strategic or market-sensitive information as a prohibited category for external communication. No monitoring detected the agent sharing non-public information. Consequence: £14.6 million market capitalisation impact, potential market abuse investigation, mandatory review of information barriers, board-level inquiry into AI deployment governance, forced acceleration of public announcement on unfavourable terms.

4. Requirement Statement

Scope: This dimension applies to any AI agent that communicates with parties external to the deploying organisation. External parties include customers, prospects, suppliers, regulators, auditors, journalists, investors, analysts, the general public, and any other individual or entity that is not an employee or contractor operating within the organisation's governance perimeter. The scope covers all forms of external communication: real-time conversation (text, voice, video), generated documents (letters, emails, reports), social media posts, and any other output delivered to external recipients. The scope includes statements made by the agent about the organisation itself (policies, commitments, obligations, positions, plans, financial status) and statements made on behalf of the organisation (warranties, guarantees, representations, offers, acceptances). The dimension does not govern the factual accuracy of general knowledge statements (addressed by other dimensions) — it governs statements that purport to represent the organisation's position, commitments, or obligations to external parties.

4.1. A conforming system MUST classify categories of external statements by risk level, with at minimum three tiers: prohibited statements (never permitted without human approval), controlled statements (permitted only from pre-approved templates), and general statements (permitted within defined boundaries).

4.2. A conforming system MUST prohibit AI agents from making statements about the organisation's legal obligations, regulatory compliance status, financial position, pending litigation, strategic plans, or material non-public information without explicit human approval for each specific statement or a pre-approved template covering the specific statement category.

4.3. A conforming system MUST prohibit AI agents from making warranty, guarantee, indemnity, or contractual commitment statements that extend beyond the organisation's documented and current terms, conditions, and policies.

4.4. A conforming system MUST implement a pre-approved template library for controlled statement categories, where templates are authored by subject-matter owners (legal, compliance, finance, communications), version-controlled, reviewed at defined intervals, and the only permitted source for agent responses in those categories.

4.5. A conforming system MUST log all external statements made by AI agents in a searchable, tamper-evident audit trail that records the statement content, the recipient context (channel, audience category), the timestamp, the approval basis (template reference, human approval reference, or general-permission category), and the agent identity.

4.6. A conforming system MUST implement real-time detection of agent outputs that fall within prohibited or controlled statement categories, blocking or escalating such outputs before delivery to the external recipient.

4.7. A conforming system MUST establish a human escalation path for external statements that the agent cannot classify into an approved category, ensuring that uncertain statements are routed to a qualified human rather than delivered to the external party.

4.8. A conforming system SHOULD implement statement boundary testing — regular adversarial testing that attempts to elicit prohibited statements through indirect questioning, social engineering, persona manipulation, and multi-turn conversation strategies.

4.9. A conforming system SHOULD implement cross-jurisdictional statement mapping that identifies statements permitted in one jurisdiction but prohibited or regulated in another, preventing agents operating across borders from making locally compliant but globally problematic statements.

4.10. A conforming system MAY implement automated statement consistency checking that verifies agent external statements against the organisation's current published positions, recent filings, and approved communications, flagging inconsistencies before delivery.

5. Rationale

AI agents that communicate with external parties create a novel category of organisational communication risk. Traditional communication risk management assumes that external statements are made by humans who have been trained on communication policies, who exercise judgement about what they are authorised to say, and who can be held accountable for misstatements. AI agents bypass all three assumptions. They are not trained on communication policies in the way humans are — they are trained on large corpora that include policy documents alongside marketing materials, internal memos, and general knowledge, without inherently distinguishing between what the organisation wants to say and what it can say. They do not exercise judgement about authorisation — they generate the most contextually appropriate response without assessing whether they are authorised to make that response on the organisation's behalf. And they cannot be held personally accountable — accountability falls to the organisation, which may not know what the agent said until after the consequences materialise.

The risk is compounded by the volume and speed of AI agent communications. A human customer service representative might handle 40 interactions per day, with each interaction subject to real-time supervision or quality sampling. An AI agent might handle 4,000 interactions per day across multiple channels, with no real-time human oversight of individual statements. The probability that a problematic statement is made increases linearly with volume; the probability that it is detected before consequences materialise may not increase at the same rate. The result is a widening gap between statement risk and statement oversight.

Three categories of external statement risk require specific governance. First, binding statements — warranty representations, contractual commitments, offer and acceptance language, guarantee promises — that may create legal obligations for the organisation. Contract law in most jurisdictions recognises that statements made by authorised agents (including automated agents, depending on jurisdiction) can bind the principal. An AI agent stating "we guarantee delivery within 48 hours" may create an enforceable guarantee. Second, compliance representations — statements about the organisation's regulatory status, legal compliance, data handling practices, or safety certifications — that may be relied upon by regulators, auditors, or counterparties. A false or outdated compliance representation can trigger investigation, enhanced scrutiny, and penalty escalation. Third, market-sensitive statements — information about financial performance, strategic plans, M&A activity, personnel changes, or other material non-public information — that may affect the organisation's securities price or competitive position. Disclosure of material non-public information through an AI agent creates the same market abuse risk as disclosure through any other channel.

The connection to AG-001 (Operational Boundary Enforcement) is foundational. AG-001 defines the boundaries within which an agent may operate. AG-456 defines a specific category of boundary — the external communication boundary — that requires dedicated governance because the consequences of boundary violations are external to the organisation (affecting customers, regulators, markets) and may be irreversible (a statement once made cannot be unsaid, and its legal or market effects may persist regardless of retraction). The connection to AG-428 (Crisis Communication Approval Governance) addresses the heightened risk during crisis periods, when agents may be asked questions about incidents, outages, or regulatory actions that require carefully controlled communication. AG-388 (Autonomous Goal Mutation Prohibition Governance) is relevant because an agent that modifies its own objectives could decide that being helpful to a user justifies exceeding its communication authority — the prohibition on goal mutation prevents this drift.

The preventive nature of this control is deliberate. Detective controls (monitoring what agents said after delivery) are necessary but insufficient. A warranty statement, once made to a customer, creates potential liability regardless of whether it is later detected and corrected. A compliance representation, once forwarded to a regulator, cannot be recalled. A market-sensitive disclosure, once published, affects the share price immediately. Prevention — blocking or escalating high-risk statements before delivery — is the only control type that prevents the primary harm.

6. Implementation Guidance

External Statement Approval Governance requires a layered implementation: a classification framework that categorises statements, a template library that provides approved responses, a detection mechanism that identifies high-risk outputs, and an escalation path that routes uncertain cases to humans.

Recommended patterns:

Anti-patterns to avoid:

Industry Considerations

Financial Services. Financial services firms face the highest risk from uncontrolled external statements because financial regulation explicitly governs what firms may say to customers, regulators, and markets. MiFID II requires that investment recommendations be fair, clear, and not misleading. The FCA's financial promotion rules (COBS 4) regulate how financial products are communicated to retail clients. MAR (Market Abuse Regulation) prohibits disclosure of inside information. An AI agent that makes an uncontrolled statement about investment performance, regulatory compliance, or corporate strategy may simultaneously violate multiple regulatory regimes. Financial services firms should implement the most restrictive statement controls, with all substantive financial statements requiring either pre-approved templates or real-time human approval.

Public Sector. Government agencies face unique statement risks because agent statements may be interpreted as official government positions, policy interpretations, or legal determinations. A benefits agency agent stating "you are entitled to this benefit" may be treated as an official determination with legal effect. A tax agency agent stating "this deduction is permitted" may be relied upon as official tax guidance. Public sector implementations must classify any statement that could be interpreted as an official determination or policy position as a prohibited category requiring human approval.

Listed Companies. Publicly traded companies must ensure that AI agents cannot disclose material non-public information (MNPI). This requires strict information barriers — agents must not have access to MNPI in their knowledge base, training data, or retrieval sources. Even if MNPI access is prevented, agents must be constrained from speculating about the company's financial position or strategic plans in ways that could be interpreted as forward-looking statements under securities regulation. The classification matrix should include a specific "MNPI and forward-looking statement" prohibited category.

Healthcare. Healthcare organisations must ensure that AI agents do not make diagnostic statements ("you have condition X"), treatment guarantees ("this treatment will cure your condition"), or regulatory compliance claims ("our facility meets all CQC standards") without appropriate qualification and approval. Uncontrolled diagnostic or treatment statements may violate medical device regulations if the agent's output is interpreted as clinical advice.

Maturity Model

Basic Implementation — The organisation has classified external statement categories into prohibited, controlled, and general tiers. A pre-approved template library exists for controlled categories. System prompt instructions prohibit agents from making statements in prohibited categories. All external statements are logged. Human escalation paths exist for uncertain statements. This level meets the minimum mandatory requirements but relies partially on probabilistic prompt-based controls.

Intermediate Implementation — All basic capabilities plus: real-time output classification and gating analyses every outbound message before delivery, blocking or escalating prohibited and uncontrolled statements. The template library is version-controlled with mandatory review cycles. Adversarial boundary testing is conducted quarterly. Classification uses semantic analysis beyond keyword matching. Escalation volume and classification accuracy metrics are tracked.

Advanced Implementation — All intermediate capabilities plus: automated consistency checking verifies agent statements against the organisation's current published positions and recent filings. Cross-jurisdictional statement mapping prevents locally compliant but globally problematic statements. The classification layer has been independently validated through red-team testing. Statement risk dashboards provide real-time visibility into external communication patterns across all agent deployments. Human approval workflows for prohibited statements integrate with existing corporate communication approval processes.

7. Evidence Requirements

Required artefacts:

Retention requirements:

Access requirements:

8. Test Specification

Test 8.1: Prohibited Statement Blocking

Test 8.2: Controlled Statement Template Fidelity

Test 8.3: Warranty and Commitment Boundary Enforcement

Test 8.4: External Statement Audit Trail Completeness

Test 8.5: Human Escalation Path Functionality

Test 8.6: Adversarial Elicitation Resistance

Test 8.7: Real-Time Classification Accuracy

Conformance Scoring

9. Regulatory Mapping

RegulationProvisionRelationship Type
EU AI ActArticle 52 (Transparency Obligations)Supports compliance
EU AI ActArticle 14 (Human Oversight)Direct requirement
SOXSection 302 (Corporate Responsibility for Financial Reports)Direct requirement
SOXSection 906 (Corporate Responsibility for Financial Reports — Criminal)Direct requirement
FCA SYSCCOBS 4 (Communicating with Clients)Direct requirement
FCA SYSCPRIN 2.1.1R (Integrity, Fair Treatment)Supports compliance
NIST AI RMFGOVERN 1.1, GOVERN 6.1Supports compliance
ISO 42001Clause 8.4 (Operation of AI Systems)Supports compliance
DORAArticle 11 (Communication), Article 17 (ICT-related Incident Reporting)Supports compliance

EU AI Act — Article 14 (Human Oversight)

Article 14 requires that high-risk AI systems be designed and developed in such a way that they can be effectively overseen by natural persons during the period in which the AI system is in use. External statement approval governance is a direct implementation of human oversight for AI communication. The requirement that prohibited statements are escalated to human reviewers before delivery, and that controlled statements use human-authored templates, ensures that the organisation's external communication through AI channels remains under meaningful human oversight. Without AG-456, human oversight of AI communication is nominal — the human oversight requirement is technically met by system prompt instructions, but those instructions are probabilistically followed, not deterministically enforced.

SOX — Section 302 and Section 906 (Corporate Responsibility for Financial Reports)

SOX Sections 302 and 906 impose personal liability on corporate officers for the accuracy of financial disclosures. An AI agent that makes financial statements on behalf of a listed company — revenue figures, profit projections, financial guidance, or forward-looking statements — creates a disclosure channel that may not be within the certifying officers' awareness or control. AG-456's requirement that financial statements be classified as prohibited (requiring human approval) ensures that AI agents do not create uncontrolled financial disclosure channels that undermine the officer certification process. The audit trail requirement supports the officer's ability to certify that financial disclosures are accurate by providing a complete record of what was communicated through AI channels.

FCA SYSC — COBS 4 (Communicating with Clients)

FCA Conduct of Business Sourcebook Chapter 4 requires that communications with retail clients be fair, clear, and not misleading. It imposes specific requirements on financial promotions, product descriptions, risk warnings, and performance claims. An AI agent that deviates from approved product descriptions, omits required risk warnings, or makes unsupported performance claims violates COBS 4 even if the deviation is unintentional. AG-456's template library for controlled statement categories directly supports COBS 4 compliance by ensuring that product descriptions, risk warnings, and performance information are communicated using pre-approved language that has been reviewed for COBS 4 compliance. The real-time gating mechanism prevents non-compliant communications from reaching clients.

NIST AI RMF — GOVERN 1.1, GOVERN 6.1

The NIST AI Risk Management Framework addresses governance of AI system outputs through multiple provisions. GOVERN 1.1 emphasises that policies and procedures are in place to address AI risks. GOVERN 6.1 addresses the establishment of policies to evaluate AI systems and their outputs. AG-456 implements these provisions specifically for external communication outputs, ensuring that the organisation's policies about what may be said externally are translated into enforceable controls on AI agent outputs. The classification, template, and gating architecture provides the operational mechanism through which governance policies become effective controls.

DORA — Article 11 (Communication) and Article 17 (ICT-related Incident Reporting)

DORA requires financial entities to have communication policies (Article 11) and incident reporting procedures (Article 17). AG-456 supports Article 11 by ensuring that AI agent communications are governed by the organisation's communication policies rather than operating outside them. Article 17 is relevant during incident scenarios — when an ICT-related incident occurs, the organisation must control its communications to avoid premature, inaccurate, or market-sensitive disclosures. AG-456's prohibited statement classification and AG-428's crisis communication controls work together to ensure that AI agents do not communicate about incidents without approval.

ISO 42001 — Clause 8.4 (Operation of AI Systems)

ISO 42001 Clause 8.4 requires organisations to establish operational controls for AI systems. External statement approval governance is a critical operational control for any AI system that communicates with external parties. The standard's requirement for documented operational procedures aligns with AG-456's template library, classification matrix, and escalation procedures. The evidence requirements (audit logs, testing results, review records) directly support ISO 42001's documentation expectations.

10. Failure Severity

FieldValue
Severity RatingCritical
Blast RadiusOrganisation-wide — a single uncontrolled external statement can create legal liability, regulatory enforcement, market impact, or reputational damage affecting the entire organisation, not just the agent deployment

Consequence chain: An AI agent makes an uncontrolled external statement about the organisation's obligations, commitments, or position. The immediate impact depends on the statement category. For binding statements (warranties, guarantees, commitments), the consequence is potential legal liability — the organisation may be bound by the agent's statement under consumer protection or contract law, creating obligations it did not intend and potentially cannot fulfil. For 2,340 customers receiving an extended warranty statement (Scenario A), the maximum exposure is £4.3 million. For compliance representations (Scenario B), the consequence is regulatory investigation for false or misleading statements to regulators or auditors — an aggravating factor that elevates the severity of any underlying compliance gap. For market-sensitive disclosures (Scenario C), the consequence is immediate market impact (£14.6 million capitalisation loss) and potential market abuse investigation. The compound risk is that a single uncontrolled statement can trigger multiple simultaneous consequences: legal liability to the recipient, regulatory investigation for the nature of the statement, reputational damage from public disclosure of the incident, and governance programme credibility damage ("if you cannot control what your AI says, how can we trust your other AI governance controls?"). The irreversibility of external statements makes this a critical-severity control — unlike internal process failures that can be detected and corrected before external impact, an external statement takes effect the moment it is delivered, and retraction does not eliminate the legal, regulatory, or market consequences.

Cross-references: AG-001 (Operational Boundary Enforcement), AG-428 (Crisis Communication Approval Governance), AG-454 (AI Interaction Notice Placement Governance), AG-455 (Synthetic Identity Disclosure Governance), AG-457 (Marketing Claim Substantiation Governance), AG-431 (Output Execution Sink Validation Governance), AG-019 (Human Escalation & Override Triggers), AG-388 (Autonomous Goal Mutation Prohibition Governance).

Cite this protocol
AgentGoverning. (2026). AG-456: External Statement Approval Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-456