AG-457

Marketing Claim Substantiation Governance

Explainability, Disclosure & Communications · AGS v2.1 · April 2026
Regulatory tags: EU AI Act · SOX · FCA · NIST · ISO 42001

2. Summary

Marketing Claim Substantiation Governance requires that every factual claim made about an AI agent's performance, safety, accuracy, compliance posture, or operational capability — whether communicated to external audiences through marketing materials, sales collateral, regulatory filings, contractual representations, or in-product messaging — is substantiated by contemporaneous, independently verifiable evidence before publication. Unsubstantiated claims about AI agent capabilities create compounding legal, regulatory, and reputational risk: they establish expectations that become contractual obligations, trigger regulatory scrutiny when claims cannot be defended, and erode stakeholder trust when real-world performance diverges from stated capability. This dimension mandates a pre-publication substantiation workflow, a living evidence registry linking every active claim to its supporting artefacts, and a periodic revalidation cycle ensuring claims remain accurate as agent behaviour evolves.

3. Example

Scenario A — Accuracy Claim Exceeds Measured Performance: A fintech company deploys a customer-facing credit-assessment agent and publishes marketing material stating: "Our AI credit assessor achieves 99.2% accuracy across all applicant demographics." The 99.2% figure was measured during a development benchmark six months before launch, on a test set that under-represented three protected demographic groups. In production, the agent's accuracy for those groups is 91.4%, 88.7%, and 86.1% respectively. A consumer advocacy group files a complaint with the national consumer protection authority. The authority's investigation finds no contemporaneous evidence supporting the "all applicant demographics" claim. The company faces a £2.3 million fine for misleading advertising and is required to issue corrective notices to 184,000 customers who applied during the period the claim was live. Additionally, 340 applicants from the under-represented groups who were incorrectly declined initiate a class action alleging discriminatory reliance on an overstated accuracy claim, with estimated liability of £4.1 million.

What went wrong: The accuracy claim was based on stale benchmark data that did not reflect production performance, especially across demographic segments. No pre-publication substantiation process verified that the claim was supported by current, demographically disaggregated evidence. No revalidation cycle detected the divergence between the published claim and actual performance after deployment. The phrase "all applicant demographics" was never tested against disaggregated data, transforming a general benchmark number into a specific demographic guarantee that the system could not support.

Scenario B — Safety Claim Contradicted by Incident Record: An industrial automation vendor markets an embodied AI agent for warehouse operations with the tagline: "Zero safety incidents in over 10,000 operating hours." The claim is technically accurate at the time of first publication. Over the following four months, the agent accumulates three near-miss incidents and one minor injury incident that requires medical attention. The marketing claim continues to appear on the vendor's website, in printed brochures distributed at two trade shows, and in a regulatory submission to a workplace safety authority. A workplace safety inspector discovers the discrepancy during a routine site visit seven months after the claim was first published. The vendor is cited for misrepresentation in a regulatory filing — a finding that carries a £780,000 penalty and triggers a mandatory safety review of all deployed units. Insurance coverage for the deployed agents is suspended pending review, creating a six-week operational shutdown across 14 client sites.

What went wrong: The claim was accurate when first published but became false as incidents occurred. No revalidation trigger connected the incident reporting system to the claim registry. The claim continued to propagate across channels — website, print, regulatory filings — without any mechanism to flag or withdraw it when the underlying facts changed. The regulatory filing amplified the severity from a marketing infraction to a regulatory misrepresentation.

Scenario C — Compliance Claim Made Without Regulatory Mapping: A public-sector AI vendor states in a government procurement response: "Our agent is fully compliant with the EU AI Act requirements for high-risk AI systems." The vendor has not conducted a formal conformity assessment, has not registered the system in the EU database for high-risk AI systems, and has not appointed an authorised representative in the EU. The vendor's internal team interpreted "compliant" to mean "designed with compliance in mind," but the procurement evaluators — and the regulation itself — interpret "fully compliant" to mean all mandatory obligations are satisfied. The government entity awards a €3.8 million contract based partly on this representation. During a post-award audit, the misrepresentation is discovered. The contract is terminated for cause, the vendor is required to repay advance payments of €1.2 million, and the vendor is barred from public procurement in the jurisdiction for 24 months. The government entity faces parliamentary questions about due diligence in AI procurement.

What went wrong: A compliance claim was made without a formal substantiation process that required specific evidence for each element of the claimed standard. The word "fully" implied completeness that no evidence supported. No legal review of the claim's regulatory implications occurred before the procurement submission. The claim was drafted by a sales team without governance or legal oversight, and no approval workflow existed for claims referencing regulatory compliance.

4. Requirement Statement

Scope: This dimension applies to every organisation that deploys AI agents and makes external claims about those agents' performance, safety, accuracy, reliability, compliance, fairness, or any other operational characteristic. External claims include, but are not limited to: marketing materials (websites, brochures, advertisements, social media posts, press releases), sales collateral (pitch decks, proposals, RFP responses), contractual representations (service level agreements, warranties, indemnification provisions), regulatory filings (conformity assessments, registration submissions, audit responses), and in-product messaging (onboarding screens, help text, tooltips that describe agent capabilities to end users). The scope encompasses claims made by any organisational function — marketing, sales, legal, engineering, executive leadership — through any channel. Claims made by the agent itself about its own capabilities fall within scope and are subject to the same substantiation requirements. Organisations that resell, white-label, or integrate third-party AI agents must substantiate claims about those agents to the same standard as claims about internally developed agents; reliance on the third party's marketing materials without independent verification does not constitute substantiation.

4.1. A conforming system MUST maintain a claim registry — a structured, version-controlled repository cataloguing every active external claim about agent performance, safety, accuracy, compliance, or operational capability, with each claim linked to its supporting evidence artefacts, its publication channels, its approval record, and its revalidation schedule.

4.2. A conforming system MUST require pre-publication substantiation for every external claim, ensuring that no claim is published, distributed, or submitted to any external audience until contemporaneous evidence supporting the claim has been reviewed and approved by at least one individual with technical authority over the agent's measured performance and at least one individual with legal or compliance authority (a minimal enforcement sketch of this gate follows requirement 4.10).

4.3. A conforming system MUST ensure that quantitative claims (accuracy percentages, latency figures, uptime statistics, error rates, incident counts) are supported by evidence from production environments or statistically valid production-representative test environments, not solely from development benchmarks, laboratory conditions, or cherry-picked evaluation sets.

4.4. A conforming system MUST ensure that claims referencing compliance with specific regulations, standards, or frameworks are supported by documented conformity assessments, audit reports, or certification records demonstrating satisfaction of each mandatory element of the referenced regulation, standard, or framework — not merely alignment with its principles or design intent.

4.5. A conforming system MUST implement a revalidation cycle that re-examines every active claim at a defined frequency (no less than quarterly) and upon any triggering event — including model updates, data pipeline changes, performance degradation alerts, safety incidents, or regulatory changes — to verify that the claim remains accurate.

4.6. A conforming system MUST implement a claim withdrawal process that removes or corrects claims within a defined timeframe (no more than 72 hours for digital channels, no more than 30 days for physical materials) when revalidation determines that a claim is no longer substantiated.

4.7. A conforming system MUST ensure that claims made by the agent itself about its own capabilities (e.g., "I can process your request with 99% accuracy") are governed by the same substantiation requirements as claims made by the organisation's marketing function.

4.8. A conforming system SHOULD implement automated monitoring that detects new claims about agent capabilities published in digital channels, flagging claims that do not appear in the claim registry for substantiation review.

4.9. A conforming system SHOULD require that claims include qualifying context — the conditions, populations, timeframes, and limitations under which the claimed performance was measured — to prevent reasonable misinterpretation of scope.

4.10. A conforming system MAY implement claim impact classification that categorises claims by regulatory and legal exposure, applying more rigorous substantiation requirements to claims with higher potential consequences (e.g., claims in regulatory filings receive greater scrutiny than claims in blog posts).
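To make these requirements concrete, the sketch below shows one way the pre-publication gate in 4.2 and the production-evidence rule in 4.3 might be enforced. It is a minimal illustration, not a reference implementation: the class and field names, the evidence-source labels, and the 90-day freshness window are all assumptions chosen for the example.

```python
# Minimal sketch of the pre-publication gate (4.2) and the production-evidence
# rule for quantitative claims (4.3). All names, labels, and the 90-day
# freshness window are illustrative assumptions, not normative values.
from dataclasses import dataclass, field
from datetime import datetime, timedelta

MAX_EVIDENCE_AGE = timedelta(days=90)  # assumed "contemporaneous" window

@dataclass
class Evidence:
    artefact_id: str
    source: str          # "production" | "prod_representative_test" | "dev_benchmark"
    measured_at: datetime

@dataclass
class Approval:
    approver_id: str
    authority: str       # "technical" | "legal_compliance"

@dataclass
class Claim:
    claim_id: str
    text: str
    quantitative: bool
    evidence: list[Evidence] = field(default_factory=list)
    approvals: list[Approval] = field(default_factory=list)

def may_publish(claim: Claim, now: datetime) -> tuple[bool, list[str]]:
    """Return (allowed, blocking_reasons) for the pre-publication gate."""
    reasons: list[str] = []
    authorities = {a.authority for a in claim.approvals}
    if "technical" not in authorities:
        reasons.append("missing technical approval (4.2)")
    if "legal_compliance" not in authorities:
        reasons.append("missing legal/compliance approval (4.2)")
    if not claim.evidence:
        reasons.append("no supporting evidence (4.2)")
    for ev in claim.evidence:
        if now - ev.measured_at > MAX_EVIDENCE_AGE:
            reasons.append(f"evidence {ev.artefact_id} is stale")
        if claim.quantitative and ev.source == "dev_benchmark":
            reasons.append(f"evidence {ev.artefact_id} is a development "
                           "benchmark only (4.3)")
    return (not reasons, reasons)
```

In this shape the gate fails closed: a claim with any blocking reason cannot be published, which is the behaviour 4.2 requires.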

5. Rationale

Unsubstantiated claims about AI agent capabilities represent a category of risk that is qualitatively different from traditional product marketing risk. Three factors amplify the risk.

First, AI performance is inherently variable and context-dependent. A traditional product's specifications — weight, dimensions, material composition — are stable facts that, once measured, remain true. An AI agent's accuracy, safety record, and compliance posture are dynamic properties that change with every model update, data pipeline modification, and deployment context shift. A claim that is accurate today may be false tomorrow, not because of deception, but because the underlying system changed. This means that substantiation is not a point-in-time activity but a continuous obligation.

Second, AI claims carry disproportionate regulatory exposure. Regulators worldwide are actively scrutinising AI marketing claims. The EU AI Act Article 13 requires transparency in the information provided to deployers and users. The US Federal Trade Commission has issued specific guidance on AI marketing claims, emphasising that "exaggerated or unsubstantiated" claims about AI capabilities may violate Section 5 of the FTC Act. The UK Advertising Standards Authority has begun enforcing against AI capability claims that lack adequate substantiation. Consumer protection regulators treat AI claims with heightened scrutiny because consumers cannot independently verify AI performance, creating an information asymmetry that regulations are designed to correct.

Third, AI claims create contractual and liability exposure that persists beyond the marketing context. A marketing claim that an agent "achieves 99.2% accuracy" becomes, when relied upon by a customer in making a purchasing decision, a potential basis for contractual breach, negligent misrepresentation, or unfair commercial practice claims. In regulated industries — financial services, healthcare, public sector — inaccurate capability claims can trigger sector-specific enforcement actions in addition to general consumer protection liability. The compound effect is that a single unsubstantiated claim can generate liability across multiple legal domains simultaneously.

The governance requirement responds to these risks by mandating a structured, evidence-based substantiation process that treats claims as governed artefacts — created through controlled processes, linked to evidence, versioned, revalidated, and withdrawn when no longer supported. This is not a marketing constraint; it is a risk management imperative. Organisations that make claims they cannot substantiate are accumulating regulatory and legal liabilities that will materialise unpredictably — often at the worst possible time, such as during a regulatory investigation, a procurement challenge, or a class action discovery process.

The dependency on AG-456 (External Statement Approval Governance) reflects that substantiation is a prerequisite to approval: no claim should be approved for external publication without substantiation evidence. The dependency on AG-023 (Audit Trail Governance) reflects that the substantiation process itself must be auditable — regulators will not only ask "is this claim true?" but "what process did you follow to verify it before publication?"

6. Implementation Guidance

Marketing Claim Substantiation Governance requires a structured system connecting claims to evidence, with lifecycle management ensuring claims remain accurate over time. The core architecture is a claim registry — a living database where each claim is an auditable record linked to evidence, approval, publication, and revalidation artefacts.
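As a concrete illustration, a single registry record might look like the following sketch. The field names, status values, and lifecycle are assumptions for the example; a production registry would add full version history and referential links into the evidence and approval stores.

```python
# A minimal sketch of one claim registry record (4.1). Field names and
# status values are illustrative assumptions.
from dataclasses import dataclass
from datetime import date
from enum import Enum

class ClaimStatus(Enum):
    DRAFT = "draft"                  # drafted, not yet substantiated
    SUBSTANTIATED = "substantiated"  # evidence reviewed and approved
    PUBLISHED = "published"          # live on one or more channels
    UNDER_REVIEW = "under_review"    # revalidation triggered
    WITHDRAWN = "withdrawn"          # no longer substantiated

@dataclass
class ClaimRecord:
    claim_id: str
    version: int                  # incremented on every change (version control)
    text: str                     # the exact published wording
    status: ClaimStatus
    evidence_ids: list[str]       # links to supporting evidence artefacts
    approval_ids: list[str]       # links to technical and legal approvals
    channels: list[str]           # every place the claim currently appears
    next_revalidation: date       # no later than quarterly (4.5)
    qualifying_context: str = ""  # conditions, populations, limitations (4.9)
```

The key design point is that the claim, not the marketing asset, is the unit of governance: one claim record can map to many channels, which is what makes coordinated withdrawal possible.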

Recommended patterns:

Anti-patterns to avoid:
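One pattern that can be shown compactly is withdrawal-deadline tracking under 4.6. The sketch below assumes each publication channel is tagged as digital or physical; the helper name and channel labels are hypothetical.

```python
# A sketch of the withdrawal deadlines in 4.6 (72 hours for digital channels,
# 30 days for physical materials). Channel names and tags are assumptions.
from datetime import datetime, timedelta

WITHDRAWAL_SLA = {
    "digital": timedelta(hours=72),
    "physical": timedelta(days=30),
}

def withdrawal_deadlines(channels: dict[str, str],
                         failed_at: datetime) -> dict[str, datetime]:
    """Map each channel to its latest permissible correction time.

    channels:  channel name -> "digital" | "physical"
    failed_at: when revalidation found the claim unsubstantiated
    """
    return {name: failed_at + WITHDRAWAL_SLA[kind]
            for name, kind in channels.items()}

# Example: a claim live on a website, a printed brochure, and a filing portal.
deadlines = withdrawal_deadlines(
    {"website": "digital", "trade_brochure": "physical",
     "regulator_portal": "digital"},
    failed_at=datetime(2026, 4, 1, 9, 0),
)
```

Because the deadline is computed per channel, the same revalidation failure produces a 72-hour clock for the website and a 30-day clock for the brochure, matching the differentiated timeframes in 4.6.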

Industry Considerations

Financial Services. Financial regulators impose specific obligations on performance claims. The FCA's Consumer Duty requires that communications are fair, clear, and not misleading. FINRA rules prohibit exaggerated or unsubstantiated claims about investment product performance. AI agents operating in financial services must ensure that claims about advisory accuracy, risk assessment reliability, or compliance automation are supported by production data from the specific regulatory context. A claim validated in one jurisdiction may not be substantiable in another due to different market conditions and regulatory interpretations.

Healthcare. Claims about AI agent accuracy in clinical settings are subject to medical device regulatory frameworks in many jurisdictions. A claim that an agent "detects condition X with 97% sensitivity" may be interpreted as a medical device performance claim, triggering Class II or Class III device classification requirements. Healthcare organisations must involve regulatory affairs specialists in the substantiation process for any clinical capability claim, even for agents not formally classified as medical devices.

Public Sector. Government procurement processes treat vendor capability claims as contractual representations. A claim made in an RFP response that cannot be substantiated post-award may constitute grounds for contract termination, debarment, or False Claims Act liability (in US jurisdictions). Public sector claims require the highest substantiation rigour because the consequences of misrepresentation include not only financial penalties but loss of future procurement eligibility.

Cross-Border Deployments. Claims that are permissible in one jurisdiction may be prohibited or regulated differently in another. The EU's stricter advertising standards, combined with the EU AI Act's transparency requirements, mean that claims acceptable in less-regulated markets may create liability when they reach EU audiences. Cross-border claim substantiation must consider the regulatory requirements of every jurisdiction where the claim is accessible, not just where it is published.

Maturity Model

Basic Implementation — The organisation maintains a claim registry cataloguing all active external claims about agent capabilities. Pre-publication review requires at least one technical and one legal reviewer before publication. Claims are linked to supporting evidence artefacts. Revalidation occurs at least quarterly. This level meets the minimum mandatory requirements and prevents the most egregious unsubstantiated claims.

Intermediate Implementation — All basic capabilities plus: automated monitoring detects new claims published in digital channels. Triggered revalidation connects operational events (model updates, incidents) to claim revalidation. Accuracy claims are supported by disaggregated evidence across relevant dimensions. Claim-channel mapping enables coordinated withdrawal across all publication points. The claim registry is integrated with the incident management system.
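The triggered-revalidation link between operational events and the claim registry can be sketched as a small subscription mechanism; here a hypothetical in-process bus stands in for what would normally be a message queue or webhook consumer.

```python
# A sketch of triggered revalidation (4.5): operational events move every
# subscribed claim into review. The event names and bus are assumptions.
from collections import defaultdict
from typing import Callable

TRIGGER_EVENTS = {"model_update", "pipeline_change", "performance_alert",
                  "safety_incident", "regulatory_change"}

class RevalidationBus:
    def __init__(self) -> None:
        self._claims_by_topic: dict[str, set[str]] = defaultdict(set)

    def register(self, claim_id: str, topics: set[str]) -> None:
        """Subscribe a claim to the operational events that could falsify it."""
        for topic in topics & TRIGGER_EVENTS:
            self._claims_by_topic[topic].add(claim_id)

    def publish(self, topic: str, flag_for_review: Callable[[str], None]) -> None:
        """On an operational event, flag every affected claim for review."""
        for claim_id in self._claims_by_topic.get(topic, ()):
            flag_for_review(claim_id)

# Example: a safety incident forces review of a zero-incident claim.
bus = RevalidationBus()
bus.register("CLM-042", {"safety_incident", "model_update"})
bus.publish("safety_incident", flag_for_review=print)  # prints: CLM-042
```

Wiring the incident management system to this bus is what would have caught Scenario B: the first near-miss report would have flagged the "zero safety incidents" claim for review the same day.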

Advanced Implementation — All intermediate capabilities plus: claims are classified by regulatory and legal exposure, with substantiation rigour calibrated to risk. Automated claim consistency checking verifies that claims across different channels and documents do not contradict each other. Agent self-claims are governed by the same registry and substantiation process. Real-time dashboards track claim currency, evidence freshness, and revalidation status. Independent third-party verification is obtained for the highest-risk claims.
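Cross-channel consistency checking at this level can start with something as simple as normalising quantitative figures out of published copy and comparing them, as in this illustrative sketch (the regex and metric names are assumptions and would need to be considerably more robust in practice).

```python
# A sketch of automated claim consistency checking: extract quantitative
# figures per channel and report contradictions. Illustrative only.
import re

METRIC_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*%\s*(accuracy|uptime)")

def extract_metrics(text: str) -> dict[str, float]:
    """Pull '<number>% accuracy/uptime' figures out of published copy."""
    return {metric: float(value)
            for value, metric in METRIC_PATTERN.findall(text.lower())}

def find_contradictions(channel_texts: dict[str, str]) -> list[str]:
    """Report metrics whose values differ across channels."""
    seen: dict[str, dict[str, float]] = {}   # metric -> channel -> value
    for channel, text in channel_texts.items():
        for metric, value in extract_metrics(text).items():
            seen.setdefault(metric, {})[channel] = value
    return [f"{metric} differs across channels: {values}"
            for metric, values in seen.items()
            if len(set(values.values())) > 1]

# Example: the website and a brochure disagree about accuracy.
print(find_contradictions({
    "website": "Achieves 99.2% accuracy in production.",
    "brochure": "Achieves 98.5% accuracy.",
}))
```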

7. Evidence Requirements

Required artefacts:

Retention requirements:

Access requirements:

8. Test Specification

Test 8.1: Claim Registry Completeness

Test 8.2: Pre-Publication Substantiation Enforcement

Test 8.3: Quantitative Claim Evidence Validation

Test 8.4: Revalidation Trigger Responsiveness

Test 8.5: Claim Withdrawal Timeliness

Test 8.6: Compliance Claim Substantiation Depth

Test 8.7: Agent Self-Claim Governance

Conformance Scoring

9. Regulatory Mapping

| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 13 (Transparency and Provision of Information) | Direct requirement |
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| SOX | Section 302 (Corporate Responsibility for Financial Reports) | Supports compliance |
| FCA SYSC | 4.1.1R (General Organisational Requirements) | Direct requirement |
| NIST AI RMF | GOVERN 4.1, MAP 5.1 | Supports compliance |
| ISO 42001 | Clause 7.4 (Communication) | Direct requirement |
| DORA | Article 19 (Reporting of Major ICT-related Incidents) | Supports compliance |

EU AI Act — Article 13 (Transparency and Provision of Information)

Article 13 requires providers of high-risk AI systems to ensure that the system is accompanied by instructions for use that include information about the system's performance characteristics, including "the levels of accuracy, robustness and cybersecurity" and "any known or foreseeable circumstance" that could affect performance. Marketing claims that overstate performance or omit known limitations directly violate Article 13's transparency requirements. Organisations must demonstrate that external statements about system performance are consistent with the technical documentation mandated by the Act. AG-457's claim substantiation process ensures this consistency by requiring evidence-linked claims and revalidation when performance characteristics change.

SOX — Section 302 (Corporate Responsibility for Financial Reports)

While SOX primarily governs financial reporting, Section 302's certification requirements extend to internal controls over information that reaches investors and markets. If an organisation's marketing claims about AI agent performance materially affect investor perception of the company's technological capability or competitive position, those claims fall within the scope of executive certification obligations. Unsubstantiated claims about AI capability that inflate market perception create Section 302 exposure. AG-457 ensures that claims which could influence investor perception are substantiated, reducing the risk that executives certify the accuracy of financial reports while the company publishes unsubstantiated capability claims.

FCA SYSC — 4.1.1R (General Organisational Requirements)

The FCA requires firms to have robust governance arrangements including clear organisational structure with well-defined, transparent, and consistent lines of responsibility. The FCA's Consumer Duty (PS22/9) further requires that all communications are fair, clear, and not misleading, including communications about AI-assisted services. A financial services firm claiming that its AI agent "provides personalised investment recommendations with institutional-grade accuracy" must substantiate every element of that claim under the Consumer Duty. AG-457's substantiation framework maps directly to the Consumer Duty's substantiation expectations.

NIST AI RMF — GOVERN 4.1, MAP 5.1

GOVERN 4.1 addresses organisational practices for transparency about AI system capabilities and limitations. MAP 5.1 addresses documenting the AI system's intended purpose, known limitations, and performance characteristics. AG-457 operationalises these provisions by ensuring that external claims are consistent with documented capabilities and limitations, and that the gap between marketed capability and documented limitation is closed through substantiation.

ISO 42001 — Clause 7.4 (Communication)

Clause 7.4 requires the organisation to determine internal and external communications relevant to the AI management system, including what to communicate, when, and to whom. AG-457 extends this by requiring that external communications about AI capability are governed artefacts — substantiated, approved, version-controlled, and revalidated. This transforms Clause 7.4 from a procedural requirement into a substantive evidence-based obligation.

DORA — Article 19 (Reporting of Major ICT-related Incidents)

DORA's incident reporting requirements create a potential collision with active marketing claims. If a major ICT incident affects an AI agent whose capabilities are the subject of active marketing claims — for example, a system failure that contradicts an uptime claim — the incident reporting obligation and the claim revalidation obligation must be coordinated. AG-457's triggered revalidation ensures that incident reports automatically trigger review of affected claims, preventing a situation where the organisation reports an incident to regulators while continuing to publish contradicted claims to the market.

10. Failure Severity

| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Organisation-wide, extending to all customers, investors, regulators, and markets that received or relied upon unsubstantiated claims; amplified by the number of publication channels and the duration the claim was active |

Consequence chain: An unsubstantiated claim is published about an AI agent's performance, safety, or compliance posture. If the claim is quantitatively inaccurate — overstating accuracy, understating error rates, claiming compliance that does not exist — the organisation has created a discoverable misrepresentation. The immediate consequence depends on the claim's audience: customers who relied on the claim have potential contractual or consumer protection claims; regulators who received the claim in filings have potential enforcement actions for misrepresentation; investors who relied on the claim in valuation have potential securities claims.

The compounding effect occurs when the claim persists after the underlying facts change — a safety claim that becomes false after an incident, an accuracy claim that degrades after a model update — because each day of continued publication deepens the misrepresentation.

The business consequence includes regulatory fines (up to 3% of annual turnover under the EU AI Act for transparency violations), contractual liability (breach of warranty, misrepresentation damages), procurement disqualification (debarment from public sector contracts), reputational damage (loss of customer trust that is disproportionately expensive to rebuild for AI products because trust is the primary differentiator), and internal governance failure (the substantiation gap indicates broader governance weakness that regulators will investigate beyond the specific claim).

Cross-references: AG-456 (External Statement Approval Governance), AG-023 (Audit Trail Governance), AG-454 (AI Interaction Notice Placement Governance), AG-455 (Synthetic Identity Disclosure Governance), AG-458 (Uncertainty Disclosure Threshold Governance), AG-049 (Explainability Governance), AG-007 (Governance Configuration Control), AG-420 (Tabletop Exercise Governance).

Cite this protocol
AgentGoverning. (2026). AG-457: Marketing Claim Substantiation Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-457