AG-451

Plain-Language Duty Governance

Explainability, Disclosure & Communications · ~22 min read · AGS v2.1 · April 2026
EU AI Act · SOX · FCA · NIST · ISO 42001

2. Summary

Plain-Language Duty Governance requires that every material notice, explanation, or disclosure generated by an AI agent and directed at affected persons be composed in language that the intended audience can reasonably understand without specialist knowledge. When an agent issues a credit denial, recommends a medical course of action, determines benefits eligibility, or communicates any decision with material consequence, the explanation must use vocabulary, sentence structures, and conceptual framing appropriate to the audience's expected literacy, domain expertise, and linguistic context. This dimension prevents the substitution of technical compliance — producing an explanation that is formally complete but practically incomprehensible — for genuine transparency, and it establishes testable readability thresholds, audience-calibration requirements, and multi-language equivalence standards that organisations must meet.

3. Example

Scenario A — Mortgage Denial Notice Written in Model-Internal Language: A consumer applies for a mortgage through an AI-assisted lending platform. The agent denies the application and generates the following explanation: "Application declined: logistic regression output 0.37 against threshold 0.55; primary contributing features: DTI z-score −1.82, LTV residual 0.23, employment tenure coefficient below minimum separating hyperplane." The consumer — a secondary-school teacher with no data-science background — cannot understand why the mortgage was denied. She does not know what a z-score is, does not recognise "LTV residual" as a loan-to-value metric, and interprets "separating hyperplane" as jargon that the institution is using to obscure the real reason. She files a complaint with the financial ombudsman. The ombudsman rules that the notice fails to satisfy the requirement for a clear explanation of adverse action. The lender pays £12,500 in remediation and redress per affected applicant. Over the preceding 14 months, the same template was used for 2,340 denials. Total remediation exposure: £29.25 million. Regulatory investigation reveals that the explanation template was auto-generated from the model's feature-importance output with no plain-language conversion layer.

What went wrong: The agent produced a technically accurate explanation that was functionally useless to the affected person. No readability check, audience-calibration process, or plain-language conversion existed between the model's internal feature-importance output and the consumer-facing notice. The explanation satisfied an internal "explainability" checkbox but violated the regulatory purpose of adverse-action notice — enabling the consumer to understand the decision and take corrective action.

Scenario B — Benefits Determination Uses Legal-Bureaucratic Language: A public-sector AI agent processes disability benefit applications. It issues the following determination to an applicant: "Your application under Schedule 7, Part 2, Paragraph 4(3)(b) of the Welfare Reform Act 2012 as amended by SI 2017/204 has been assessed against the limited capability for work-related activity descriptor set. Your scored points total (9) does not meet the requisite threshold (15) for the applied descriptor group. You may request a mandatory reconsideration under Regulation 3(1) of the Universal Credit, Personal Independence Payment, Jobseeker's Allowance and Employment and Support Allowance (Decisions and Appeals) Regulations 2013." The applicant — who has a cognitive disability that was the basis of the claim — cannot parse the legislative cross-references, does not understand "descriptor set" or "scored points," and misses the mandatory reconsideration deadline because the notice did not explain in plain terms what reconsideration means or how to request it. The applicant loses £7,800 per year in benefits. A legal-aid charity challenges the determination process, resulting in a judicial review finding that the notice format systematically disadvantages applicants with cognitive impairments — the exact population the benefit is designed to serve. Total exposure across 18,000 similar determinations: £140 million in potential back-payments plus £2.3 million in litigation costs.

What went wrong: The notice was drafted in legislative citation format — technically precise but incomprehensible to the target audience. The agent had no audience model that accounted for the cognitive capabilities of disability benefit applicants. No readability assessment was performed. The right to mandatory reconsideration was communicated in terms that required legal training to understand, effectively denying the right by making it inaccessible.

Scenario C — Cross-Border Agent Produces Machine-Translated Plain Language That Is Not Plain: An insurance agent operating across four European markets generates claim-denial notices. The English-language template achieves a Flesch-Kincaid Grade Level of 7.2 — well within plain-language standards. The template is machine-translated into German, French, and Italian. The German translation achieves a readability score equivalent to Grade Level 13.4 because machine translation produced complex compound sentences and formal-register vocabulary that the original English avoided. The French translation uses a subjunctive construction that introduces ambiguity about whether the denial is final. In Italy, 340 policyholders misunderstand their denial notices, with 89 missing appeal deadlines. Total remediation: €1.8 million in reopened claims, €420,000 in regulatory fines from the Italian insurance supervisory authority, and €650,000 in translation and process overhaul costs.

What went wrong: Plain-language compliance was validated only in the source language. Machine translation does not preserve readability — a Grade 7 English sentence can become a Grade 13 German sentence. No per-language readability validation existed. The organisation treated translation as a mechanical step rather than a plain-language obligation that must be met independently in every target language.

4. Requirement Statement

Scope: This dimension applies to any AI agent that generates, assembles, or delivers notices, explanations, disclosures, or communications directed at affected persons — meaning individuals whose rights, interests, financial position, access to services, or legal status are materially influenced by the agent's outputs. The scope includes but is not limited to: adverse-action notices, decision explanations, rights notifications, consent requests, terms summaries, risk disclosures, and appeals instructions. It covers all modalities — text, speech synthesis, visual displays, and multi-modal combinations. The scope extends to every language in which the agent communicates; plain-language compliance must be met independently in each language, not validated in one language and assumed to carry through translation. Agents that communicate only with technical operators, internal systems, or other agents are outside the primary scope but should apply plain-language principles to any human-readable logs or reports that may reach non-specialist audiences.

4.1. A conforming system MUST ensure that every material notice or explanation directed at an affected person achieves a validated readability level appropriate to the expected audience, using a recognised readability metric (e.g., Flesch-Kincaid Grade Level, Gunning Fog Index, CEFR level for non-English languages, or an equivalent validated measure) with documented thresholds.
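The readability gate in 4.1 can be enforced mechanically. The sketch below computes an approximate Flesch-Kincaid Grade Level using a crude vowel-group syllable heuristic; a production system would use a validated readability library or a pronunciation dictionary, and the 8.0 threshold shown is illustrative, not normative.

```python
import re

def count_syllables(word: str) -> int:
    # Crude vowel-group heuristic; real systems should use a
    # pronunciation dictionary (e.g. CMUdict) for accuracy.
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def fk_grade(text: str) -> float:
    """Approximate Flesch-Kincaid Grade Level of an English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)

def passes_threshold(text: str, max_grade: float = 8.0) -> bool:
    """Blocking check against a documented readability threshold."""
    return fk_grade(text) <= max_grade
```

A plain sentence about a loan denial scores far below grade 8, while the jargon-laden notice from Scenario A scores well above it, so the gate would block delivery.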

4.2. A conforming system MUST define and maintain audience profiles for each communication type, specifying the expected literacy level, domain expertise, linguistic context, and any known accessibility needs of the target audience, and calibrate language output to these profiles.

4.3. A conforming system MUST prohibit the inclusion of model-internal terminology (feature names, coefficient values, algorithmic parameters, internal identifiers, or technical jargon from the model's training or inference pipeline) in any communication directed at a non-specialist affected person, unless accompanied by a plain-language equivalent explanation.
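The prohibition in 4.3 lends itself to a blocklist scan over outgoing notices. The sketch below uses a small hypothetical term list drawn from Scenario A; a real deployment would maintain a reviewed, versioned register of model-internal vocabulary per domain.

```python
import re

# Hypothetical blocklist; a production register would be
# domain-reviewed and versioned, not hard-coded.
MODEL_INTERNAL_TERMS = [
    r"z-score", r"logistic regression", r"coefficient",
    r"hyperplane", r"residual", r"feature importance",
]

_PATTERN = re.compile("|".join(MODEL_INTERNAL_TERMS), re.IGNORECASE)

def find_internal_terms(notice: str) -> list[str]:
    """Return the model-internal terms present in a draft notice."""
    return sorted({m.group(0).lower() for m in _PATTERN.finditer(notice)})

def violates_4_3(notice: str) -> bool:
    """True if the notice contains prohibited model-internal terminology."""
    return bool(find_internal_terms(notice))
```

A notice that trips the check is routed back for plain-language conversion rather than delivered.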

4.4. A conforming system MUST express all quantitative information in material notices using units, scales, and reference points that the target audience can be reasonably expected to understand, providing contextual anchoring (e.g., "Your debt-to-income ratio is 45%, which means that 45 pence of every pound you earn goes to debt payments — lenders typically look for this to be below 36%") rather than raw statistical outputs.
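The contextual-anchoring requirement in 4.4 can be implemented as a templating step that never emits a raw ratio without its plain-language meaning. This minimal sketch reproduces the debt-to-income example from the requirement text; the 36% lender benchmark is the illustrative figure from the source, not a normative value.

```python
def explain_dti(ratio: float, lender_max: float = 0.36) -> str:
    """Render a debt-to-income ratio with contextual anchoring (req. 4.4)."""
    pct = round(ratio * 100)
    return (
        f"Your debt-to-income ratio is {pct}%, which means that {pct} pence "
        f"of every pound you earn goes to debt payments - lenders typically "
        f"look for this to be below {round(lender_max * 100)}%."
    )
```

The same pattern generalises: every quantitative field in a material notice gets a dedicated renderer that pairs the number with what it means for the recipient.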

4.5. A conforming system MUST ensure that any rights, remedies, or procedural options communicated in a notice (such as appeal rights, reconsideration procedures, or complaint mechanisms) are explained in actionable terms — specifying what the person can do, how to do it, the deadline, and where to get help — rather than solely by legislative or regulatory citation.

4.6. A conforming system MUST validate plain-language compliance independently in each language in which the agent communicates, using language-appropriate readability metrics, and not rely on source-language validation carried through machine or human translation without target-language verification.
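The per-language validation in 4.6 reduces to running a language-appropriate scorer against a language-specific threshold for every translated notice, never inheriting the source-language result. The sketch below assumes a caller-supplied `score(text, lang)` hook, since each language needs its own metric (Flesch-Kincaid does not apply outside English); the threshold values are placeholders.

```python
# Hypothetical per-language thresholds; real deployments would use
# language-appropriate metrics (e.g. LIX or Wiener Sachtextformel for
# German) with thresholds set per audience profile.
LANGUAGE_THRESHOLDS = {"en": 8.0, "de": 8.0, "fr": 8.0, "it": 8.0}

def validate_all_languages(notices: dict[str, str], score) -> dict[str, bool]:
    """Validate each translated notice independently (req. 4.6).

    `score` is an assumed hook: (text, lang) -> readability grade.
    """
    results = {}
    for lang, text in notices.items():
        results[lang] = score(text, lang) <= LANGUAGE_THRESHOLDS[lang]
    return results
```

Applied to Scenario C, the English template would pass while the machine-translated German version would fail, blocking delivery until the translation is reworked.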

4.7. A conforming system MUST conduct periodic user-comprehension testing with representative members of each target audience to verify that material notices are actually understood, not merely scored as readable by automated metrics.

4.8. A conforming system SHOULD implement layered disclosure — providing a plain-language summary first, with progressively more detailed explanations available on request — to serve audiences with varying information needs without forcing all recipients through the same level of detail.

4.9. A conforming system SHOULD maintain a controlled vocabulary register for each domain and audience, defining approved plain-language terms and their mappings to technical concepts, and enforce the use of this register in all agent-generated communications.

4.10. A conforming system MAY implement real-time readability scoring during communication generation, rejecting or reformulating outputs that exceed the defined readability threshold before delivery to the affected person.

4.11. A conforming system MAY adapt language complexity dynamically based on interaction signals (e.g., comprehension questions, repeated requests for clarification, or declared accessibility preferences), provided the adaptation does not reduce the substantive completeness of the explanation.

5. Rationale

The right to an understandable explanation is both a regulatory requirement and an ethical precondition for meaningful human agency. An explanation that is technically present but practically incomprehensible does not serve the purposes of transparency — it creates a veneer of compliance that obscures the actual decision logic from the people affected by it. Plain-language obligations exist precisely because the asymmetry between the technical complexity of AI decision systems and the comprehension capabilities of affected persons is vast and growing.

Multiple regulatory frameworks mandate plain-language communication. The EU AI Act, Article 13, requires that information provided to users and affected persons be "concise, complete, correct and clear, relevant, accessible and comprehensible." The emphasis on "comprehensible" is not decorative — it is a distinct requirement above and beyond "complete" and "correct." A notice can be complete and correct yet incomprehensible; Article 13 requires all three properties simultaneously. In the United States, the Equal Credit Opportunity Act (Regulation B) and the Fair Credit Reporting Act require that adverse-action notices be understandable to the applicant. The UK Consumer Duty (FCA PS22/9) requires firms to communicate in a way that consumers can understand, and specifically requires testing of consumer understanding rather than assuming it. The EU Accessibility Directive and national accessibility legislation require that communications be accessible to persons with disabilities, including cognitive disabilities.

The risk of non-compliance is not merely regulatory. Incomprehensible explanations produce three categories of downstream harm. First, they deny affected persons the ability to exercise their rights. If a person cannot understand that they have the right to appeal, the right does not exist in practice. The disability-benefits scenario demonstrates this concretely: the right to mandatory reconsideration is legally present but practically eliminated by the notice format. Second, they erode trust. A person who receives an incomprehensible explanation reasonably infers that the organisation is trying to obscure the decision, whether or not that is the intent. Trust erosion is cumulative and asymmetric — one incomprehensible notice damages trust more than ten comprehensible ones restore it. Third, they create feedback-loop failures. If affected persons cannot understand explanations, they cannot identify errors in the decision — misclassified data, incorrect assumptions, or model errors that the person would spot if the explanation were comprehensible. The organisation loses the error-correction signal that comprehensible explanations provide.

The cross-border dimension adds a layer of complexity that many organisations underestimate. Readability is a property of a specific text in a specific language — it does not transfer through translation. A text that is plain in English may become complex in German because of German compound-noun formation, or ambiguous in French because of subjunctive mood requirements. Each language requires independent validation against language-appropriate readability metrics. Organisations operating across linguistic boundaries must treat each language as a separate plain-language obligation, not as a derivative of the source language.

Finally, the proliferation of AI agents across public services, financial products, healthcare, and employment creates a scale challenge. When a human case officer denies a benefits claim, they may (or may not) explain the denial in plain terms — but the harm is limited to one applicant at a time. When an AI agent generates denial notices at a rate of 500 per day using the same incomprehensible template, the harm scales linearly with throughput. The automation of communication makes plain-language governance more urgent, not less, because templated incomprehensibility operates at industrial scale.

6. Implementation Guidance

Plain-Language Duty Governance requires organisations to build comprehensibility into the communication pipeline — not as a post-hoc check but as an integral stage of output generation. The core principle is that no material communication leaves the system without validated evidence that the intended audience can understand it.
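One way to make comprehensibility an integral pipeline stage rather than a post-hoc check is a delivery gate that scores, reformulates, and ultimately escalates rather than sending an incomprehensible notice. The sketch below assumes two hooks that are not defined in this protocol: `score` (a readability metric per 4.1) and `reformulate` (a simplification step, e.g. an LLM rewrite subject to human review); the retry limit and threshold are illustrative.

```python
def deliver_notice(draft, score, reformulate,
                   max_grade: float = 8.0, max_attempts: int = 3) -> str:
    """Gate a notice on readability before delivery (reqs 4.1, 4.10).

    `score` and `reformulate` are assumed hooks supplied by the caller.
    """
    text = draft
    for _ in range(max_attempts):
        if score(text) <= max_grade:
            return text  # comprehensible enough to deliver
        text = reformulate(text)
    # Never deliver an incomprehensible notice; escalate instead.
    raise ValueError("Readability gate failed; route to human review")
```

The key design choice is the failure mode: when reformulation cannot reach the threshold, the notice goes to a human reviewer, never to the affected person.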

Recommended patterns:

- Build readability scoring into the generation pipeline as a blocking gate, not a post-hoc report (4.1, 4.10).
- Maintain audience profiles per communication type and calibrate language output to them (4.2).
- Enforce a controlled vocabulary register mapping technical concepts to approved plain-language terms (4.9).
- Use layered disclosure: a plain-language summary first, with progressively more detail on request (4.8).
- Validate every target language independently, with native-speaker review for high-stakes notices (4.6).
- Test actual comprehension with representative audience members, not just automated scores (4.7).

Anti-patterns to avoid:

- Auto-generating consumer notices directly from feature-importance or other model-internal output (Scenario A).
- Communicating rights and remedies solely by legislative or regulatory citation (Scenario B).
- Validating readability only in the source language and assuming translation preserves it (Scenario C).
- Treating adequate readability scores as proof of comprehension without user testing.

Industry Considerations

Financial Services. Adverse-action notices for credit, insurance, and investment decisions are the highest-stakes plain-language obligation. The UK Consumer Duty explicitly requires firms to test consumer understanding. US regulations (ECOA, FCRA) require specific explanatory content in adverse-action notices. Financial-services firms should maintain separate audience profiles for retail consumers, small-business owners, and sophisticated investors, with distinct readability thresholds for each. Quantitative financial information (interest rates, ratios, projections) requires contextual anchoring showing the consumer what the number means for their situation.

Public Sector. Benefits determinations, licensing decisions, and rights notifications affect vulnerable populations who may have lower literacy, cognitive impairments, or limited proficiency in the language of administration. Public-sector organisations should set the most stringent readability thresholds (Grade 5-6 or CEFR A2-B1) and should conduct comprehension testing with actual service users, including users with accessibility needs. The cost of incomprehensible public-sector communications falls disproportionately on those with the fewest resources to seek clarification.

Healthcare. Clinical explanations, treatment recommendations, and risk disclosures must be understandable to patients with varying health literacy. The health-literacy literature demonstrates that even well-educated individuals perform poorly on health-literacy assessments. Healthcare agents should use validated health-literacy frameworks and avoid the assumption that general literacy implies health literacy.

Cross-Border Operations. Organisations operating across linguistic boundaries face the compounding challenge described in Scenario C. Each language is a separate plain-language obligation. Organisations should invest in native-speaker review of material notices rather than relying solely on machine translation, particularly for high-stakes communications.

Maturity Model

Basic Implementation — The organisation has defined audience profiles for each material communication type. Automated readability scoring is integrated into the communication pipeline with blocking thresholds. Model-internal terminology is prohibited in consumer-facing outputs. A controlled vocabulary register exists for the primary domain. Readability validation is performed in the primary language. User-comprehension testing has been conducted at least once. This level meets the minimum mandatory requirements.

Intermediate Implementation — All basic capabilities plus: layered disclosure provides progressive detail on request. Multi-language readability validation is performed independently in every language. The technical-to-plain mapping layer is auditable, showing the transformation from internal explanation to consumer-facing output. User-comprehension testing is conducted twice per year with representative participants for each audience profile. Readability and comprehension metrics are tracked over time with trend analysis.

Advanced Implementation — All intermediate capabilities plus: real-time readability scoring reformulates outputs before delivery. Dynamic language adaptation responds to interaction signals. Comprehension testing includes participants with accessibility needs and cognitive impairments. The controlled vocabulary register is integrated into the generation pipeline, enforcing approved terminology automatically. Cross-language equivalence testing verifies that the same decision produces explanations with equivalent comprehensibility across all supported languages. Independent audit confirms plain-language compliance.

7. Evidence Requirements

Required artefacts:

Retention requirements:

Access requirements:

8. Test Specification

Test 8.1: Readability Threshold Enforcement

Test 8.2: Model-Internal Terminology Prohibition

Test 8.3: Audience Profile Calibration Verification

Test 8.4: Rights and Remedies Actionability

Test 8.5: Multi-Language Readability Equivalence

Test 8.6: User-Comprehension Validation

Test 8.7: Quantitative Information Contextual Anchoring

Conformance Scoring

9. Regulatory Mapping

| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 13 (Transparency and Provision of Information) | Direct requirement |
| EU AI Act | Article 86 (Right to Explanation of Individual Decision-Making) | Direct requirement |
| SOX | Section 404 (Internal Controls Over Financial Reporting) | Supports compliance |
| FCA | SYSC 6.1.1R (Systems and Controls) | Supports compliance |
| FCA | Consumer Duty (PS22/9) — Consumer Understanding Outcome | Direct requirement |
| NIST AI RMF | MAP 5.1, GOVERN 4.2 | Supports compliance |
| ISO 42001 | Clause 9.3 (Management Review Outputs) | Supports compliance |
| DORA | Article 11 (Communication) | Supports compliance |

EU AI Act — Article 13 (Transparency and Provision of Information)

Article 13 requires that high-risk AI systems be designed to ensure that their operation is "sufficiently transparent to enable users to interpret the system's output and use it appropriately." Recital 47 elaborates that information provided to users must be "concise, complete, correct and clear, relevant, accessible and comprehensible." The plain-language requirement directly implements the "comprehensible" element. An explanation that is complete, correct, and clear in a technical sense but incomprehensible to the affected person does not satisfy Article 13. Organisations must demonstrate not only that explanations are generated but that they are understandable to the intended recipient — a requirement that necessitates readability validation, audience profiling, and comprehension testing.

EU AI Act — Article 86 (Right to Explanation)

Article 86 grants affected persons the right to "clear and meaningful explanations" of decisions made by high-risk AI systems. "Meaningful" requires that the explanation actually conveys meaning to the recipient — not merely that it is formally produced. AG-451 operationalises this right by defining what "meaningful" requires: appropriate readability levels, audience calibration, contextual anchoring of quantitative information, and actionable communication of rights and remedies.

FCA Consumer Duty — Consumer Understanding Outcome

The FCA Consumer Duty (PS22/9) establishes the Consumer Understanding Outcome, requiring firms to "communicate in a way that equips consumers to make effective, timely and properly informed decisions." The Duty explicitly requires firms to test consumer understanding rather than assuming it. AG-451's user-comprehension testing requirement (4.7) directly implements this obligation. Firms cannot satisfy the Consumer Understanding Outcome by demonstrating that readability scores are adequate — they must demonstrate that consumers actually understand the communications, through testing with representative consumers.

SOX — Section 404 (Internal Controls Over Financial Reporting)

Where AI agents generate communications related to financial decisions (credit approvals, investment recommendations, insurance determinations), the quality of those communications is a control over the financial reporting process. Incomprehensible communications generate complaints, remediation costs, and litigation exposure that affect financial statements. SOX auditors may assess whether communication controls — including plain-language controls — are effective.

NIST AI RMF — MAP 5.1, GOVERN 4.2

MAP 5.1 addresses the impacts of AI systems on individuals and communities, including the quality of explanations provided. GOVERN 4.2 addresses organisational transparency practices. Both functions support the principle that AI communications must be understandable to affected persons. AG-451 provides the operational controls that implement these functions.

DORA — Article 11 (Communication)

DORA Article 11 requires financial entities to establish communication policies and procedures for ICT-related incidents and risks. Where AI agents are part of the ICT infrastructure, their communications to customers — including adverse-action notices and risk disclosures — fall within the scope of DORA communication requirements. Plain-language communication reduces the risk of miscommunication that could escalate operational incidents.

10. Failure Severity

| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Population-scale — affects every person who receives a material communication from the agent; harm concentrates on vulnerable populations with lower literacy, cognitive impairments, or limited language proficiency |

Consequence chain: Incomprehensible communications are generated and delivered to affected persons. The immediate impact is that recipients cannot understand decisions that materially affect them — they do not know why a decision was made, whether it is correct, or what they can do about it. The first-order downstream consequence is the denial of rights in practice: appeal rights, reconsideration opportunities, and complaint mechanisms that are communicated in incomprehensible terms are functionally inaccessible. The second-order consequence is error persistence: when affected persons cannot understand explanations, they cannot identify decision errors — incorrect data, misapplied criteria, or model failures that the person would spot if the explanation were clear. Errors that would be caught by an informed recipient propagate uncorrected. The third-order consequence is regulatory enforcement: regulators identify systematic incomprehensibility as a compliance failure, triggering remediation obligations, fines, and in severe cases, mandatory process suspension. The financial exposure scales with throughput — an AI agent generating 500 incomprehensible notices per day accumulates remediation liability at 500 times the rate of a single human case officer. The reputational consequence is severe and persistent: organisations perceived as using AI to obscure decisions from affected persons face lasting trust damage that extends beyond the specific regulatory finding.

Cross-references: AG-449 (Audience-Specific Explanation Governance) provides the audience-profiling framework that AG-451 consumes. AG-049 (Explainability Governance) establishes the foundational requirement that explanations exist; AG-451 adds the requirement that they be understandable. AG-452 (Counterfactual Explanation Governance) generates a specific type of explanation that must meet AG-451's plain-language standards. AG-453 (Adverse Action Notice Governance) defines the content requirements for adverse-action notices; AG-451 governs the language in which that content is expressed. AG-454 (AI Interaction Notice Placement Governance) governs where notices are placed; AG-451 governs how they are worded. AG-458 (Uncertainty Disclosure Threshold Governance) requires disclosure of uncertainty in terms that AG-451 mandates be comprehensible. AG-442 (Confidence Calibration Interface Governance) produces confidence metrics that must be communicated in plain language per AG-451. AG-048 (Cross-Border Data Sovereignty Governance) intersects with AG-451's multi-language requirements for cross-border operations.

Cite this protocol
AgentGoverning. (2026). AG-451: Plain-Language Duty Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-451