AG-628

Financial Vulnerability Protection Governance

Insurance, Credit & Lending · AGS v2.1 · April 2026
Applicable frameworks: EU AI Act · GDPR · SOX · FCA · NIST · ISO 42001

Section 2: Summary

This dimension governs how AI agents operating in insurance, credit, and consumer-lending contexts must detect, record, and respond to indicators of customer financial vulnerability — including but not limited to severe indebtedness, mental health disclosures, bereavement, unemployment, age-related cognitive decline, and language or literacy barriers. Heightened care in this context is not an optional customer-service enhancement; it is a structural control requirement because the consequences of automated decisions made without vulnerability awareness fall disproportionately on the people least equipped to challenge or absorb them, creating irreversible financial harm, regulatory censure, and systemic reputational damage. Failure manifests as an agent that continues standard underwriting, collections, or cross-sell workflows against a customer who has disclosed a terminal diagnosis, a customer who is demonstrably in a debt spiral, or a customer whose communication pattern strongly signals cognitive impairment — processing their interactions as routine transactions while every downstream output deepens their harm.

Section 3: Examples

Example 3.1 — Debt Collection Agent and Mental Health Disclosure

A consumer-finance AI agent is deployed by a mid-tier personal-loan provider to handle inbound collections contacts. A customer named in this record as C-1 calls to discuss a £4,200 outstanding balance that is 47 days past due. During the first 90 seconds of the call, C-1 discloses that they have recently been discharged from a psychiatric hospital following a suicide attempt and states that they are "not coping with the letters." The agent has no vulnerability detection module and no escalation pathway configured. It continues its standard collections script: confirms the outstanding balance, notes the accruing daily interest rate of 0.8%, advises that a County Court Judgment may be registered in 14 days, and offers a single repayment option. C-1 terminates the call without engaging. Three automated follow-up messages are sent over the following five days. C-1's guarantor later reports that C-1 experienced a second mental health crisis in the fortnight following the contact. The regulator's supervisory review of the lender's AI systems identifies this call sequence in a batch of 2,300 similar contacts — 14% of which contained an explicit vulnerability marker that the agent failed to flag. The lender faces a formal investigation, a voluntary redress programme costing approximately £780,000, and mandatory re-architecture of its collections AI infrastructure. The harm to C-1 is irreversible; financial redress is made but the causal link between the agent's conduct and the mental health deterioration is documented in the regulatory record.

Example 3.2 — Insurance Renewal Agent and Cognitive Impairment Indicators

An AI agent deployed by a household insurer handles automated renewal conversations for approximately 18,000 policyholders annually. One policyholder, P-1, is 84 years old and has been a customer for 22 years. During a renewal call, P-1 asks the same clarifying question about their contents cover four times in six minutes, confirms agreement to a premium increase from £412 to £689 per year, and then states "I'm not quite sure what I'm agreeing to but you sound very clear about it." The agent's sentiment-analysis layer flags no anomaly because the customer's tone is calm. No age-related repetition pattern analysis is configured. The renewal is processed. P-1's daughter, who holds lasting power of attorney for financial matters, contacts the insurer six weeks later. She has identified that P-1 was subsequently enrolled in a legal-expenses add-on and home-emergency cover — neither of which P-1 required — at an additional £23.50 per month. Upon audit, it is found that the agent offered these products during the same call using a standard cross-sell module that was not suppressed for customers exhibiting repetition or comprehension-uncertainty language patterns. The Financial Ombudsman receives a complaint. The insurer refunds 14 months of premiums across similar cases — a total of £61,400 — and must document why its AI did not have a "vulnerability pause" mechanism for customers over 80 or for any customer who explicitly states uncertainty about what they are agreeing to.

Example 3.3 — Mortgage Pre-Approval Agent and Income Shock Disclosure

A digital mortgage broker deploys an AI agent to handle affordability pre-screening for refinancing applications. Applicant A-1 is midway through a standard income-and-expenditure dialogue when they volunteer that they were made redundant three weeks ago, that they are managing on a £2,300 severance payment, and that they "need to sort out the mortgage quickly before the bank finds out." The agent — designed to gather declared income and expenditure fields and output a pre-approval band — continues the workflow, records £0 for current employment income, and because A-1's declared property equity exceeds the loan-to-value threshold, outputs a pre-approval estimate of £187,000 at a 5.4% rate and recommends proceeding to full application. No vulnerability flag is raised; no referral to a human mortgage adviser is triggered; no information about debt advice organisations or free independent money guidance is provided. A-1 proceeds to full application, incurs a £375 valuation fee, and is declined at underwriting stage. A-1 subsequently defaults on their existing mortgage. The broker's FCA-authorised status comes under review because its AI pre-screening process demonstrably failed the "fair treatment of customers" obligation by generating an output — the pre-approval estimate — that was objectively misleading given disclosed circumstances, and by failing to route A-1 to appropriate support resources before generating that output. The £375 fee is refunded under regulatory direction. The broker must remediate 6,400 historical cases to determine whether similar vulnerability-blind pre-approvals were made.

Section 4: Requirement Statement

4.0 Scope

This dimension applies to all AI agents — including conversational agents, scoring agents, document-processing agents, and multi-step agentic pipelines — that interact with or make decisions about retail customers in the Insurance, Credit, and Lending landscape. It applies at every touchpoint in the customer journey: origination, underwriting, servicing, renewal, collections, claims handling, and cross-sell. It applies regardless of whether the customer interaction is real-time (voice, chat) or asynchronous (email, document submission). The dimension does not apply to purely institutional or wholesale counterparty interactions where no retail consumer is a party to the decision. Where an agent processes information about a customer on behalf of a human adviser, the agent remains in scope for all detection and flagging requirements even if the final decision rests with the human.

Vulnerability, for the purposes of this dimension, encompasses any circumstance — temporary or permanent — that may compromise a customer's ability to make informed, free, and rational financial decisions or that increases the potential severity of harm from a financial product or decision. This includes, non-exhaustively: mental health conditions, cognitive impairment, serious illness or bereavement of a close family member, significant income shock, recent job loss, severe over-indebtedness, language or literacy barriers, age-related factors (both youth and advanced age), domestic abuse or financial coercion, and addiction.

4.1 Vulnerability Signal Detection

4.1.1 The agent MUST maintain an active, runtime-evaluated vulnerability detection layer that monitors all available input signals — including explicit verbal or written disclosures, linguistic pattern indicators, interaction anomalies, and structured data fields — for markers of customer vulnerability throughout the entire duration of each customer interaction.

4.1.2 The agent MUST treat an explicit customer disclosure of any of the following as an automatic vulnerability trigger: mental health condition or crisis, suicidal ideation or reference to self-harm, terminal or serious illness (self or close family), bereavement (within 12 months as declared), redundancy or income loss (within 90 days as declared), disclosure of domestic abuse, or statement of incapacity to understand the current interaction.

4.1.3 The agent MUST maintain a minimum detection sensitivity threshold such that, across test cohorts representative of real customer populations, the false-negative rate for explicit verbal or written vulnerability disclosures does not exceed 2% on any rolling 90-day evaluation window.
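The 4.1.3 threshold check reduces to a simple rolling-window computation over labelled evaluation cases. The following Python sketch is illustrative only; the `EvalCase` shape, its field names, and the `breaches_threshold` helper are assumptions for demonstration, not a mandated schema:

```python
from dataclasses import dataclass

@dataclass
class EvalCase:
    """One labelled interaction from the rolling 90-day evaluation cohort."""
    has_explicit_disclosure: bool   # ground truth: an explicit disclosure occurred
    agent_flagged: bool             # what the detection layer actually did

def false_negative_rate(cases):
    """False-negative rate computed over explicit-disclosure cases only."""
    positives = [c for c in cases if c.has_explicit_disclosure]
    if not positives:
        return 0.0
    missed = sum(1 for c in positives if not c.agent_flagged)
    return missed / len(positives)

def breaches_threshold(cases, threshold=0.02):
    """True if the window's FN rate exceeds the 2% ceiling in 4.1.3."""
    return false_negative_rate(cases) > threshold
```

Note that the denominator is disclosure-positive cases, not all interactions: diluting the rate with non-disclosure traffic would mask detection failures.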

4.1.4 The agent SHOULD detect non-explicit vulnerability indicators including but not limited to: repeated identical questions within a single session (three or more repetitions of substantively the same question), expressed uncertainty about comprehension of stated terms, abnormally high response latency patterns inconsistent with the customer's prior interaction history, and emotional distress markers (crying sounds in voice channels, distress language in text).
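The repetition indicator in 4.1.4 (three or more substantively identical questions in a session) can be approximated without any ML component. This is a deliberately crude sketch using a word-overlap coefficient; the function names, the 0.5 similarity threshold, and the absence of stemming or stop-word handling are all simplifying assumptions a production detector would need to revisit:

```python
import re

def _words(text: str) -> frozenset:
    """Lowercase an utterance and return its word set."""
    return frozenset(re.findall(r"[a-z']+", text.lower()))

def _similar(a: frozenset, b: frozenset, threshold: float = 0.5) -> bool:
    """Overlap coefficient |a & b| / min(|a|, |b|) as a crude proxy for
    'substantively the same question'."""
    if not a or not b:
        return False
    return len(a & b) / min(len(a), len(b)) >= threshold

def repetition_indicator(questions, min_repetitions: int = 3) -> bool:
    """True if any question recurs min_repetitions or more times in a session."""
    sets = [_words(q) for q in questions]
    for i, ref in enumerate(sets):
        count = 1 + sum(1 for other in sets[i + 1:] if _similar(ref, other))
        if count >= min_repetitions:
            return True
    return False
```

The overlap coefficient is intentionally generous: in line with the harm asymmetry discussed in Section 5.2, this layer should err toward false positives.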

4.1.5 The agent MAY use demographic data — such as age bands — to adjust its detection sensitivity, provided that such adjustment is applied to increase protective attention and is never applied to reduce detection sensitivity or to infer vulnerability in the absence of supporting behavioural signals.

4.2 Workflow Modification Upon Vulnerability Trigger

4.2.1 Upon any vulnerability trigger identified under 4.1.2, the agent MUST immediately pause any active collections, cross-sell, upsell, or adverse-decision workflow and MUST NOT resume that workflow within the same interaction session without explicit confirmation from a qualified human agent that resumption is appropriate.

4.2.2 The agent MUST provide the vulnerable customer with clear, accessible information about at least one relevant free, independent support resource — such as a national debt advice organisation, a mental health crisis line, or a money guidance service — before taking any further action in the interaction.

4.2.3 The agent MUST NOT generate any financial output — including pre-approval estimates, premium quotes, credit limit recommendations, or repayment schedules — for a customer who has triggered a 4.1.2 vulnerability marker within the same interaction session until a human review has been completed and a qualified human has authorised the continuation of the financial workflow.

4.2.4 The agent MUST create a vulnerability record in the customer's file within the same interaction session, documenting: the trigger type, the timestamp, the verbatim or near-verbatim text or transcription of the triggering disclosure, the actions taken, and the identity or reference of any human agent to whom the case was referred.
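A minimal shape for the 4.2.4 record, sketched in Python. The class and field names here are illustrative assumptions; what matters is that every field enumerated in 4.2.4 is captured and serialisable into the vulnerability event log:

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class VulnerabilityRecord:
    """Illustrative shape for the 4.2.4 vulnerability record."""
    customer_ref: str
    trigger_type: str                 # e.g. "mental_health_disclosure"
    triggering_text: str              # verbatim or near-verbatim transcription
    actions_taken: list               # e.g. ["paused_collections", ...]
    human_referral_ref: Optional[str] = None
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialise for the per-customer, per-interaction event log."""
        return json.dumps(asdict(self), sort_keys=True)
```

Writing the record within the same interaction session, as 4.2.4 requires, means this serialisation must happen on the hot path, not in a batch job.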

4.2.5 Where the agent detects a non-explicit indicator under 4.1.4 but not a 4.1.2 explicit trigger, the agent SHOULD modify its interaction pace, increase the clarity and simplicity of its language, offer a human escalation option prominently, and SHOULD NOT progress to any product recommendation or adverse decision step without first offering the customer an explicit opportunity to speak with a human.

4.2.6 The agent MUST NOT use a customer's disclosed vulnerability circumstance — including mental health disclosure, income shock, or bereavement — as a predictor or feature in any credit-scoring, underwriting, or pricing model unless such use is explicitly required by actuarial soundness regulation and has been reviewed and approved under the organisation's model governance framework.

4.3 Continuity and Persistence of Vulnerability Status

4.3.1 The agent MUST propagate a vulnerability flag to all downstream systems and agents involved in the same customer journey when a vulnerability record has been created under 4.2.4, such that no subsequent automated touchpoint processes the customer without awareness of the flag.

4.3.2 The agent MUST respect a vulnerability flag for a minimum retention period of 12 months from the date of the triggering disclosure, unless a qualified human supervisor reviews and formally removes the flag with documented rationale.

4.3.3 The agent MUST NOT allow a customer's vulnerability flag to expire silently through automated TTL (time-to-live) mechanisms without human-supervised review.
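The combined effect of 4.3.2 and 4.3.3 is that elapsed time alone never deactivates a flag; it only makes the flag eligible for human review. A minimal Python sketch of that invariant, with assumed function names:

```python
from datetime import datetime, timedelta, timezone

MIN_RETENTION = timedelta(days=365)  # 12-month minimum in 4.3.2

def flag_is_active(created_at, human_closure_authorised: bool,
                   now=None) -> bool:
    """A flag stays active until a human formally closes it.

    created_at and now are accepted but deliberately unused: per 4.3.3,
    no amount of elapsed time deactivates a flag on its own.
    """
    return not human_closure_authorised

def eligible_for_review(created_at, now=None) -> bool:
    """True once the 12-month minimum retention has elapsed."""
    now = now or datetime.now(timezone.utc)
    return now - created_at >= MIN_RETENTION
```

The point of the sketch is what is absent: there is no code path in which a TTL expiry flips `flag_is_active` to False.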

4.3.4 Where a customer contacts the organisation through a different channel than the one in which the vulnerability was first flagged, the agent MUST retrieve and honour the existing vulnerability flag before initiating any interaction workflow.

4.4 Communication Standards for Vulnerable Customers

4.4.1 When operating in a confirmed-vulnerability state for a customer, the agent MUST apply communication adjustments including: reduced interaction pace, plain-language output at a maximum reading-complexity equivalent to approximately a UK Year 8 or US Grade 8 level, avoidance of multi-part conditional statements, and explicit confirmation checkpoints at each material decision point.
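The Grade 8 reading-level ceiling in 4.4.1 can be gated mechanically with a readability score. The sketch below uses the standard Flesch-Kincaid grade formula with a rough vowel-group syllable heuristic; the heuristic, the helper names, and the use of Flesch-Kincaid specifically (rather than another readability measure) are assumptions, and a production gate would want a validated readability library:

```python
import re

def _syllables(word: str) -> int:
    """Rough vowel-group syllable count; adequate for a coarse gate."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1 and not word.endswith(("le", "ee")):
        count -= 1  # silent final 'e'
    return max(count, 1)

def flesch_kincaid_grade(text: str) -> float:
    """US grade level: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words)) - 15.59)

def meets_plain_language_gate(text: str, max_grade: float = 8.0) -> bool:
    """Gate candidate agent output at approximately Grade 8 per 4.4.1."""
    return flesch_kincaid_grade(text) <= max_grade
```

A gate like this runs on candidate output before it is sent; text that fails is regenerated with a simplification instruction rather than delivered as-is.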

4.4.2 The agent MUST NOT use urgency framing — including countdown language, time-limited offer constructs, or imminent-consequence warnings — in any interaction with a customer whose vulnerability flag is active.

4.4.3 The agent MUST offer, at least every third interaction step with a vulnerability-flagged customer, an option to pause, continue later, or speak with a human — without making any inference about the customer's likelihood of choosing those options.

4.4.4 Where the agent is operating in a multilingual or multi-jurisdiction environment, the agent MUST detect when a customer is communicating with apparent language difficulty and MUST treat this as a non-explicit vulnerability indicator under 4.1.4, triggering the adjustments in 4.2.5.

4.5 Collections and Enforcement Constraints

4.5.1 When a customer's vulnerability flag is active, the agent MUST NOT initiate or escalate automated collections contact — including payment demand messages, default notices, legal-action warnings, or credit bureau notification warnings — without a human review step authorising that contact.

4.5.2 The agent MUST apply a minimum 72-hour cooling-off window following any interaction in which a 4.1.2 vulnerability trigger was first detected, during which no outbound automated contact of any kind is initiated toward that customer.
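The 72-hour cooling-off window in 4.5.2 is a pure time comparison and should be implemented as one, keeping the constant in a single place. A minimal sketch, with assumed names:

```python
from datetime import datetime, timedelta, timezone

COOLING_OFF = timedelta(hours=72)  # minimum window in 4.5.2

def outbound_contact_permitted(trigger_detected_at: datetime,
                               proposed_send_at: datetime) -> bool:
    """No automated outbound contact inside the 72-hour window."""
    return proposed_send_at - trigger_detected_at >= COOLING_OFF

def next_permitted_contact(trigger_detected_at: datetime) -> datetime:
    """Earliest time an automated outbound message may be scheduled."""
    return trigger_detected_at + COOLING_OFF
```

Scheduling systems should call `next_permitted_contact` when queuing follow-ups, so that the window is enforced at enqueue time rather than checked (and possibly missed) at send time. Timestamps should be timezone-aware throughout to avoid off-by-hours errors across channels.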

4.5.3 The agent MUST be capable of connecting a customer in active collections contact to a human agent within 60 seconds if the customer requests human assistance at any point, or if a vulnerability trigger is detected during the collections interaction.

4.5.4 Where a customer has triggered a vulnerability marker and has also disclosed or demonstrated inability to engage with a repayment arrangement, the agent SHOULD generate a referral recommendation to an appropriate external debt advice service rather than continuing to cycle through internal repayment options.

4.6 Claims Handling Constraints

4.6.1 In insurance claims contexts, where a claim is inherently associated with a vulnerability-indicative event — including death-of-policyholder claims, critical-illness claims, accident-and-emergency claims, and domestic-abuse-related property claims — the agent MUST apply heightened-care protocols from the first moment of the claim interaction without requiring an explicit disclosure trigger.

4.6.2 The agent MUST NOT decline, defer, or issue a counter-offer on a claim where the claimant is actively presenting a 4.1.2 vulnerability marker, except by escalation to and authorisation by a qualified human claims manager.

4.6.3 The agent MUST provide a clear, human-readable written summary of any claims decision outcome to a vulnerability-flagged customer in addition to any standard digital communication, unless the customer has explicitly opted out of written communications.

4.7 Data Governance for Vulnerability Information

4.7.1 The agent MUST treat all vulnerability-related data — including transcripts containing disclosures, vulnerability flags, and referral records — as special-category sensitive data and MUST apply access controls equivalent to those applied to health or financial hardship data under applicable data protection regulation.

4.7.2 The agent MUST NOT share vulnerability-related data with third parties — including data brokers, credit reference agencies, or affiliate lenders — except where explicitly required by law or where the customer has provided informed, specific consent.

4.7.3 The agent MUST support customer subject access requests for their vulnerability record, including the full history of flags, triggers, and actions taken, within the response timescales mandated by applicable data protection regulation.

4.8 Human Oversight and Escalation Architecture

4.8.1 The agent MUST have a defined escalation pathway to a qualified human agent for every vulnerability scenario covered in 4.1.2, and that pathway MUST be tested and operational at all times the agent is in production.

4.8.2 The agent MUST be capable of reaching a human escalation destination within a maximum latency of 120 seconds from the point of escalation trigger, during business operating hours. Outside business hours, the agent MUST provide the customer with a guaranteed callback time and a reference number.

4.8.3 The organisation deploying the agent MUST maintain documented evidence that human agents responsible for receiving vulnerability escalations are trained in vulnerability-sensitive financial services communication and that this training is refreshed at a minimum annually.

4.8.4 The agent MUST log every instance in which an escalation pathway was triggered and every instance in which it was triggered but the escalation could not be completed within the maximum latency threshold, creating a breach record for operational review.

4.9 Prohibited Behaviours

4.9.1 The agent MUST NOT, under any circumstances, use vulnerability indicators — whether disclosed or inferred — to target a customer for a higher-cost product, a lower credit limit offer designed to restrict rather than support, or any other commercially advantageous action for the deploying organisation that is adverse to the customer's financial interest.

4.9.2 The agent MUST NOT generate language that minimises, dismisses, or creates friction around a customer's disclosure of vulnerability, including responses that immediately redirect the customer back to the financial task before acknowledging the disclosure.

4.9.3 The agent MUST NOT operate in a configuration where vulnerability detection is disabled, degraded, or bypassed for performance or cost-efficiency reasons. Any configuration change that affects vulnerability detection capability MUST go through the organisation's model change management process and MUST require sign-off from a responsible senior compliance officer.

Section 5: Rationale

5.1 Why Structural Control Is Required

Financial vulnerability in the insurance, credit, and lending context is not a rare edge case. Industry-wide research consistently shows that between 24% and 50% of adult retail financial services customers exhibit at least one characteristic associated with vulnerability at any given time — with the proportion rising significantly during periods of macroeconomic stress such as inflationary shocks, unemployment spikes, or health emergencies. An AI agent that processes customer interactions at scale without vulnerability awareness is therefore not encountering vulnerability rarely; it is encountering it thousands of times per month and systematically failing to respond appropriately.

Behavioural controls alone — training human oversight teams to catch vulnerability cases that AI systems miss — are structurally inadequate because they rely on the human oversight function having visibility of interactions that the AI has already handled. In automated collections, digital mortgage pre-screening, and insurance renewal pipelines, a significant proportion of interactions complete without any human review. The only architecturally sound approach is to embed detection and response capability within the agent itself, making vulnerability-aware behaviour a property of the system rather than a downstream quality-assurance aspiration.

5.2 The Asymmetry of Harm

The harm asymmetry in vulnerability-blind financial AI systems runs in a consistent direction: the costs of false negatives — failing to detect vulnerability when it is present — fall on the customer, while the costs of false positives — treating a non-vulnerable customer with heightened care — are primarily operational (slightly longer interaction times, lower cross-sell conversion rates in the session). This asymmetry is fundamental to understanding why the detection sensitivity thresholds in 4.1.3 are calibrated toward sensitivity rather than specificity, and why the workflow modification requirements in 4.2 are designed to be conservative. An agent that occasionally offers a human escalation option to a customer who does not need it has incurred a minor operational cost. An agent that fails to pause its collections script when a customer discloses suicidal ideation has contributed to potential catastrophic harm.

5.3 Regulatory Obligation Structure

This dimension's requirements are not solely derived from risk management principles; they are anchored in binding regulatory obligations that apply across the major jurisdictions in which these landscapes operate. The UK's Consumer Duty (PS22/9) requires regulated firms to act to deliver good outcomes for all retail customers, with explicit reference to customers in vulnerable circumstances requiring proportionate additional care. The EU's Consumer Credit Directive recast (2023/2225) introduces responsible lending obligations that require creditworthiness assessment to be conducted in the customer's genuine interest. The FCA's Guidance for Firms on the Fair Treatment of Vulnerable Customers (FG21/1) establishes that firms must identify vulnerability and respond to it — and that automated systems operating on behalf of firms inherit those obligations. An AI agent that does not meet the requirements of this dimension exposes its deploying organisation to regulatory enforcement, not merely reputational risk.

5.4 Why Prevention Is More Effective Than Detection-and-Remediation

Post-hoc remediation of vulnerability-related AI failures in financial services is structurally expensive in a way that prevention is not. Once an automated collections cycle has run against a vulnerable customer, the harm has occurred. Redress programmes require case-by-case review, customer tracing, individual financial remediation, and — where regulatory investigation is triggered — supervised implementation. The three examples in Section 3 each resulted in remediation costs in the hundreds of thousands of pounds plus irreversible operational disruption. The engineering and operational cost of building compliant vulnerability detection and workflow modification capability is significantly lower than the cost of a single significant remediation programme. Prevention-oriented control architecture is not merely ethically correct; it is economically rational.

Section 6: Implementation Guidance

6.1 Implementation Patterns

Pattern 1 — Layered Signal Architecture. Build vulnerability detection as a layered pipeline rather than a single classifier. The first layer handles explicit keyword and phrase matching for the 4.1.2 mandatory trigger categories using a curated, regularly updated lexicon that includes colloquial and indirect expressions ("I can't cope," "things have got really bad," "since my husband died") in addition to clinical language. The second layer applies conversational pattern analysis — repetition detection, comprehension-uncertainty phrase identification, response latency profiling in real-time voice channels. The third layer applies contextual enrichment from structured data, such as claim type, account age, and declared age bands. Triggers from any layer independently initiate the vulnerability response; the multi-layer design is intended to maximise detection, not to require confluence across layers before acting.
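The "any layer independently triggers" property of Pattern 1 can be sketched in a few lines of Python. The lexicon entries and field names below are illustrative assumptions (a real lexicon is curated and versioned, per the pattern), and the conversational-pattern layer is omitted for brevity:

```python
# Illustrative lexicon only; a production lexicon is curated, versioned,
# and regularly reviewed (see Anti-Pattern 5).
EXPLICIT_LEXICON = {
    "mental_health_crisis": ["can't cope", "not coping", "self-harm"],
    "bereavement": ["passed away", "since my husband died"],
    "income_shock": ["made redundant", "lost my job"],
}

def layer_explicit(utterance: str):
    """Layer 1: phrase matching for 4.1.2 mandatory trigger categories."""
    text = utterance.lower()
    return [cat for cat, phrases in EXPLICIT_LEXICON.items()
            if any(p in text for p in phrases)]

def layer_contextual(structured: dict):
    """Layer 3: enrichment from structured data such as claim type."""
    hits = []
    if structured.get("claim_type") in {"death_of_policyholder",
                                        "critical_illness"}:
        hits.append("claim_context")
    return hits

def detect(utterance: str, structured: dict):
    """Any layer independently initiates the vulnerability response:
    results are unioned, never ANDed across layers."""
    triggers = layer_explicit(utterance) + layer_contextual(structured)
    return {"vulnerable": bool(triggers), "triggers": triggers}
```

The design decision the sketch encodes is the union in `detect`: requiring agreement between layers before acting would reintroduce exactly the false-negative risk the layering is meant to eliminate.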

Pattern 2 — Vulnerability State Machine. Model the customer's vulnerability status as an explicit finite state machine with clearly defined states (No Flag, Soft Indicator, Active Flag, Review-Pending, Flag Retained, Flag Formally Closed), defined transition conditions for each state change, and a requirement that any transition from Active Flag to any less-protected state requires a human-authorised event. This makes the vulnerability status auditable, persistent, and resistant to accidental reset by session boundaries or system restarts.
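A minimal Python sketch of the state machine, assuming one plausible transition table; which de-escalations exist, and exactly which require human authorisation, is an organisational design decision beyond the invariant shown here (no transition out of a protected state without a human-authorised event):

```python
class VulnerabilityStateMachine:
    """Explicit FSM for a customer's vulnerability status (Pattern 2)."""

    # (from_state, to_state) -> whether human authorisation is required.
    # Illustrative table: de-escalations out of review all require a human.
    TRANSITIONS = {
        ("NO_FLAG", "SOFT_INDICATOR"): False,
        ("NO_FLAG", "ACTIVE_FLAG"): False,
        ("SOFT_INDICATOR", "ACTIVE_FLAG"): False,
        ("ACTIVE_FLAG", "REVIEW_PENDING"): False,
        ("REVIEW_PENDING", "FLAG_RETAINED"): True,
        ("REVIEW_PENDING", "FLAG_CLOSED"): True,
        ("FLAG_RETAINED", "REVIEW_PENDING"): False,
    }

    def __init__(self):
        self.state = "NO_FLAG"

    def transition(self, to_state: str, human_authorised: bool = False):
        key = (self.state, to_state)
        if key not in self.TRANSITIONS:
            raise ValueError(f"illegal transition {self.state} -> {to_state}")
        if self.TRANSITIONS[key] and not human_authorised:
            raise PermissionError(
                f"{self.state} -> {to_state} requires human authorisation")
        self.state = to_state
```

Because undefined transitions raise rather than silently no-op, a session boundary or system restart cannot quietly reset the state: persistence of `state` plus an explicit table is what makes the status auditable.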

Pattern 3 — Suppression-First Workflow Architecture. Design the agent's workflow engine so that the default response to an Active Flag is suppression of all commercial and collections actions unless a specific human-authorised exception is present. This inverts the default: instead of the vulnerability response module needing to actively interrupt a running workflow, the commercial workflow cannot run without checking for the absence of an active vulnerability flag. Suppression-first architectures are significantly more robust to edge cases and integration failures than interrupt-based architectures.
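The inversion Pattern 3 describes can be made concrete in a few lines. In this hedged Python sketch (action names and the flag/exception shapes are assumptions), the guard lives inside the action runner itself, so a commercial action cannot execute without the flag check having happened:

```python
# Actions that are suppressed by default when a vulnerability flag is active.
COMMERCIAL_ACTIONS = {"cross_sell", "upsell", "collections_contact",
                      "adverse_decision", "premium_quote"}

def run_action(action: str, customer_flags: dict,
               human_authorised_exceptions: set):
    """Suppression-first: a commercial action runs only after proving the
    absence of an active flag, or a specific human-authorised exception."""
    if action in COMMERCIAL_ACTIONS:
        flagged = customer_flags.get("vulnerability_active", False)
        if flagged and action not in human_authorised_exceptions:
            return {"executed": False,
                    "reason": "suppressed_by_vulnerability_flag"}
    return {"executed": True, "reason": None}
```

Note that a missing or unreadable flag store should fail toward suppression in production; the sketch's `.get(..., False)` default is the opposite, and is one of the integration edge cases the pattern exists to surface.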

Pattern 4 — Resource Library with Jurisdiction Mapping. Maintain a curated, regularly reviewed library of support resources (debt advice, mental health, bereavement, domestic abuse) mapped to customer jurisdiction and communication channel. When 4.2.2 is triggered, the resource selection is automatic and jurisdiction-appropriate rather than relying on the agent to generate resource references ad hoc, which risks hallucination or outdated information. In the UK, this library should include National Debtline, StepChange, Samaritans, MoneyHelper, and Refuge as baseline entries. Update the library at a minimum every six months.
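A lookup-based sketch of the resource library in Python. The jurisdiction codes, category keys, and entries below are illustrative placeholders for the curated library the pattern describes; the important property is that the resource is retrieved, never generated by the model:

```python
# Illustrative entries only; the real library is curated, jurisdiction-
# mapped, and reviewed at a minimum every six months.
RESOURCE_LIBRARY = {
    ("UK", "debt"): "National Debtline / StepChange",
    ("UK", "mental_health"): "Samaritans",
    ("UK", "money_guidance"): "MoneyHelper",
    ("UK", "domestic_abuse"): "Refuge",
}

def support_resource(jurisdiction: str, trigger_category: str):
    """Deterministic resource selection for 4.2.2.

    Fails safe: a missing entry escalates to a human rather than letting
    the model invent (and potentially hallucinate) a resource reference.
    """
    resource = RESOURCE_LIBRARY.get((jurisdiction, trigger_category))
    if resource is None:
        return {"resource": None, "action": "escalate_to_human"}
    return {"resource": resource, "action": "present_before_further_steps"}
```

The `escalate_to_human` fallback is the design point: an empty lookup is treated as a coverage gap to be reported, not a prompt for the agent to improvise.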

Pattern 5 — Vulnerability-Aware Evaluation Datasets. Construct dedicated adversarial evaluation datasets for vulnerability detection that include: direct disclosures, indirect disclosures, disclosures embedded mid-task, disclosures made in distressed or fragmented language, disclosures in non-standard dialects or with language-barrier indicators, and control cases where no vulnerability is present but emotive language might create false positives. Run these evaluations on every model version update and at minimum quarterly for deployed models.

Pattern 6 — Human Escalation Circuit Testing. Test the human escalation pathway end-to-end — including handoff data completeness, context transfer to the human agent, and maximum latency compliance — at a minimum weekly using automated synthetic test cases, and monthly using live test cases run by quality-assurance staff operating as customers.

6.2 Anti-Patterns

Anti-Pattern 1 — Vulnerability Detection as Post-Processing. Do not implement vulnerability detection as a post-hoc review of completed interaction transcripts. Detection must occur in real time, or in the case of asynchronous interactions such as email or document submission, within the processing pipeline before any output is generated. Post-processing detection is valuable for quality assurance and audit but cannot substitute for runtime detection because the harmful output has already been produced.

Anti-Pattern 2 — Single-Trigger Threshold Without Continuous Monitoring. Do not implement a design in which vulnerability detection runs only at the start of an interaction, or only at defined interaction milestones. Vulnerability disclosures occur at unpredictable points in a conversation — often when the customer has built sufficient rapport or when the topic of the financial interaction has directly evoked their circumstances. Monitoring must be continuous throughout the session.

Anti-Pattern 3 — Undifferentiated Escalation. Do not design the escalation pathway so that all vulnerability escalations go to a general customer-service queue without vulnerability context. Human agents receiving escalations must receive: the triggering statement or indicator, the customer's current interaction context, the vulnerability flag type, and any relevant account history. Context-free escalations result in customers having to re-disclose their vulnerability to the human agent, which is harmful and inefficient.

Anti-Pattern 4 — Vulnerability as a Conversion Barrier Only. Do not implement vulnerability flagging in a configuration where the primary operational effect is the suppression of cross-sell conversion metrics without corresponding activation of support and care protocols. The purpose of vulnerability flagging is customer protection, not commercial pipeline management. Organisations that implement the suppression side of vulnerability controls without the care and referral side are in non-compliance with this dimension and with underlying regulatory obligations.

Anti-Pattern 5 — Hardcoded Vulnerability Lexicon Without Review. Do not deploy a vulnerability detection lexicon that is hardcoded at deployment time and not subject to regular review. Language patterns associated with vulnerability evolve; colloquial expressions in particular shift over time and vary by demographic group. A lexicon that was comprehensive at deployment may have material gaps 18 months later. Implement a formal review and update process.

Anti-Pattern 6 — Silent Flag Expiry. Do not configure vulnerability flags with automated expiry times (TTL settings) that cause flags to disappear without human review. Vulnerability circumstances frequently persist for months or years. A flag that expires automatically after 30 or 90 days may remove protection from a customer who remains in the triggering circumstance.

6.3 Industry-Specific Considerations

In insurance claims handling, the first-notice-of-loss interaction is a particularly high-risk point. Customers reporting death-of-policyholder, critical illness, or accident claims are by definition interacting from a position of acute stress. Agents handling these interaction types should operate in a heightened-care mode by default for the first interaction, without requiring a vulnerability trigger to be detected. In consumer credit origination, the application interview — particularly in digital or telephony channels — is the highest-probability point for vulnerability disclosure, as customers often contextualise their financing need in terms of their circumstances. In mortgage processing, the income-and-expenditure module is the most likely point of income-shock disclosure and requires specific prompt engineering to ensure that disclosures within that module are not processed purely as data fields.

6.4 Maturity Model

Level 1 — Baseline. Explicit 4.1.2 trigger detection active, basic workflow pause implemented, human escalation pathway operational. Vulnerability records created and retained. Resource library deployed for at least three of the six trigger categories.

Level 2 — Managed. Non-explicit indicator detection active (4.1.4). Vulnerability state machine implemented. Suppression-first workflow architecture in place. Jurisdiction-mapped resource library complete. Quarterly evaluation against vulnerability test dataset.

Level 3 — Advanced. Full propagation of vulnerability flags across all downstream systems and channels. Cooling-off period automation validated end-to-end. Collections-specific constraints (4.5) fully implemented. Claims-specific heightened-care defaults (4.6) in operation. Human escalation latency monitored in real time with breach alerting.

Level 4 — Optimised. Predictive vulnerability risk scoring using longitudinal behavioural patterns — with strict ethical review and model governance. Closed-loop feedback from human escalation outcomes used to continuously improve detection model. Proactive outreach to customers with persistent vulnerability flags offering support referrals. Full cross-channel vulnerability status synchronisation with sub-5-second propagation latency.
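The vulnerability state machine required at Level 2 can be sketched as a table of permitted transitions with an audit record emitted on every accepted change, matching the lifecycle audit trail in 7.1. The state names below are assumptions: the protocol mandates the mechanism, not this vocabulary.

```python
from datetime import datetime, timezone

# Permitted lifecycle transitions (illustrative state names).
TRANSITIONS = {
    "detected":     {"active"},            # trigger confirmed, flag raised
    "active":       {"under_review"},      # queued for human review
    "under_review": {"active", "closed"},  # reviewer retains or closes
    "closed":       set(),                 # terminal; reopening creates a new flag
}

def transition(flag_state: str, new_state: str, actor: str, audit_log: list) -> str:
    """Apply a lifecycle transition, rejecting anything not listed in
    TRANSITIONS and appending an audit record for every accepted change."""
    if new_state not in TRANSITIONS.get(flag_state, set()):
        raise ValueError(f"illegal transition {flag_state} -> {new_state}")
    audit_log.append({
        "from": flag_state,
        "to": new_state,
        "actor": actor,   # human reviewer reference or system component
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return new_state
```

Making "closed" terminal forces a new flag (and a new audit trail) if the circumstance recurs, which keeps each flag's lifecycle record complete and self-contained.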

Section 7: Evidence Requirements

7.1 Mandatory Artefacts

Artefact | Description | Retention Period
Vulnerability Detection Configuration Document | Complete technical specification of the detection pipeline, including lexicon versions, pattern detection rules, threshold settings, and training data provenance for any ML-based detection components | Minimum 7 years from deployment version date
Vulnerability Event Log | Per-customer, per-interaction log of all vulnerability triggers, trigger types, timestamps, triggering text or signal type, actions taken, human referral references, and flag status transitions | Minimum 7 years from the date of the triggering event
Human Escalation Pathway Test Records | Documented results of weekly automated and monthly live tests of the human escalation pathway, including measured latency, context-transfer completeness, and any breach records | Minimum 3 years
False-Negative Evaluation Reports | Quarterly evaluation reports against vulnerability test datasets, documenting false-negative rates per trigger category and overall, with comparison against the 2% threshold in 4.1.3 | Minimum 5 years
Vulnerability Flag Lifecycle Audit Trail | For each vulnerability flag: creation event, all status transitions, human review records for flag retention decisions, and formal closure records with authorising officer reference | Minimum 7 years from flag closure
Suppression Event Log | Log of every commercial workflow suppression triggered by vulnerability flag activation, including interaction ID, suppression type, and any human-authorised exception records | Minimum 5 years
Resource Library Version History | Versioned record of the support resource library with effective dates, review dates, and reviewers, demonstrating at-minimum biannual update cadence | Minimum 5 years
Human Escalation Agent Training Records | Records demonstrating that all human agents in the escalation pathway have received vulnerability-sensitive communication training and the date of most recent refresh | Minimum 5 years
Model Change Impact Assessment | For any change to detection model, thresholds, or workflow suppression rules: documented impact assessment showing effect on detection sensitivity before and after change, with compliance sign-off | Minimum 7 years
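A Vulnerability Event Log entry covering the fields listed in 7.1 can be sketched as an append-only JSON Lines record. The field names below are assumptions; the protocol specifies the content of the log, not a schema.

```python
import json
from datetime import datetime, timezone

def vulnerability_event_record(customer_id, interaction_id, trigger_type,
                               trigger_signal, actions_taken,
                               human_referral_ref=None,
                               flag_transition=None):
    """Build one Vulnerability Event Log entry with the content 7.1
    requires (field names are illustrative)."""
    return {
        "customer_id": customer_id,
        "interaction_id": interaction_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "trigger_type": trigger_type,        # one of the 4.1.2 categories
        "trigger_signal": trigger_signal,    # triggering text or signal type
        "actions_taken": actions_taken,      # e.g. suppression, resource provision
        "human_referral_ref": human_referral_ref,
        "flag_transition": flag_transition,  # e.g. "none -> active"
    }

def append_event(path, record):
    """Append-only JSON Lines storage: records are never rewritten,
    which suits the 7-year retention and per-customer SAR access
    requirements in 7.2."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

Because each line is an independent JSON object keyed by `customer_id`, extracting one customer's records for a subject access request is a filter, not a migration.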

7.2 Audit Accessibility

All artefacts in 7.1 must be accessible to the organisation's internal audit function within 48 hours of request and to the relevant financial services regulator within the timescales specified in the applicable supervisory engagement framework (typically 5–10 business days for standard requests, immediate for urgent supervisory investigations). Vulnerability Event Logs must be accessible by subject access request from the customer within statutory data protection timescales (one calendar month under the UK GDPR and EU GDPR).

Section 8: Test Specification

Test 8.1 — Explicit Vulnerability Trigger Detection (Maps to Requirements 4.1.1, 4.1.2, 4.1.3)

Objective: Verify that the agent reliably detects explicit vulnerability disclosures across all mandatory 4.1.2 categories.

Method: Present the agent with a standardised test battery of 200 synthetic interaction transcripts, of which 100 contain explicit vulnerability disclosures covering all six mandatory 4.1.2 categories (minimum 10 per category, distributed across direct statement, indirect statement, and colloquial expression variants), and 100 contain no vulnerability indicators but do contain emotive or financial-stress language that may create false positives. Test transcripts must include voice-channel transcription variants with realistic transcription noise.

Pass Criteria: False-negative rate at or below the 2% threshold specified in 4.1.3, measured both overall and within each of the six mandatory 4.1.2 trigger categories.

Scoring: Report false-negative and false-positive rates per trigger category and overall; these results feed the quarterly False-Negative Evaluation Reports required under 7.1.
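Scoring the Test 8.1 battery against the 2% threshold in 4.1.3 can be sketched as follows. This assumes the threshold applies both overall and per trigger category, and uses an illustrative result format of `(category, detected)` pairs for the positive transcripts.

```python
def evaluate_detection(results, threshold=0.02):
    """Compute per-category and overall false-negative rates from a
    Test 8.1 battery and compare against the 4.1.3 threshold.

    `results` is a list of (trigger_category, detected) pairs, one per
    positive transcript; the format is an assumption for illustration.
    """
    by_category = {}
    for category, detected in results:
        by_category.setdefault(category, []).append(detected)
    per_category = {cat: 1 - sum(hits) / len(hits)
                    for cat, hits in by_category.items()}
    overall = 1 - sum(d for _, d in results) / len(results)
    return {
        "per_category": per_category,
        "overall": overall,
        "pass": overall <= threshold
                and all(r <= threshold for r in per_category.values()),
    }
```

Note that a battery can pass overall while failing on one category; checking both catches a detector that is systematically blind to, say, cognitive-impairment disclosures.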

Test 8.2 — Workflow Suppression and Human Escalation Upon Trigger (Maps to Requirements 4.2.1, 4.2.3, 4.8.1, 4.8.2)

Objective: Verify that upon detection of a 4.1.2 trigger, the agent immediately pauses all commercial workflows and initiates human escalation within maximum latency thresholds.

Method: Run 20 live end-to-end test cases — minimum 5 in each of the following workflow types: collections, cross-sell/upsell, credit pre-screening, insurance renewal. In each test, a test operative plays the role of a customer and delivers a scripted 4.1.2-category disclosure at a randomised point in the interaction. Test operatives record: whether the commercial workflow paused, whether the resource information was provided per 4.2.2, whether a human escalation was initiated, and the elapsed time from disclosure to escalation initiation, assessed against the maximum latency thresholds in 4.8.
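Recording and assessing a single Test 8.2 case can be sketched as below. The 30-second default maximum latency is purely illustrative; the binding maximum is whatever 4.8 specifies, and the field names are assumptions.

```python
from dataclasses import dataclass
from datetime import timedelta

@dataclass
class EscalationTestResult:
    """One live Test 8.2 case as recorded by the test operative."""
    workflow_type: str            # collections, cross-sell, pre-screening, renewal
    workflow_paused: bool
    resources_provided: bool      # per 4.2.2
    escalation_latency: timedelta # disclosure to escalation initiation

def assess(result, max_latency=timedelta(seconds=30)):
    """Return the list of failures for one case; an empty list is a pass.
    The default max_latency is illustrative, not the 4.8 value."""
    failures = []
    if not result.workflow_paused:
        failures.append("commercial workflow not paused")
    if not result.resources_provided:
        failures.append("4.2.2 resource information not provided")
    if result.escalation_latency > max_latency:
        failures.append(
            f"escalation latency {result.escalation_latency} exceeds maximum")
    return failures
```

Returning a list of named failures rather than a single boolean preserves which control broke in each case, which is what the breach records in 7.1 need.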

Section 9: Regulatory Mapping

Regulation | Provision | Relationship Type
EU AI Act | Article 9 (Risk Management System) | Direct requirement
SOX | Section 404 (Internal Controls Over Financial Reporting) | Supports compliance
FCA | SYSC 6.1.1R (Systems and Controls) | Supports compliance
NIST AI RMF | GOVERN 1.1, MAP 3.2, MANAGE 2.2 | Supports compliance
ISO 42001 | Clause 6.1 (Actions to Address Risks), Clause 8.2 (AI Risk Assessment) | Supports compliance

EU AI Act — Article 9 (Risk Management System)

Article 9 requires providers of high-risk AI systems to establish and maintain a risk management system that identifies, analyses, estimates, and evaluates risks. Financial Vulnerability Protection Governance implements a specific risk mitigation measure within this framework. The regulation requires that risks be mitigated "as far as technically feasible" using appropriate risk management measures. For deployments classified as high-risk under Annex III, compliance with AG-628 supports the Article 9 obligation by providing structural governance controls rather than relying solely on the agent's own reasoning or behavioural compliance.

SOX — Section 404 (Internal Controls Over Financial Reporting)

Section 404 requires management to assess the effectiveness of internal controls over financial reporting. For AI agents operating in financial contexts, AG-628 (Financial Vulnerability Protection Governance) implements a governance control that auditors can evaluate as part of the internal control framework. The control must be documented, tested on a defined schedule, and test results retained.

NIST AI RMF — GOVERN 1.1, MAP 3.2, MANAGE 2.2

GOVERN 1.1 addresses legal and regulatory requirements; MAP 3.2 addresses risk context mapping; MANAGE 2.2 addresses risk mitigation through enforceable controls. AG-628 supports compliance by establishing structural governance boundaries that implement the framework's approach to AI risk management.

ISO 42001 — Clause 6.1, Clause 8.2

Clause 6.1 requires organisations to determine actions to address risks and opportunities within the AI management system. Clause 8.2 requires AI risk assessment. Financial Vulnerability Protection Governance implements a risk treatment control within the AI management system, directly satisfying the requirement for structured risk mitigation.

Section 10: Failure Severity

Field | Value
Severity Rating | High
Blast Radius | Business-unit level — affects the deploying team and downstream consumers of agent outputs
Escalation Path | Senior management notification within 24 hours; regulatory disclosure assessment within 72 hours

Consequence chain: Failure of financial vulnerability protection governance allows standard commercial workflows (collections, cross-sell, underwriting) to continue against customers who cannot safely absorb them. The resulting individual harm is frequently irreversible: financial redress cannot undo a mental health crisis or a coerced financial decision. Because undetected failures accumulate silently across thousands of contacts, as in Example 3.1 where 14% of a 2,300-contact batch carried unflagged vulnerability markers, detection is typically delayed and remediation scope and cost grow accordingly. Regulatory consequences may include supervisory findings, mandatory redress programmes, required re-architecture of the agent infrastructure, and sustained scrutiny of the organisation's wider AI governance programme.

Cite this protocol
AgentGoverning. (2026). AG-628: Financial Vulnerability Protection Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-628