AG-619

Underwriting Fairness Governance

Insurance, Credit & Lending · AGS v2.1 · April 2026
EU AI Act · SOX · FCA · NIST AI RMF · ISO 42001

Section 2: Summary

This dimension governs the design, testing, deployment, and ongoing monitoring of AI agents that participate in or materially influence underwriting, pricing, credit-scoring, and claims-eligibility decisions, with the objective of preventing discriminatory outcomes against legally protected and proxy-defined groups. Underwriting fairness is a tier-one regulatory obligation across every major financial jurisdiction, and AI-driven automation dramatically amplifies the scale and speed at which biased decisions can propagate — a model serving ten million applicants per year can produce structurally discriminatory outcomes at industrial scale before a single human reviewer detects the pattern. Failure in this dimension manifests as adverse disparate impact on protected classes, regulatory enforcement actions with material fines, civil litigation, reputational collapse, and — most critically — systematic economic exclusion of the consumers the agent was designed to serve.

Section 3: Examples

Example 3.1 — Proxy Discrimination via Postal-Code Clustering in Motor Insurance Pricing

A motor insurer deploys an AI pricing agent that synthesises 340 input features to produce a base premium. The feature set does not include race, ethnicity, or religion explicitly. However, two features — granular postal-code cluster (72 distinct bins) and primary-language preference extracted from prior customer-service interactions — together act as near-perfect proxies for racial composition in metropolitan areas. In a 24-month production period, applicants in the highest-minority postal clusters pay premiums 23% higher than actuarially equivalent applicants in majority-white clusters, after controlling for all legitimate risk factors including claims history, vehicle type, annual mileage, and driver age. Total premium overcharge across the affected population is estimated at £18.4 million. When a regulatory examination identifies the pattern through a disparate-impact audit, the insurer faces a £9.1 million fine, mandatory premium rebates, a requirement to rebuild the pricing model under regulatory supervision, and a two-year enhanced reporting obligation. The underlying failure is the absence of any proxy-feature screening protocol during model development and the absence of post-deployment disparate-impact monitoring.

Example 3.2 — Credit Score Compression Amplifying Racial Wealth Gap in Mortgage Underwriting

A mortgage lender integrates an AI agent that supplements traditional credit-scoring with a proprietary "financial health index" trained on transaction-level banking data. The index is trained on a historical dataset spanning 2010–2020, a period during which the 2008–2012 foreclosure crisis had disproportionately depleted assets and disrupted credit histories for Black and Hispanic households at rates 2.3× higher than white households — not because of inherent risk differences but because of geographically concentrated predatory lending during the preceding decade. The AI agent faithfully learns the correlations in the training data. On a test cohort of 8,200 applications, the agent approves 71% of white applicants, 44% of Black applicants, and 47% of Hispanic applicants within the same income-to-debt ratio band. The four-fifths rule threshold is 0.8; the observed ratios are 0.62 (Black vs. white) and 0.66 (Hispanic vs. white), both well below the threshold and statistically significant at this sample size, establishing adverse impact. The lender does not conduct pre-deployment disparate-impact testing because it classifies the financial health index as a "supplementary analytics tool" rather than a model subject to model-risk governance. A plaintiff class-action under the Equal Credit Opportunity Act (ECOA) and Fair Housing Act (FHA) proceeds to a $34 million settlement. The failure chain is: biased historical data → no pre-deployment testing → no adverse-impact threshold → no monitoring → unchecked production deployment → class-action exposure.

Example 3.3 — Cross-Border Proxy Misfire in Embedded Insurance for Consumer Lending

A fintech platform operating across Germany, Poland, and the Czech Republic deploys an embedded insurance underwriting agent that assigns credit-life insurance premiums at the point of consumer loan origination. The agent uses device metadata — handset model, operating system version, and browser language setting — as features correlated with repayment behaviour in its training population. In Germany the feature performs as intended, with no detectable group disparities. In Poland and the Czech Republic, however, device ownership patterns correlate strongly with Roma ethnicity — a legally protected characteristic under EU anti-discrimination law, with the system itself falling under the EU AI Act's high-risk provisions for credit and insurance. Applicants from communities with a lower average device tier exhibit premium loadings of up to 31% above the mean, with no corresponding claims-data justification. The system has no cross-border fairness monitoring; fairness testing was conducted only on the German training population. A regulatory referral from the Czech National Bank (ČNB), the national financial market supervisor, triggers a pan-EU investigation under the AI Act's high-risk system provisions. The company is required to suspend the agent in all three jurisdictions, conduct full retrospective impact assessments, remediate affected customers, and implement continuous cross-border demographic-parity monitoring before reinstatement. Estimated remediation cost: €6.2 million; lost revenue during suspension: €3.8 million over seven months.

Section 4: Requirement Statement

4.0 Scope

This dimension applies to any AI agent, model, scoring system, or automated decision component that makes, recommends, or materially influences underwriting, pricing, credit-scoring, or claims-eligibility decisions affecting consumers.

The scope extends to upstream feature-engineering pipelines, pre-trained foundation models used as embeddings, and third-party data enrichment feeds when those components materially influence outputs. Pure rules-based systems with no learned parameters are out of scope; hybrid systems where learned components contribute to final decisions are in scope.

The dimension applies across all deployment jurisdictions in which the Primary Profiles operate and does not permit jurisdiction-selective fairness compliance.

4.1 Protected Characteristic Register

4.1.1 The deploying organisation MUST maintain a documented Protected Characteristic Register (PCR) that enumerates, for each jurisdiction of operation, every legally protected characteristic — including but not limited to race, colour, ethnicity, national origin, sex, gender identity, sexual orientation, religion, disability, age, familial status, marital status, and pregnancy — alongside any locally recognised additional grounds.

4.1.2 The PCR MUST be reviewed and updated at minimum every twelve months and upon any material change in applicable law or regulatory guidance.

4.1.3 The agent's feature set MUST be screened against the PCR at design time, and all features with a documented Pearson correlation |r| ≥ 0.4 or a mutual information score ≥ 0.25 with any protected characteristic in the PCR MUST be classified as high-proxy-risk features requiring explicit justification for inclusion.
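
A non-normative sketch of this screening step, assuming candidate features in a pandas DataFrame and a protected-characteristic column held only in a segregated testing environment; the thresholds are those of 4.1.3, and all function and column names are illustrative:

```python
import pandas as pd
from scipy.stats import pearsonr
from sklearn.feature_selection import mutual_info_classif

R_THRESHOLD = 0.4    # |Pearson r| trigger from 4.1.3
MI_THRESHOLD = 0.25  # mutual information trigger from 4.1.3

def screen_proxy_risk(features: pd.DataFrame, protected: pd.Series) -> pd.DataFrame:
    """Classify each candidate feature's proxy risk against one protected
    characteristic. The protected series exists only in a segregated testing
    environment and is never a model input. Pearson r is most meaningful
    when the characteristic is binary-coded."""
    codes = protected.astype("category").cat.codes
    mi_scores = mutual_info_classif(features, codes, random_state=0)
    rows = []
    for col, mi in zip(features.columns, mi_scores):
        r, _ = pearsonr(features[col], codes)
        rows.append({"feature": col, "pearson_r": r, "mutual_info": mi,
                     "high_proxy_risk": abs(r) >= R_THRESHOLD or mi >= MI_THRESHOLD})
    return pd.DataFrame(rows)
```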

4.1.4 High-proxy-risk features included in a production model MUST be accompanied by a written business necessity and least-discriminatory-alternative (LDA) analysis, approved by a designated Fairness Review Authority (FRA) that is independent of the commercial underwriting function.

4.2 Pre-Deployment Disparate Impact Testing

4.2.1 Before any agent is deployed into a live underwriting, pricing, or credit-decision pipeline, the deploying organisation MUST conduct a pre-deployment disparate impact assessment (DIA) covering every protected characteristic in the PCR for the target jurisdiction.

4.2.2 The DIA MUST compute, at minimum, the following metrics for each protected class pair (protected group vs. reference group): approval-rate ratio (four-fifths/80% rule), adverse action rate difference, mean score difference with 95% confidence intervals, and — for pricing outputs — mean premium/rate ratio and coefficient of variation across demographic segments.
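
A non-normative sketch of the per-group computation, assuming binary approve/decline outcomes, a continuous model score, and a normal-approximation confidence interval; names are illustrative:

```python
import numpy as np

def dia_metrics(approved_p, approved_r, scores_p, scores_r):
    """Core 4.2.2 metrics for one protected group vs. the reference group.
    approved_* are boolean arrays of approve/decline outcomes;
    scores_* are the underlying model scores for the same applicants."""
    rate_p, rate_r = np.mean(approved_p), np.mean(approved_r)
    ratio = rate_p / rate_r                       # four-fifths / 80% rule
    adverse_diff = (1 - rate_p) - (1 - rate_r)    # adverse action rate difference
    diff = np.mean(scores_p) - np.mean(scores_r)  # mean score difference
    se = np.sqrt(np.var(scores_p, ddof=1) / len(scores_p)
                 + np.var(scores_r, ddof=1) / len(scores_r))
    ci95 = (diff - 1.96 * se, diff + 1.96 * se)   # normal approximation
    return {"approval_rate_ratio": ratio,
            "adverse_action_rate_diff": adverse_diff,
            "mean_score_diff": diff,
            "score_diff_ci95": ci95}
```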

4.2.3 Where the four-fifths rule threshold is breached for any protected group, deployment MUST be blocked unless the FRA grants an explicit written exception supported by: (a) actuarial or credit-risk evidence that the disparity reflects legitimate risk differentiation; (b) documented failure of all identified mitigation strategies to reduce disparity below the threshold while maintaining acceptable model performance; and (c) a time-bounded remediation plan.

4.2.4 The pre-deployment DIA report MUST be retained and made available to regulators upon request. Retention period: minimum seven years or the duration of the product life cycle, whichever is longer.

4.3 Proxy Feature Elimination and Causal Pathway Analysis

4.3.1 The deploying organisation MUST conduct a causal pathway analysis for each model feature that is not itself directly observable actuarial or credit-risk data, documenting the hypothesised causal chain from feature to risk outcome.

4.3.2 Features for which no plausible actuarial or credit-risk causal chain can be documented MUST be removed from the production feature set regardless of predictive lift.

4.3.3 The agent MUST NOT use postal code, census tract, or geographic unit as a direct feature when that geographic unit exhibits a demographic composition correlation ≥ 0.5 with any protected characteristic in the PCR, unless actuarial necessity is demonstrated through a geographic risk study conducted by a qualified actuary and reviewed by the FRA.

4.3.4 The deploying organisation SHOULD apply algorithmic debiasing techniques — including but not limited to adversarial debiasing, reweighting, and calibrated threshold adjustment — and MUST document the outcome of each technique evaluated, including any performance–fairness trade-off accepted.

4.4 Post-Deployment Fairness Monitoring

4.4.1 Following deployment, the deploying organisation MUST implement continuous fairness monitoring with a minimum monitoring cadence of monthly for high-volume agents (≥ 10,000 decisions per month) and quarterly for lower-volume agents.

4.4.2 The monitoring programme MUST track the same metrics required in the pre-deployment DIA (Section 4.2.2) and MUST trigger an automated alert when any metric crosses a defined threshold. The alert threshold for the approval-rate ratio MUST be at least as protective as the four-fifths value (an alert MUST fire whenever the ratio falls below 0.8); organisations MAY set more stringent internal thresholds.
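
A non-normative sketch of the alert trigger, assuming approval-rate ratios computed per monitoring cycle; the alert floor honours the four-fifths minimum and may only be tightened:

```python
FOUR_FIFTHS = 0.8

def fairness_alerts(ratio_by_group: dict[str, float],
                    internal_floor: float = FOUR_FIFTHS) -> list[str]:
    """Flag any protected group whose approval-rate ratio breaches the alert
    floor. The floor may be raised above 0.8 (more stringent) but never
    lowered below it, per 4.4.2."""
    floor = max(internal_floor, FOUR_FIFTHS)
    return [f"ALERT: {group} approval-rate ratio {ratio:.2f} below {floor:.2f}"
            for group, ratio in ratio_by_group.items() if ratio < floor]

print(fairness_alerts({"group_a": 1.00, "group_b": 0.74}))
```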

4.4.3 Upon threshold breach, the deploying organisation MUST initiate a root-cause investigation within five business days and MUST either: (a) implement a remediation within 30 calendar days; or (b) escalate to the FRA with a documented justification if remediation within 30 days is not technically feasible.

4.4.4 The agent MUST be subject to a full re-validation — equivalent to the pre-deployment DIA — at least every 24 months and whenever any of the following occur: a material change in the input feature set; retraining on a new or significantly extended dataset; a change in the underwriting product or coverage structure; or a new deployment jurisdiction.

4.4.5 Monitoring data MUST be retained for a minimum of seven years and MUST be structured to support regulatory examination queries.

4.5 Adverse Action Explainability

4.5.1 When an agent produces an adverse underwriting or credit decision — decline, referral to substandard terms, rate increase above standard band, or coverage exclusion — the deploying organisation MUST be able to produce, within five business days of a consumer request, a plain-language adverse action notice that: (a) states the principal factors contributing to the decision; (b) does not reference any protected characteristic; and (c) is consistent with the internal model explanation.

4.5.2 The adverse action notice MUST be generated from the same explanation mechanism used for internal audit purposes; shadow explanations constructed solely for consumer disclosure that diverge from internal model attributions are prohibited.

4.5.3 The deploying organisation MUST maintain an explanation audit trail linking each adverse action notice to the specific model version, input feature values (appropriately anonymised), and explanation output that generated it.

4.6 Human Oversight and Escalation

4.6.1 The deploying organisation MUST designate a Fairness Review Authority (FRA) composed of individuals with independence from commercial underwriting targets, including at minimum one qualified actuary or credit-risk professional and one person with documented expertise in fair-lending or anti-discrimination law.

4.6.2 The agent MUST be configured to route borderline decisions — defined as decisions within a configurable confidence band around the decision threshold — to human review, and the reviewing human MUST have access to the explanation output before making a final determination.
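
A non-normative routing sketch, assuming a single score threshold; the band width is a hypothetical configuration value:

```python
from dataclasses import dataclass

@dataclass
class Decision:
    outcome: str          # "approve", "decline", or "human_review"
    score: float
    explanation_ref: str  # link to the stored explanation output (4.6.2)

def route_decision(score: float, threshold: float = 0.5,
                   band: float = 0.05, explanation_ref: str = "") -> Decision:
    """Route decisions inside the configurable confidence band to human
    review; the reviewer receives the explanation reference but, per 4.6.3,
    no protected-characteristic data."""
    if abs(score - threshold) < band:
        return Decision("human_review", score, explanation_ref)
    outcome = "approve" if score >= threshold else "decline"
    return Decision(outcome, score, explanation_ref)
```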

4.6.3 Human reviewers MUST NOT be provided with information about a consumer's protected characteristics when conducting a fairness-sensitive review, and the review workflow MUST technically enforce this constraint where feasible.

4.6.4 The deploying organisation SHOULD establish a consumer redress pathway through which applicants who believe they have been unfairly treated may request a human re-review, and MUST document the disposition of all such requests.

4.7 Data Governance for Training and Calibration Data

4.7.1 The deploying organisation MUST document the demographic composition of all datasets used to train, validate, and calibrate the agent, including identification of any known historical periods or geographic regions associated with discriminatory lending, underwriting, or claims practices.

4.7.2 Where training data contains historical periods associated with documented discriminatory practices, the deploying organisation MUST apply data remediation techniques — including temporal windowing, re-sampling, or synthetic augmentation — and MUST document the rationale for the approach selected.

4.7.3 Third-party data feeds and enrichment services used as model inputs MUST be subject to the same proxy-screening and disparate-impact assessment requirements as internally generated features. Vendor contractual arrangements MUST include audit rights permitting the deploying organisation to assess the fairness properties of external data.

4.7.4 Data used for model training MUST be version-controlled, with each training run linked to the specific dataset version used, to enable retrospective investigation of fairness properties.
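
One way to implement this linkage, sketched non-normatively below, is to bind a content hash of the training dataset to each run record; the sketch assumes the training data is a single file, and field names are illustrative:

```python
import datetime
import hashlib
import json

def dataset_fingerprint(path: str) -> str:
    """SHA-256 of the dataset file, streamed to handle large files."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def training_run_record(dataset_path: str, model_version: str) -> str:
    """Immutable linkage of a model version to its dataset version (4.7.4),
    suitable for storage in an append-only audit log."""
    return json.dumps({
        "model_version": model_version,
        "dataset_sha256": dataset_fingerprint(dataset_path),
        "trained_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    })
```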

4.8 Cross-Border and Multi-Jurisdiction Operation

4.8.1 Agents deployed across multiple jurisdictions MUST conduct separate pre-deployment DIAs for each jurisdiction, using training and validation data that is representative of the applicant population in that jurisdiction.

4.8.2 A fairness configuration that is compliant in one jurisdiction MUST NOT be assumed to be compliant in another; cross-jurisdiction transferability MUST be explicitly tested and documented.

4.8.3 The deploying organisation MUST maintain a jurisdiction-specific fairness compliance matrix that maps each jurisdiction's protected characteristics, applicable thresholds, and monitoring obligations, and MUST update this matrix within 60 days of any material change in applicable law or regulatory guidance in any active jurisdiction.

4.8.4 Where jurisdictional legal requirements conflict — for example, where one jurisdiction requires collection of demographic data for monitoring purposes and another prohibits it — the deploying organisation MUST document the conflict, adopt the more protective standard where technically feasible, and seek regulatory guidance where it is not.

4.9 Record-Keeping and Regulatory Disclosure

4.9.1 The deploying organisation MUST maintain a complete model card or equivalent structured documentation for each production agent, updated at each retraining cycle, that includes: model purpose, training data provenance, feature list with proxy-risk classification, pre-deployment DIA results, known limitations, and monitoring programme summary.

4.9.2 All fairness-related documentation — PCR, DIA reports, FRA decisions, monitoring records, remediation plans, and adverse action audit trails — MUST be retained for a minimum of seven years in a format that supports export and regulatory examination.

4.9.3 Upon formal regulatory request, the deploying organisation MUST be able to produce a complete fairness evidence package within 20 business days; the agent's technical infrastructure MUST support this obligation.

4.9.4 The deploying organisation SHOULD publish an annual Algorithmic Fairness Summary accessible to consumers, describing in non-technical language how fairness is assessed and maintained in its underwriting and pricing processes.

Section 5: Rationale

5.1 Why Preventive Control Is Necessary

Underwriting and pricing decisions are binary, consequential, and — in AI-driven pipelines — executed at a scale and speed that makes after-the-fact correction economically and socially inadequate. A discriminatory pricing model operating at 50,000 decisions per month touches approximately 1,650 consumers per day, so every day that passes before post-deployment monitoring detects a problem enlarges the harmed population. Preventive controls — pre-deployment testing, proxy screening, causal pathway analysis — intercept structural bias before it enters production. Detective controls alone, however well designed, are insufficient in high-velocity pipelines; they can reduce but not eliminate harm to consumers who received biased decisions before detection.

5.2 Structural vs Behavioural Enforcement

Traditional fair-lending compliance assumes that discriminatory intent can be identified and prohibited — a behavioural model. AI underwriting bias is predominantly structural: it emerges from data distributions, feature correlations, and optimisation objectives that encode historical inequities without any discriminatory intent by the deploying organisation. Structural bias requires structural controls: feature engineering governance, disparate-impact thresholds enforced at the pipeline level, and monitoring architectures that track outcomes rather than inputs. Behavioural controls — training, attestation, policy — are necessary but not sufficient. This dimension therefore mandates structural interventions (PCR screening, DIA gating, proxy elimination) as the primary mechanism, supported by behavioural controls at the governance layer (FRA oversight, human escalation, consumer redress).

5.3 The Performance–Fairness Trade-Off

A persistent objection to fairness constraints in underwriting AI is that they degrade predictive accuracy, increasing credit risk or mis-pricing actuarial risk. This objection conflates two distinct phenomena: (a) genuine risk differentiation that happens to correlate with protected characteristics because of structural economic inequality; and (b) spurious predictive lift generated by proxy features that capture group membership rather than individual risk. Controls in this dimension are designed to eliminate (b) while preserving (a). The Least Discriminatory Alternative analysis required under Section 4.1.4 operationalises this distinction by requiring the deploying organisation to demonstrate that no equally accurate model with lower disparate impact exists before accepting a feature with high proxy risk.

5.4 Why Enhanced Tier Is Appropriate

The Enhanced tier designation reflects that underwriting and pricing decisions are: (i) high-stakes and individually consequential; (ii) subject to specific regulatory obligations in every major jurisdiction; (iii) executed at industrial scale, amplifying any bias; and (iv) increasingly automated in ways that reduce the frequency of human review. The combination of individual harm magnitude, population scale, regulatory exposure, and reduced human oversight justifies controls beyond those applicable to lower-risk AI deployments.

Section 6: Implementation Guidance

Fairness-by-Design Feature Engineering. Implement proxy-risk screening as a formal gate in the model development lifecycle, before model training begins. Compute mutual information scores between all candidate features and every protected characteristic in the PCR for the target jurisdiction. Document results in a feature-level fairness dossier. Treat high-proxy-risk features as requiring affirmative justification rather than default inclusion.

Stratified Train/Validation/Test Splits. Ensure that model validation datasets are stratified to achieve minimum representation thresholds for all protected groups present in the applicant population. As a rule of thumb, at least 500 observations per protected group in the test set are advisable to detect disparate impact at the four-fifths threshold with 80% statistical power.
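
The rule of thumb can be checked with a standard two-proportion power calculation; the sketch below uses statsmodels and an assumed 25% reference-group approval rate (the required sample size depends strongly on base rates):

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Assumed reference-group approval rate; the protected-group rate at the
# four-fifths boundary is 0.8 x that rate.
p_ref = 0.25
p_protected = 0.8 * p_ref  # 0.20, exactly at the four-fifths threshold

effect = proportion_effectsize(p_ref, p_protected)  # Cohen's h
n_per_group = NormalIndPower().solve_power(
    effect_size=effect, alpha=0.05, power=0.80, ratio=1.0,
    alternative="two-sided")
print(round(n_per_group))  # ~546 per group at these assumed rates
```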

Threshold Calibration by Group. In binary classification underwriting agents, consider post-processing threshold optimisation that equalises false-positive or false-negative rates across groups, depending on which error type causes greater consumer harm. Document the metric selected and the rationale. Implement threshold calibration as a configurable parameter, separate from the model weights, to allow adjustment in response to monitoring signals without full model retraining.
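
A non-normative post-processing sketch that searches per-group thresholds to approximately equalise false-negative rates on held-out data; the metric choice, search grid, and names are all illustrative:

```python
import numpy as np

def fnr(scores, labels, threshold):
    """False-negative rate: creditworthy applicants (label 1) declined."""
    declined = scores < threshold
    positives = labels == 1
    return np.mean(declined[positives]) if positives.any() else 0.0

def calibrate_group_thresholds(scores_by_group, labels_by_group,
                               reference_group, base_threshold=0.5):
    """Pick each group's threshold so its FNR best matches the reference
    group's FNR at the base threshold. Thresholds live outside the model
    weights, so monitoring signals can drive adjustment without retraining."""
    target = fnr(scores_by_group[reference_group],
                 labels_by_group[reference_group], base_threshold)
    grid = np.linspace(0.05, 0.95, 181)
    thresholds = {reference_group: base_threshold}
    for group in scores_by_group:
        if group == reference_group:
            continue
        gaps = [abs(fnr(scores_by_group[group],
                        labels_by_group[group], t) - target) for t in grid]
        thresholds[group] = float(grid[int(np.argmin(gaps))])
    return thresholds
```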

Layered Monitoring Architecture. Implement fairness monitoring at three layers: (i) input-data monitoring for distributional shift in protected-group representation; (ii) score-level monitoring for mean score differences across groups; and (iii) outcome-level monitoring for approval-rate ratios and premium distribution. Input-level shifts may precede outcome-level disparities by weeks; early detection at the input layer allows intervention before consumer harm accumulates.
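
A non-normative sketch of the input-layer check using the population stability index over protected-group shares; the 0.1/0.25 bands are common industry heuristics, not AGS-mandated values:

```python
import numpy as np

def population_stability_index(expected_share, observed_share, eps=1e-6):
    """PSI between a baseline and a current distribution of group shares."""
    e = np.clip(np.asarray(expected_share, dtype=float), eps, None)
    o = np.clip(np.asarray(observed_share, dtype=float), eps, None)
    return float(np.sum((o - e) * np.log(o / e)))

# Baseline vs. current-month applicant shares per protected group (illustrative)
baseline = [0.62, 0.18, 0.12, 0.08]
current = [0.55, 0.20, 0.15, 0.10]
psi = population_stability_index(baseline, current)
# Common heuristic bands: < 0.1 stable, 0.1-0.25 investigate, > 0.25 material
status = "stable" if psi < 0.1 else "investigate" if psi < 0.25 else "material shift"
print(psi, status)
```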

Differential Privacy for Demographic Data. Where demographic data must be collected for monitoring purposes but regulatory or privacy constraints limit its use, implement differential privacy techniques to enable aggregate fairness statistics while protecting individual-level demographic information.
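
A minimal sketch of the Laplace mechanism applied to group-level counts, with an assumed epsilon; a production deployment would use a vetted differential-privacy library and a managed privacy budget:

```python
import numpy as np

def noisy_count(true_count: float, epsilon: float, sensitivity: float = 1.0,
                rng: np.random.Generator | None = None) -> float:
    """Laplace mechanism: adding or removing one applicant changes a count
    by at most 1, so the noise scale is sensitivity / epsilon."""
    rng = rng or np.random.default_rng()
    return true_count + rng.laplace(0.0, sensitivity / epsilon)

# Four-fifths ratio computed from noise-protected counts (epsilon assumed)
eps = 1.0
approved_a, applied_a = noisy_count(412, eps), noisy_count(590, eps)
approved_b, applied_b = noisy_count(355, eps), noisy_count(610, eps)
ratio = (approved_b / applied_b) / (approved_a / applied_a)
print(round(ratio, 3))  # aggregate fairness statistic without raw counts
```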

FRA Governance Cadence. Establish a quarterly FRA review meeting at which fairness monitoring reports, open remediation items, and proposed model changes are reviewed. Maintain formal minutes and decision records. Integrate FRA sign-off into the model deployment checklist as a hard gate that cannot be bypassed by commercial teams.

Retrospective Impact Assessment Protocol. Define a documented protocol for retrospective impact assessments triggered by monitoring alerts, consumer complaints, or regulatory inquiries. The protocol should specify the look-back period, statistical methodology, remediation trigger levels, and consumer notification obligations.

6.2 Explicit Anti-Patterns

Anti-Pattern: Shadow Explanations. Constructing simplified or sanitised explanations for consumer adverse action notices that do not reflect the actual model attribution — colloquially known as "explanation laundering" — is both a regulatory violation and a governance failure. Adverse action notices must be derived mechanically from the same SHAP, LIME, or equivalent explanation mechanism used for internal audit. Any divergence between internal and external explanations invalidates the explainability control and exposes the organisation to significant regulatory risk.
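
A non-normative sketch of the compliant pattern: principal factors are derived mechanically from the same per-feature attribution output (for example, SHAP values computed upstream) that internal audit consumes, so the two can never diverge; the reason-code mapping and sign convention are illustrative:

```python
# Plain-language reason wording approved for consumer notices; illustrative
REASON_TEXT = {
    "debt_to_income": "level of existing debt relative to income",
    "credit_utilisation": "proportion of available credit currently in use",
    "delinquency_count": "number of recent late payments",
}

def adverse_action_factors(attributions: dict[str, float],
                           top_n: int = 4) -> list[str]:
    """Principal factors behind an adverse decision, taken mechanically from
    the attribution output used for internal audit (4.5.2). Sign convention
    assumed here: negative attribution pushes toward decline."""
    ranked = sorted((f for f, v in attributions.items() if v < 0),
                    key=lambda f: attributions[f])
    return [REASON_TEXT.get(f, f"[no approved wording for '{f}': escalate]")
            for f in ranked[:top_n]]

print(adverse_action_factors({
    "debt_to_income": -0.42, "credit_utilisation": -0.31,
    "tenure_months": 0.10, "delinquency_count": -0.08}))
```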

Anti-Pattern: Jurisdiction-Level Scope Limitation. Conducting the pre-deployment DIA only for the primary development jurisdiction and then deploying the model globally without re-testing is a recurring failure mode, illustrated by Example 3.3. Fairness properties of a model are data-distribution-specific; a model that is fair in one demographic context can be severely discriminatory in another. Multi-jurisdiction deployment requires multi-jurisdiction testing.

Anti-Pattern: Treating "No Explicit Protected Feature" as Compliance. Removing explicit protected characteristics from the feature set does not constitute compliance with anti-discrimination obligations. Proxy discrimination through correlated features is well-documented and legally equivalent to direct discrimination in most jurisdictions. The absence of race from a feature set is not evidence of fairness; it is merely the absence of one type of evidence of unfairness.

Anti-Pattern: Single-Threshold Fairness Governance. Relying exclusively on the four-fifths rule as the sole fairness criterion is insufficient. The four-fifths rule is a practical rule of thumb for adverse action rates; it does not capture pricing disparities, calibration differences, or disparities that emerge only at specific score thresholds. A multi-metric approach is required.

Anti-Pattern: One-Time Pre-Deployment Testing Without Ongoing Monitoring. Conducting a thorough pre-deployment DIA and then treating fairness as a closed issue until the next scheduled model retraining is a dangerous gap. Population composition, economic conditions, and product structures change over time, and a model that is fair at deployment can exhibit disparate impact twelve months later due to distributional shift. Fairness is a continuous property, not a one-time certification.

Anti-Pattern: Delegating Fairness Governance to the Modelling Team. Placing the FRA function within the team responsible for commercial model performance creates a structural conflict of interest. The FRA must have institutional independence — reporting to a risk, compliance, or legal function — and must have authority to block deployment, not merely to advise.

6.3 Industry-Specific Considerations

Insurance. Actuarial rating factors have a long-established legal framework distinguishing permitted risk classification from prohibited discrimination. The key distinction is that factors must be causally and statistically linked to the insured risk, not merely correlated with protected characteristics. AI agents must map onto this framework explicitly; the causal pathway analysis in Section 4.3.1 operationalises it.

Mortgage and Consumer Credit. ECOA in the United States, the Consumer Credit Directive in the EU, and equivalent legislation in other jurisdictions impose specific adverse action notification obligations and prohibit the use of protected characteristics in creditworthiness assessments. AI agents in this space must be designed with those obligations built into the decision pipeline, not retrofitted as post-hoc compliance.

Embedded Insurance and Lending. Products distributed through third-party platforms (e-commerce checkout insurance, point-of-sale financing) frequently use behavioural and device-based signals as underwriting features. These signals carry high proxy risk in populations where technology access is unequally distributed across protected groups. Enhanced proxy screening is particularly important in this segment.

6.4 Maturity Model

Maturity Level | Characteristics
Level 1 — Initial | No formal fairness testing; protected characteristics excluded from features; no monitoring
Level 2 — Developing | Pre-deployment DIA conducted; four-fifths rule applied; no continuous monitoring; no FRA
Level 3 — Defined | Full PCR; pre-deployment DIA with multi-metric analysis; FRA established; quarterly monitoring
Level 4 — Managed | Continuous automated monitoring; cross-border testing; threshold calibration; LDA analysis documented
Level 5 — Optimising | Causal pathway analysis; fairness-by-design feature engineering; public algorithmic fairness reporting; real-time monitoring with automated remediation triggers

Organisations operating at Tier Enhanced under AGS v2.1 are expected to meet Level 4 at minimum within 12 months of initial AG-619 conformance assessment and to target Level 5 within 24 months.

Section 7: Evidence Requirements

7.1 Mandatory Artefacts

Artefact | Description | Retention Period
Protected Characteristic Register (PCR) | Current and historical versions with version dates and approval signatures | 7 years
Feature Proxy-Risk Assessment | Per-feature mutual information scores, correlation matrices, high-proxy-risk classification decisions | 7 years per model version
Causal Pathway Documentation | Written causal chain for each non-directly-observable feature | 7 years per model version
Least Discriminatory Alternative Analysis | Documentation of alternatives considered and rejected, with performance–fairness trade-off data | 7 years per model version
Pre-Deployment Disparate Impact Assessment Report | Full statistical output including approval-rate ratios, confidence intervals, pricing distribution analysis, FRA sign-off | 7 years
FRA Meeting Minutes and Decisions | Records of all FRA reviews, including exception approvals and remediation decisions | 7 years
Monitoring Reports | Monthly or quarterly fairness monitoring outputs with threshold breach records | 7 years
Root-Cause Investigation Reports | Documentation of investigations triggered by monitoring alerts | 7 years
Remediation Plans and Outcomes | Time-bounded remediation commitments and evidence of completion | 7 years
Adverse Action Audit Trail | Linkage of each adverse action notice to model version, feature values, and explanation output | 7 years or duration of consumer relationship, whichever is longer
Consumer Redress Records | Log of fairness-related consumer requests and dispositions | 7 years
Jurisdiction Fairness Compliance Matrix | Current and historical versions by jurisdiction | Duration of market presence plus 7 years
Model Card / Model Documentation | Structured documentation per Section 4.9.1 | Duration of model production use plus 7 years
Training Data Version Records | Dataset version identifiers linked to each model training run | 7 years
Third-Party Data Fairness Assessments | Proxy-screening and DIA results for external data feeds | 7 years

7.2 Regulatory Examination Package

The deploying organisation must be capable of assembling and delivering the complete set of mandatory artefacts listed in Section 7.1 within 20 business days of a formal regulatory request. A documented evidence assembly procedure, including named responsible roles and system locations for each artefact class, is required.

7.3 Internal Audit Access

All artefacts in Section 7.1 must be accessible to the internal audit function without requiring approval from the commercial underwriting function. Access controls must be documented.

Section 8: Test Specification

Each test maps to one or more MUST requirements in Section 4. Conformance scores are assigned on a 0–3 scale: 0 = Non-Conformant (requirement clearly not met); 1 = Partially Conformant (requirement partially met with material gaps); 2 = Substantially Conformant (requirement met with minor gaps); 3 = Fully Conformant (requirement fully met with documented evidence).

Test 8.1 — Protected Characteristic Register Completeness and Currency

Maps to: 4.1.1, 4.1.2

Objective: Verify that a complete, current PCR exists for each deployment jurisdiction.

Procedure:

  1. Request the current PCR for each active deployment jurisdiction.
  2. Cross-reference against legal research for each jurisdiction's protected characteristics under applicable fair-lending, insurance discrimination, and anti-discrimination law.
  3. Verify that the PCR was reviewed within the preceding 12 months, with a signed approval record.
  4. Verify that the PCR reflects any regulatory or legislative changes in the preceding 12 months.

Pass Criteria: A PCR exists for every active deployment jurisdiction; each PCR was reviewed within the preceding 12 months with a signed approval record; all known protected characteristics for each jurisdiction are included; and the PCR reflects regulatory or legislative changes from the preceding 12 months.

Fail Indicators: PCR absent for any jurisdiction; PCR last reviewed more than 12 months ago; known protected characteristics omitted; no approval record.

Scoring:

Score | Condition
3 | PCR complete, current, and approved for all active jurisdictions
2 | PCR complete and current for primary jurisdiction; minor gaps in secondary jurisdictions
1 | PCR exists but is materially incomplete or more than 12 months old
0 | No PCR; or PCR absent for primary deployment jurisdiction

Test 8.2 — Pre-Deployment Disparate Impact Assessment Adequacy

Maps to: 4.2.1, 4.2.2, 4.2.3, 4.2.4

Objective: Verify that a complete pre-deployment DIA was conducted before production deployment, using the required metrics, and that deployment was blocked or subject to documented FRA exception where thresholds were breached.

Procedure:

  1. Request the pre-deployment DIA report for the current production model version.
  2. Verify that the DIA covers all protected characteristics in the PCR for each active jurisdiction.
  3. Verify that the DIA computes: approval-rate ratio; adverse action rate difference; mean score difference with 95% confidence intervals; and mean premium/rate ratio (for pricing agents).
  4. Identify any metric values breaching the four-fifths rule threshold.
  5. For any breach, verify that either (a) deployment was blocked and the model retrained/remediated, or (b) an FRA exception is documented meeting the requirements of 4.2.3 (a), (b), and (c).
  6. Verify that the DIA report is signed and dated before the production deployment date.

Pass Criteria: DIA present, covers all required groups, uses all required metrics, pre-dates deployment, and any threshold breaches are resolved through documented remediation or FRA exception.

Scoring:

Score | Condition
3 | DIA complete, pre-deployment dated, covers all groups and metrics, all breaches resolved
2 | DIA conducted with all required metrics; minor documentation gaps (e.g., confidence intervals missing for one group)
1 | DIA conducted but covers only primary protected characteristic or uses only four-fifths rule metric; or post-dates deployment
0 | No pre-deployment DIA; or deployment proceeded despite undocumented threshold breach

Test 8.3 — Proxy Feature Screening and Causal Pathway Documentation

Maps to: 4.1.3, 4.1.4, 4.3.1, 4.3.2, 4.3.3

Objective: Verify that all model features have been screened for proxy risk and that high-proxy-risk features are justified through documented causal pathway analysis and LDA.

Procedure:

  1. Request the feature proxy-risk assessment documentation for the current production model.
  2. Verify that mutual information scores and/or Pearson correlation coefficients between each feature and each PCR-listed protected characteristic have been computed.
  3. Identify all features classified as high-proxy-risk (|r| ≥ 0.4 or MI ≥ 0.25).
  4. For each high-proxy-risk feature included in the model, verify that a causal pathway document and LDA analysis exist, and that FRA approval is documented.
  5. Verify that geographic features (postal code, census tract) that correlate ≥ 0.5 with a protected characteristic have either been excluded or are supported by a qualified actuary's geographic risk study reviewed by the FRA.
  6. Verify that any feature excluded at Section 4.3.2 is documented as excluded with the reason.

Pass Criteria: All features screened against the PCR; every high-proxy-risk feature in the production model supported by documented causal pathway analysis, an LDA analysis, and FRA approval; geographic features meeting the 4.3.3 correlation condition either excluded or supported by an FRA-reviewed geographic risk study; and all exclusions under 4.3.2 documented with reasons.

Section 9: Regulatory Mapping

Regulation | Provision | Relationship Type
EU AI Act | Article 9 (Risk Management System) | Direct requirement
SOX | Section 404 (Internal Controls Over Financial Reporting) | Supports compliance
FCA | SYSC 6.1.1R (Systems and Controls) | Supports compliance
NIST AI RMF | GOVERN 1.1, MAP 3.2, MANAGE 2.2 | Supports compliance
ISO 42001 | Clause 6.1 (Actions to Address Risks), Clause 8.2 (AI Risk Assessment) | Supports compliance

EU AI Act — Article 9 (Risk Management System)

Article 9 requires providers of high-risk AI systems to establish and maintain a risk management system that identifies, analyses, estimates, and evaluates risks. Underwriting Fairness Governance implements a specific risk mitigation measure within this framework. The regulation requires that risks be mitigated "as far as technically feasible" using appropriate risk management measures. For deployments classified as high-risk under Annex III, compliance with AG-619 supports the Article 9 obligation by providing structural governance controls rather than relying solely on the agent's own reasoning or behavioural compliance.

SOX — Section 404 (Internal Controls Over Financial Reporting)

Section 404 requires management to assess the effectiveness of internal controls over financial reporting. For AI agents operating in financial contexts, AG-619 (Underwriting Fairness Governance) implements a governance control that auditors can evaluate as part of the internal control framework. The control must be documented, tested on a defined schedule, and test results retained.

NIST AI RMF — GOVERN 1.1, MAP 3.2, MANAGE 2.2

GOVERN 1.1 addresses legal and regulatory requirements; MAP 3.2 addresses risk context mapping; MANAGE 2.2 addresses risk mitigation through enforceable controls. AG-619 supports compliance by establishing structural governance boundaries that implement the framework's approach to AI risk management.

ISO 42001 — Clause 6.1, Clause 8.2

Clause 6.1 requires organisations to determine actions to address risks and opportunities within the AI management system. Clause 8.2 requires AI risk assessment. Underwriting Fairness Governance implements a risk treatment control within the AI management system, directly satisfying the requirement for structured risk mitigation.

Section 10: Failure Severity

Field | Value
Severity Rating | High
Blast Radius | Business-unit level — affects the deploying team and downstream consumers of agent outputs
Escalation Path | Senior management notification within 24 hours; regulatory disclosure assessment within 72 hours

Consequence chain: Failure of underwriting fairness governance allows discriminatory decision patterns to accumulate silently across the agent's full decision volume. Because disparate impact is rarely visible in any single decision, detection is typically delayed, and every month of delay enlarges the affected population, the remediation scope, and the cost of consumer redress. The impact extends beyond the immediate deployment to downstream consumers of agent outputs, stakeholder trust, and regulatory standing. Regulatory consequences may include supervisory findings, fines, mandatory corrective actions and customer remediation, and increased scrutiny of the organisation's AI governance programme.

Cite this protocol
AgentGoverning. (2026). AG-619: Underwriting Fairness Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-619