Override Rationale Capture Governance requires that every instance where a human overrides an AI agent's output, recommendation, or autonomous action — and every instance where an AI agent overrides a human instruction, constraint, or prior decision — is accompanied by a structured, contemporaneous rationale explaining why the override occurred, what information or judgement justified departing from the overridden party's determination, and what the expected consequences of the override are. Overrides are a critical governance signal: they represent moments where human and machine judgement diverge. The rationale for the override — not merely the fact of the override — is the artefact that enables regulatory accountability, systemic learning, and post-incident investigation. An override without a rationale is an unexplained deviation; an override with a rationale is an accountable decision.
Scenario A — Human Override Without Rationale Creates Regulatory Liability: A credit assessment agent recommends rejection of a mortgage application based on a debt-to-income ratio of 47%, exceeding the firm's 43% threshold. A loan officer overrides the rejection and approves the application. The override is recorded in the system as "Manual override — Approved" with the loan officer's ID and a timestamp. No rationale is recorded. Fifteen months later, the borrower defaults. Regulatory review examines the override. The loan officer has since left the firm. No one can explain why the rejection was overridden. The firm cannot demonstrate whether the override was based on legitimate factors (additional income not captured in the automated assessment, compensating reserves, a temporary income disruption) or illegitimate factors (sales pressure, discriminatory intent, inadequate training). The regulator treats the unexplained override as evidence of inadequate lending controls.
What went wrong: The system permitted an override without requiring a contemporaneous rationale. The override was recorded as a fact (who, when, what) but not as a decision (why). When the loan officer departed, the institutional knowledge of the rationale departed with them. The firm was left with a documented override that it could not explain — creating the inference that no legitimate rationale existed. Consequence: Regulatory finding for inadequate lending controls, £890,000 in fines, mandatory remediation of override processes across all lending products, and retrospective review of 2,300 prior overrides at a cost of £340,000.
Scenario B — AI Override of Human Instruction Without Rationale Causes Safety Incident: A warehouse robotic agent receives an instruction from a human operator to proceed with a picking sequence in Aisle 7. The agent's safety system detects an obstacle pattern consistent with a partially collapsed shelf structure and autonomously overrides the instruction, rerouting to Aisle 12. The override is logged as "Safety override — Route changed" with a timestamp and the new route. No detailed rationale is recorded: the specific sensor readings, the classification of the obstacle, the confidence level of the detection, or the risk assessment that triggered the override are not captured. The operator, seeing the route change without explanation, manually re-issues the Aisle 7 instruction, believing the override was a navigation error. The agent's safety system does not re-engage (the override was a one-time intervention, not a persistent block). The operator enters Aisle 7 and encounters the partially collapsed shelf. A falling item strikes the operator, causing a wrist fracture and 6 weeks of absence.
What went wrong: The AI agent overrode the human instruction for a legitimate safety reason but did not communicate the rationale. The log entry "Safety override — Route changed" told the operator what happened but not why. Without understanding the reason for the override, the operator reasonably concluded it was an error and re-issued the instruction. The agent's safety system treated the override as a one-time event rather than a persistent condition. Had the rationale been captured and communicated — "Obstacle detected in Aisle 7: sensor cluster 4B reports pattern consistent with shelf structural failure, confidence 0.87, risk classification: potential falling object hazard" — the operator would not have re-entered the aisle. Consequence: Worker injury, 6-week absence costing £14,000 in lost productivity and temporary cover, Health and Safety Executive investigation, £95,000 fine for inadequate safety communication, and mandatory retrofit of override explanation systems across all 340 robotic agents in the facility at £280,000.
Scenario C — Systematic Overrides Without Rationales Conceal Algorithmic Bias: A public benefits agency deploys an eligibility assessment agent. Over 18 months, human caseworkers override the agent's eligibility determinations in 12% of cases. The overrides split into two categories: 8% are overrides from "ineligible" to "eligible" (caseworkers approving applicants the agent rejected), and 4% are overrides from "eligible" to "ineligible" (caseworkers rejecting applicants the agent approved). No structured rationales are required or recorded — overrides are logged as "caseworker override" with an outcome. An external audit examines the override pattern and discovers a correlation: overrides from "ineligible" to "eligible" disproportionately benefit applicants from a specific ethnic group (18% override rate versus 5% for other groups). Without rationales, the audit cannot determine whether: (a) the agent is systematically biased against the group and caseworkers are correcting a legitimate problem; (b) caseworkers are applying inconsistent standards based on protected characteristics; or (c) a confounding variable explains the pattern. The agency cannot distinguish between a well-functioning correction mechanism and a discriminatory override practice.
What went wrong: The absence of override rationales made it impossible to diagnose whether the override pattern was corrective (fixing AI bias) or harmful (introducing human bias). Both interpretations have radically different remediation paths: if the agent is biased, it needs retraining; if caseworkers are biased, they need retraining. Without rationales, neither diagnosis is possible, and the agency is exposed to discrimination claims regardless of the true cause. Consequence: Formal investigation by the equality regulator, suspension of the automated assessment pending investigation, £1.2 million in manual processing costs during suspension, reputational damage from media coverage of the undiagnosed bias pattern, and inability to remediate because the root cause cannot be determined.
Scope: This dimension applies to any AI agent deployment where overrides can occur in either direction: human-overrides-agent (a human changes, reverses, or supersedes the agent's output, recommendation, decision, or action) or agent-overrides-human (the agent departs from, rejects, or modifies a human instruction, constraint, or prior decision, including safety overrides, mandate boundary enforcement, and compliance blocks). The scope includes explicit overrides (a deliberate act to change a determination) and implicit overrides (a human consistently ignoring agent recommendations without formal rejection, or an agent systematically deprioritising certain human instructions). It covers both pre-execution overrides (changing a recommendation before it is acted upon) and post-execution overrides (reversing an action that has already been taken). Any system where human and machine determinations can diverge, and where one party's determination can supersede the other's, falls within scope.
4.1. A conforming system MUST require a structured, contemporaneous rationale for every human-overrides-agent event, recorded at the time of the override (not after the fact), including: the specific output or action being overridden, the override decision, the factual basis for the override (what information or judgement justified the departure), and the expected consequence of the override.
4.2. A conforming system MUST require a structured, contemporaneous rationale for every agent-overrides-human event, generated by the agent at the time of the override, including: the specific human instruction or constraint being overridden, the system condition or rule that triggered the override, the data inputs or sensor readings that informed the override decision, the confidence level of the override determination, and the alternative action taken.
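The structured records required by 4.1 and 4.2 can be sketched as two record types, one per override direction. This is a minimal illustration only; the field names, types, and identifier scheme are assumptions, not a prescribed schema.

```python
# Sketch of the rationale records in 4.1 (human-overrides-agent) and
# 4.2 (agent-overrides-human). Field names are illustrative assumptions.
from dataclasses import dataclass, field
from datetime import datetime, timezone
import uuid


@dataclass(frozen=True)
class HumanOverrideRationale:      # 4.1: human overrides agent
    overridden_output: str         # the specific output or action being overridden
    override_decision: str         # what the human decided instead
    factual_basis: str             # information or judgement justifying the departure
    expected_consequence: str      # anticipated effect of the override
    overrider_id: str              # identity of the overriding human (per 4.3)
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    recorded_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass(frozen=True)
class AgentOverrideRationale:      # 4.2: agent overrides human
    overridden_instruction: str    # the human instruction or constraint overridden
    triggering_condition: str      # system condition or rule that fired
    data_inputs: dict              # sensor readings / inputs behind the decision
    confidence: float              # confidence of the override determination
    alternative_action: str        # what the agent did instead
    agent_id: str                  # agent system identifier (per 4.3)
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    recorded_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
```

Making the records frozen and stamping the timestamp at construction time reflects the contemporaneity requirement: the rationale exists, complete, at the moment the override event is created.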
4.3. A conforming system MUST store all override rationales in a tamper-evident record (per AG-006) linked to the specific decision, output, or action that was overridden, with timestamps, the identity of the overriding party (human identity or agent system identifier), and a unique override event identifier.
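One way to satisfy the tamper-evidence property in 4.3 is to hash-chain each stored rationale to its predecessor, so that any later alteration breaks verification. The chaining scheme below is an illustrative assumption; AG-006 governs the actual record-integrity mechanism a conforming system must use.

```python
# Minimal hash-chained ledger sketch for override rationales (4.3).
# Each entry commits to the previous entry's hash, so editing any stored
# record invalidates every subsequent hash on verification.
import hashlib
import json


class OverrideLedger:
    def __init__(self):
        self.entries = []  # each: {"record": dict, "prev_hash": str, "hash": str}

    def append(self, record: dict) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        payload = json.dumps(record, sort_keys=True) + prev_hash
        entry_hash = hashlib.sha256(payload.encode()).hexdigest()
        self.entries.append({"record": record, "prev_hash": prev_hash, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        prev = "0" * 64
        for e in self.entries:
            payload = json.dumps(e["record"], sort_keys=True) + prev
            if hashlib.sha256(payload.encode()).hexdigest() != e["hash"]:
                return False
            prev = e["hash"]
        return True
```

The `record` dict would carry the override event identifier, the overridden decision reference, timestamps, and the overriding party's identity, linking the rationale to the decision it supersedes.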
4.4. A conforming system MUST communicate agent-overrides-human rationales to the affected human in a timely, comprehensible format that enables the human to understand why their instruction was overridden and to take appropriate follow-up action (including re-issuing the instruction with additional context, escalating, or accepting the override).
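A comprehensible agent-to-human rationale (4.4) can be as simple as a templated rendering of the structured record into the format Scenario B shows was missing. The template and field names below are illustrative assumptions, not a required message format.

```python
# Sketch of 4.4: rendering an agent override rationale into a message the
# affected operator can understand and act on. Template wording is an
# illustrative assumption.
def render_override_message(r: dict) -> str:
    return (
        f"Your instruction '{r['overridden_instruction']}' was overridden. "
        f"Reason: {r['triggering_condition']} "
        f"(confidence {r['confidence']:.2f}). "
        f"Action taken instead: {r['alternative_action']}. "
        f"Options: re-issue with additional context, escalate, or accept the override."
    )
```

Listing the follow-up options in the message itself is deliberate: the requirement is not only that the human understands why, but that they can choose an appropriate next step rather than guessing, as the Scenario B operator did.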
4.5. A conforming system MUST enforce a minimum rationale quality standard that rejects empty, generic, or template-only rationales (e.g., "professional judgement," "override required," "system decision") and requires specific factual content demonstrating genuine reasoning.
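A minimal quality gate for 4.5 might combine a blocklist of known template phrases with a specificity proxy. The blocklist contents and word-count threshold below are illustrative assumptions; a production gate would tune both and sample accepted rationales for human review.

```python
# Sketch of the 4.5 rationale quality gate: reject empty, generic, or
# template-only rationales. Blocklist and threshold are illustrative.
GENERIC_RATIONALES = {
    "professional judgement", "override required", "system decision",
    "manual override", "n/a", "see notes",
}


def rationale_passes_quality_gate(text: str, min_words: int = 8) -> bool:
    cleaned = text.strip().lower().rstrip(".")
    if not cleaned or cleaned in GENERIC_RATIONALES:
        return False
    # Crude proxy for "specific factual content": require minimum length.
    return len(cleaned.split()) >= min_words
```

A word count is obviously gameable; the point of the sketch is that the gate is enforced in the override workflow itself, so the override cannot complete until a rationale clears it.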
4.6. A conforming system MUST generate override analytics — frequency, direction (human-over-agent or agent-over-human), category, outcome, and correlation with decision attributes — and surface patterns that may indicate systemic issues including algorithmic bias, human bias, training gaps, or agent miscalibration.
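The bias-correlation part of 4.6 can be sketched as per-group override rates in a given direction, flagging disparities like the 18% versus 5% pattern in Scenario C. The event fields, direction label, and disparity threshold are illustrative assumptions.

```python
# Sketch of 4.6 override analytics: compute per-group rates of overrides in
# one direction and flag groups whose rate is disproportionate to the lowest
# observed rate. Field names and the 2x threshold are illustrative.
from collections import Counter


def override_rates_by_group(events: list[dict]) -> dict[str, float]:
    decisions = Counter(e["group"] for e in events)
    overrides = Counter(
        e["group"] for e in events
        if e["overridden"] and e["direction"] == "ineligible_to_eligible"
    )
    return {g: overrides[g] / n for g, n in decisions.items()}


def flag_disparities(rates: dict[str, float], ratio_threshold: float = 2.0) -> list[str]:
    if not rates:
        return []
    baseline = min(rates.values())
    return [g for g, r in rates.items()
            if baseline > 0 and r / baseline >= ratio_threshold]
```

As Scenario C makes clear, the flag alone cannot say whether the agent or the caseworkers are the source of the disparity; it tells the organisation where to read the rationales.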
4.7. A conforming system SHOULD classify overrides into structured categories (e.g., "additional information not available to agent," "policy exception with approval," "safety hazard detected," "agent error correction," "compliance constraint enforcement") to enable systematic analysis alongside the free-text rationale.
4.8. A conforming system SHOULD implement tiered override authority, requiring higher levels of rationale detail and approval for higher-risk overrides (e.g., overrides of financial decisions above a value threshold, overrides of safety determinations, overrides affecting protected-class individuals).
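Tiered override authority (4.8) can be expressed as a policy function mapping decision attributes to escalating rationale and approval requirements. Tier names, thresholds, and the approval roles below are all illustrative assumptions.

```python
# Sketch of 4.8 tiered override authority: higher-risk overrides demand more
# rationale detail and explicit approval. All tiers and thresholds here are
# illustrative assumptions.
def override_tier(decision: dict) -> dict:
    if decision.get("safety_related") or decision.get("protected_class_impact"):
        return {"tier": "critical", "min_rationale_words": 50, "approval": "governance_lead"}
    if decision.get("value_gbp", 0) > 100_000:
        return {"tier": "high", "min_rationale_words": 25, "approval": "line_manager"}
    return {"tier": "standard", "min_rationale_words": 8, "approval": None}
```

The returned `min_rationale_words` would feed the 4.5 quality gate, so rationale stringency scales with risk rather than being uniform.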
4.9. A conforming system SHOULD feed override patterns into agent improvement cycles, using consistent human-overrides-agent patterns in specific decision categories as evidence that the agent's performance in those categories requires recalibration or retraining.
4.10. A conforming system MAY implement override prediction, where the system identifies decisions likely to be overridden based on historical patterns and proactively flags them for enhanced review, reducing the need for post-hoc overrides.
Overrides are the moments where human-AI collaboration is tested. When a human overrides an AI agent, it may indicate that the human has access to information the agent lacks, that the agent has made an error, that circumstances have changed since the agent's assessment, or that the human is applying judgement the agent cannot replicate. Equally, it may indicate that the human is biased, undertrained, fatigued, or acting under improper pressure. When an AI agent overrides a human, it may indicate that a safety constraint was triggered, that a compliance boundary was reached, that the human's instruction conflicted with the agent's mandate, or that the agent detected a hazard the human did not perceive. Equally, it may indicate that the agent's sensors or models are miscalibrated, that its safety thresholds are too conservative, or that it misinterpreted the human's instruction.
The rationale — not the fact — of the override is what distinguishes these interpretations. Without the rationale, a human override is an unexplained deviation from the system's recommendation, and an AI override is an unexplained rejection of a human instruction. Neither can be evaluated for appropriateness, investigated for bias, or learned from for systemic improvement. The override becomes a black box event: we know it happened, but we cannot determine whether it should have happened.
Regulatory frameworks increasingly require explainability not just of AI decisions but of human interventions in AI processes. The EU AI Act's Article 14 on human oversight implicitly requires that human interventions are documented and accountable — otherwise, the regulator cannot assess whether the oversight mechanism is functioning correctly. In financial services, the FCA expects firms to be able to explain every lending decision, including manual overrides of automated assessments. The Senior Managers and Certification Regime makes individual senior managers personally accountable for decisions within their area — an override without a rationale creates personal regulatory exposure for the overriding individual.
The bias detection problem illustrated in Scenario C is particularly acute. Override patterns are one of the most powerful diagnostic tools for detecting both algorithmic bias and human bias. If an agent systematically recommends denial for a protected group and humans systematically override those denials, the override pattern — with rationales — reveals whether the agent is biased. If humans systematically override in a pattern correlated with protected characteristics, the override pattern — with rationales — reveals whether the overrides are based on legitimate factors or discriminatory ones. Without rationales, the pattern is visible but uninterpretable, leaving the organisation unable to diagnose the problem and exposed to claims from both directions.
Agent-to-human override rationale communication is equally critical. Safety overrides that are not explained are overrides that humans will circumvent. Scenario B demonstrates this directly: the operator re-entered a hazardous area because the agent's override was not accompanied by a comprehensible rationale. The agent had the right information and made the right decision but failed to communicate why, rendering the safety override ineffective. This is not merely an information design problem — it is a governance failure with physical safety consequences.
The temporal dimension matters as well. Rationales must be contemporaneous — recorded at the time of the override, not reconstructed after the fact. Post-hoc rationalisation is unreliable: memory is reconstructive, motivations are reinterpreted in light of outcomes, and the factual basis for the decision may no longer be available. An override rationale recorded 6 months later in response to an investigation is fundamentally different in evidentiary value from a rationale recorded at the moment of decision.
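Contemporaneity can be enforced mechanically by accepting a rationale only if its timestamp falls within a tolerance window of the override event. The five-minute window below is an illustrative assumption; the appropriate tolerance depends on the workflow.

```python
# Sketch of contemporaneity enforcement: a rationale is accepted only when
# recorded within a tolerance window of the override event itself.
# The 5-minute default window is an illustrative assumption.
from datetime import datetime, timedelta


def is_contemporaneous(override_at: datetime, rationale_at: datetime,
                       tolerance: timedelta = timedelta(minutes=5)) -> bool:
    return abs(rationale_at - override_at) <= tolerance
```

A check like this turns "recorded at the time of the override" from a policy aspiration into a system invariant: a rationale submitted six months later simply cannot be attached as contemporaneous.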
Override rationale capture requires technical mechanisms embedded in the override workflow itself — not bolted on as an optional annotation. The rationale must be a mandatory step in the override process, not a field that can be skipped. The design principle is: if you cannot explain why you are overriding, you should not override; and if the system cannot explain why it is overriding, it has a design deficiency.
Recommended patterns:
Anti-patterns to avoid:
Financial Services. Override rationale capture is directly mandated by lending regulations in multiple jurisdictions. The Equal Credit Opportunity Act (US), Consumer Credit Act (UK), and equivalent regulations require that overrides of automated credit decisions be documented with specific justifications. FCA expectations for model risk management require that manual overrides of model outputs are recorded and analysed for patterns. Override rationale records are routinely requested in regulatory examinations and enforcement investigations. Firms should treat override rationale records as regulatory evidence with the same protections and retention requirements as trade records.
Healthcare and Life Sciences. Clinical decision support overrides (a clinician overriding an AI's drug interaction warning, diagnostic suggestion, or treatment recommendation) must be documented with clinical rationale both for patient safety and for medical malpractice liability management. Regulatory bodies including the FDA and MHRA expect that AI-assisted clinical decisions include documentation of human overrides and their clinical justification.
Safety-Critical and Embodied Systems. Agent-overrides-human events in safety-critical contexts (autonomous vehicle disengagements, industrial robot safety stops, drone flight restriction overrides) require real-time rationale communication to human operators. The rationale must be communicated in a format and timeframe that enables the human to take appropriate follow-up action. Delayed or absent communication converts a safety mechanism into a confusion mechanism, as demonstrated in Scenario B. Industry safety standards (IEC 61508, ISO 26262, DO-178C) all require documentation of safety function activations including the triggering conditions.
Public Sector and Rights-Sensitive. Override rationales in benefits determination, immigration, and criminal justice contexts are directly relevant to administrative law requirements for reasoned decision-making. Public law in most jurisdictions requires that decision-makers give reasons for their decisions — an override of an automated determination is a decision that requires reasons. Failure to record and provide reasons may render the override decision unlawful on judicial review. The bias detection application (Scenario C) is particularly important in public sector contexts where equality obligations apply.
Basic Implementation — The system requires a structured rationale for every human-overrides-agent event, captured contemporaneously with a mandatory free-text field and minimum quality validation. Agent-overrides-human events generate a structured log with triggered conditions and input data. All override rationales are stored in tamper-evident records linked to the overridden decision. Override frequency and direction metrics are calculated and reported quarterly. This level meets the minimum mandatory requirements.
Intermediate Implementation — All basic capabilities plus: agent-overrides-human rationales are communicated to affected humans in natural language within defined SLAs. Override categories enable systematic analysis. Rationale quality is validated automatically and sampled for human review. Override analytics surface patterns including bias correlations and operator-level variation. Tiered override authority applies escalating rationale and approval requirements to higher-risk overrides. Override patterns are fed into agent improvement cycles.
Advanced Implementation — All intermediate capabilities plus: override prediction identifies decisions likely to be overridden and flags them for enhanced review. Real-time override dashboards provide governance leadership with current override patterns across all agents and operators. Override rationale quality is independently audited. Cross-agent override correlation identifies systemic issues affecting multiple agents. The organisation can demonstrate through analytics that override patterns are monitored, investigated, and resolved — overrides serve as a diagnostic feedback loop rather than merely a decision correction mechanism.
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Human Override Rationale Capture Completeness
Test 8.2: Rationale Quality Gate Enforcement
Test 8.3: Agent Override Rationale Generation and Communication
Test 8.4: Override Rationale Tamper-Evidence
Test 8.5: Override Analytics and Bias Pattern Detection
Test 8.6: Contemporaneous Capture Enforcement
Test 8.7: Override-to-Decision Bidirectional Linkage
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 14 (Human Oversight) | Direct requirement |
| EU AI Act | Article 13 (Transparency and Provision of Information) | Supports compliance |
| SOX | Section 404 (Internal Controls Over Financial Reporting) | Direct requirement |
| FCA SYSC | 6.1.1R (Systems and Controls) | Direct requirement |
| FCA SYSC | 3.2.20R (Effective Challenge) | Supports compliance |
| NIST AI RMF | GOVERN 1.5, MANAGE 1.3, MANAGE 4.1 | Supports compliance |
| ISO 42001 | Clause 8.4 (Operation of AI Systems) | Supports compliance |
| DORA | Article 9 (ICT Risk Management Framework) | Supports compliance |
Article 14 requires that human oversight measures enable the overseer to "correctly interpret the high-risk AI system's output" and "decide, in any particular situation, not to use the high-risk AI system or to otherwise disregard, override or reverse the output." The ability to override is explicitly granted; AG-444 ensures that the exercise of that ability is documented, reasoned, and accountable. An override without a rationale is an exercise of Article 14 authority without the accountability that makes that authority trustworthy. The Commission's guidance makes clear that human oversight must be "effective" — effectiveness requires that overrides are not only possible but documented in a manner that enables evaluation of their appropriateness.
Article 13 requires that high-risk AI systems are designed to be sufficiently transparent. Agent-overrides-human events — where the AI system overrides a human instruction — are transparency events that must be communicated and explained. An agent that silently overrides a human instruction violates the transparency requirement regardless of whether the override was justified. AG-444's requirement for agent override rationale generation and communication directly supports Article 13 compliance.
Overrides of automated financial controls are audit-critical events under SOX. Auditors must assess whether overrides are appropriately authorised, adequately justified, and systematically monitored. An override without a rationale is a control gap — it represents an unexplained departure from the automated control that the auditor cannot evaluate. Persistent patterns of overrides without adequate rationales constitute a material weakness in internal controls. AG-444 ensures that override rationales provide the evidentiary basis auditors require.
The FCA expects firms to maintain systems and controls that include documentation of manual interventions in automated processes. FCA supervisory reviews of AI-assisted lending, trading, and compliance processes routinely examine override records. The absence of override rationales has been cited in enforcement actions as evidence of inadequate systems and controls. AG-444's quality validation requirement ensures that rationales contain genuine reasoning rather than pro-forma entries.
GOVERN 1.5 addresses organisational processes for AI risk management, which must include override documentation. MANAGE 1.3 addresses response processes when risks are identified during operation — overrides are a form of operational risk response. MANAGE 4.1 addresses post-deployment monitoring, which should include override pattern analysis as a feedback mechanism. AG-444 provides the rationale capture infrastructure that makes these NIST functions operational.
DORA requires financial entities to maintain ICT risk management frameworks that include change management and incident documentation. Overrides of AI agent outputs in financial processes are change events that must be documented within the ICT risk management framework. The override rationale provides the documentation that DORA's Article 9 requirements demand.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Per-decision for individual overrides, but systemic when override rationale absence prevents pattern detection across the entire agent portfolio |
Consequence chain: When override rationale capture fails, each individual override becomes an unexplained deviation — a documented fact that the system's determination was changed, without any record of why. The immediate consequence is loss of accountability: no one can determine after the fact whether the override was appropriate. For human-overrides-agent events, this creates regulatory liability when the override leads to a negative outcome (Scenario A: £890,000 in fines) and eliminates the ability to detect bias patterns in override behaviour (Scenario C: investigation, £1.2 million in manual processing costs). For agent-overrides-human events, the absence of communicated rationale causes humans to circumvent safety mechanisms they do not understand (Scenario B: worker injury, £375,000 in combined costs). The systemic consequence compounds over time: without override rationales, the organisation cannot distinguish between overrides that correct AI errors (a healthy feedback signal) and overrides that introduce human errors or bias (a governance failure). Both appear identical in the data — a changed determination with no explanation. The organisation loses its primary diagnostic tool for evaluating the human-AI collaboration interface, and regulatory investigations find an organisation that cannot explain its own decisions — the worst possible evidentiary position.
Cross-references: AG-019 (Human Escalation & Override Triggers) defines the escalation infrastructure through which overrides are initiated and routed. AG-006 (Tamper-Evident Record Integrity) provides the storage infrastructure ensuring override rationales cannot be altered after recording. AG-439 (Reviewer Independence Governance) ensures that the humans exercising override authority are structurally independent. AG-440 (Oversight Ergonomic Design Governance) ensures the override interface supports rather than obstructs rationale capture. AG-443 (Reviewer Dissent Capture Governance) captures disagreements that may lead to overrides. AG-415 (Decision Journal Completeness Governance) ensures override rationales are included in the complete decision record. AG-416 (Evidentiary Chain-of-Custody Governance) ensures override rationale records maintain evidentiary integrity. AG-049 (Explainability Governance) provides the broader explainability framework within which override rationales operate.