AG-525

Physician Override Usability Governance

Healthcare & Life Sciences · ~24 min read · AGS v2.1 · April 2026
Tags: EU AI Act · NIST · HIPAA · ISO 42001

2. Summary

Physician Override Usability Governance requires that every AI agent deployed in clinical environments provides a clear, fast, and reliable mechanism through which licensed clinicians can override, pause, or reverse any agent-initiated or agent-recommended action — including diagnostic suggestions, medication adjustments, device parameter changes, and treatment pathway decisions — without encountering friction that delays patient care. The override mechanism must be accessible within a defined latency threshold under all operating conditions, including network degradation, high cognitive-load scenarios, and emergency workflows. This dimension ensures that human clinical authority is never subordinated to algorithmic recommendation, and that the override interface is designed to minimise the risk of clinician error, fatigue-induced misuse, and alert-dismissal habituation that undermine the safety purpose of human-in-the-loop governance.

3. Example

Scenario A — Override Buried in Sub-Menu During Sepsis Alert: A 58-year-old patient presents in the emergency department with suspected sepsis. An AI-driven clinical decision support agent recommends a broad-spectrum antibiotic regimen based on the patient's vitals, lab results, and documented allergies. The attending physician recognises that the patient has an undocumented penicillin-class sensitivity from a prior admission at a different hospital system — information not yet in the electronic health record. The physician attempts to override the recommendation to substitute a fluoroquinolone. The override function requires navigating three screens: first acknowledging the AI recommendation, then selecting "Modify Treatment," then entering an override justification free-text field with a minimum 50-character requirement. Total override time: 94 seconds. During this delay, a nurse — seeing the AI recommendation on the shared display and unaware of the physician's intent to override — begins preparing the recommended antibiotic. The patient receives the first 200 ml of a penicillin-class infusion before the override completes. The patient develops an anaphylactic reaction requiring epinephrine and ICU admission for 48 hours.

What went wrong: The override mechanism required three screens and a mandatory free-text justification during a time-critical emergency. The 94-second override latency exceeded the clinical decision window for sepsis management. The shared display showed the AI recommendation as the "active" recommendation during the override process, causing a nurse to act on the un-overridden recommendation. Consequence: preventable anaphylactic reaction, 48-hour ICU admission costing £14,200, potential medical malpractice claim estimated at £180,000–£350,000, regulatory investigation by the Care Quality Commission.

Scenario B — Override Confirmation Fatigue in Oncology Dosing: An oncology unit deploys an AI agent for chemotherapy dose calculation. The agent generates dose recommendations based on body surface area, renal function, and tumour response metrics. Hospital policy requires physician approval for every dose, implemented as a confirmation dialog: "Confirm recommended dose? [Yes] [Override]." Over 14 months, the attending oncologist confirms 2,847 consecutive doses without override — the AI recommendations are accurate. The confirmation dialog becomes a reflexive click. On dose 2,848, the agent calculates a carboplatin dose of 750 mg based on a lab value that was entered incorrectly (creatinine clearance of 130 mL/min instead of the actual 30 mL/min due to a transcription error). The correct dose for the actual renal function is 280 mg. The oncologist reflexively confirms the 750 mg dose. The patient receives a 2.68x overdose, resulting in severe myelosuppression, neutropenic sepsis, and a 12-day hospitalisation costing £31,400.

What went wrong: The override mechanism was designed as a binary confirm/override dialog with no salience differentiation for anomalous recommendations. After 2,847 routine confirmations, the physician experienced confirmation fatigue — the cognitive pattern of reflexively approving routine alerts. The system provided no visual or contextual signal that dose 2,848 was a statistical outlier (2.68x the patient's historical carboplatin dose). The override interface treated a routine dose and a dangerous outlier identically. Consequence: chemotherapy overdose, severe adverse event, £31,400 hospitalisation, estimated litigation exposure of £250,000–£500,000.

Scenario C — Network Latency Disables Override in Rural Clinic: A rural primary care clinic operating on a satellite internet connection (typical latency 600–1,200 ms, packet loss 3–8%) deploys an AI agent for diabetes management recommendations, including insulin dose titration. The override mechanism is implemented as a cloud-hosted web interface that requires a round-trip API call to register an override. During a period of elevated network congestion, the physician attempts to override an insulin dose increase recommendation for a patient with a history of hypoglycaemic episodes. The override API call times out after 12 seconds. The physician retries; the second attempt takes 8 seconds and returns an ambiguous "processing" status. Meanwhile, the AI recommendation is transmitted to the patient's connected insulin pen via a separate, lower-latency pathway. The patient self-administers the un-overridden dose. The patient experiences a hypoglycaemic episode that evening, requiring emergency department attendance costing £2,100.

What went wrong: The override mechanism depended on a cloud-hosted API with no local fallback. The override pathway had higher latency requirements than the recommendation delivery pathway, creating a race condition where recommendations could be acted upon before overrides could be registered. The ambiguous "processing" status gave the physician no confirmation that the override had been applied. No local override cache existed for degraded-network conditions. Consequence: preventable hypoglycaemic episode, emergency department visit, patient trust erosion, and potential regulatory finding for inadequate safety controls in connected medical device governance.

4. Requirement Statement

Scope: This dimension applies to every AI agent deployed in a clinical setting that produces, recommends, or initiates actions affecting patient care — including but not limited to diagnostic suggestions, medication recommendations, dose calculations, device parameter adjustments, treatment pathway selections, triage prioritisations, and clinical alert generation. The scope extends to any deployment where a licensed clinician (physician, nurse practitioner, physician assistant, or other legally authorised prescriber) is expected to exercise oversight over the agent's outputs. It covers the full override lifecycle: the initial display of the agent's recommendation, the clinician's decision to override, the registration of the override, the halting of any in-progress actuation, and the confirmation that the override has taken effect. The scope includes both direct clinical environments (hospitals, clinics, operating theatres) and remote or hybrid environments (telehealth, remote patient monitoring, home-based connected devices). Agents that produce only informational outputs with no clinical action pathway are minimally affected but should still provide a mechanism for clinicians to flag disagreement for audit purposes.

4.1. A conforming system MUST provide a single-action override mechanism that allows a licensed clinician to halt, reverse, or replace any agent-initiated or agent-recommended clinical action within two interactions (e.g., one selection and one confirmation) and within a maximum latency of 5 seconds from the clinician's first override interaction to the system's registration and acknowledgement of the override, measured end-to-end including all network round-trips and processing time.
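The 5-second budget in 4.1 is measured end-to-end, starting at the clinician's first override interaction, not the last. A minimal sketch of how a system might track that clock (class and constant names are illustrative, not mandated by this protocol):

```python
import time

OVERRIDE_LATENCY_BUDGET_S = 5.0  # end-to-end ceiling from requirement 4.1


class OverrideSession:
    """Tracks one override from first clinician interaction to acknowledgement."""

    def __init__(self) -> None:
        self.first_interaction: float | None = None
        self.acknowledged: float | None = None

    def record_interaction(self) -> None:
        # Only the FIRST interaction starts the latency clock; later taps
        # (e.g. the confirmation) do not reset it.
        if self.first_interaction is None:
            self.first_interaction = time.monotonic()

    def record_acknowledgement(self) -> None:
        # Called when the system has registered AND acknowledged the override.
        self.acknowledged = time.monotonic()

    def latency_s(self) -> float:
        if self.first_interaction is None or self.acknowledged is None:
            raise ValueError("override not yet complete")
        return self.acknowledged - self.first_interaction

    def within_budget(self) -> bool:
        return self.latency_s() <= OVERRIDE_LATENCY_BUDGET_S
```

Using a monotonic clock matters here: wall-clock adjustments (NTP corrections, daylight saving) must not corrupt a safety measurement.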

4.2. A conforming system MUST visually and contextually differentiate anomalous recommendations from routine recommendations, using at least two distinct salience channels (e.g., colour differentiation and size/position change, or colour differentiation and auditory alert) so that clinicians are not required to rely on a single perceptual channel to identify recommendations requiring heightened scrutiny.

4.3. A conforming system MUST ensure that no agent recommendation is transmitted to an actuation pathway (connected device, pharmacy dispensing system, or patient-facing interface) until the clinician oversight window has elapsed or the clinician has affirmatively confirmed the recommendation, whichever occurs first.
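One way to enforce 4.3 is to place a gate object between the recommendation engine and every actuation pathway; the gate releases only on affirmative confirmation or window expiry, and an override blocks it permanently. This is a sketch under assumed semantics, not a prescribed design:

```python
import time


class ActuationGate:
    """Holds an agent recommendation back from actuation pathways until the
    clinician confirms it or the oversight window elapses (requirement 4.3)."""

    def __init__(self, oversight_window_s: float) -> None:
        self.oversight_window_s = oversight_window_s
        self.issued_at = time.monotonic()
        self.confirmed = False
        self.overridden = False

    def confirm(self) -> None:
        if not self.overridden:
            self.confirmed = True

    def override(self) -> None:
        # An override permanently blocks actuation of the original recommendation.
        self.overridden = True
        self.confirmed = False

    def may_actuate(self) -> bool:
        if self.overridden:
            return False
        window_elapsed = time.monotonic() - self.issued_at >= self.oversight_window_s
        return self.confirmed or window_elapsed
```

A gate like this would have prevented the race in Scenario C, where the recommendation reached the insulin pen over a faster pathway than the override.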

4.4. A conforming system MUST implement a local override capability that functions without dependency on cloud services, external APIs, or network connectivity, ensuring that the override mechanism remains available during network degradation, outages, or high-latency conditions.
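The local override capability in 4.4 is commonly realised as a local-first write with asynchronous cloud reconciliation: the override takes effect from the durable on-device write alone, and network failure only delays synchronisation, never the override itself. A minimal sketch (the storage lists stand in for real durable storage; names are illustrative):

```python
class LocalFirstOverride:
    """Registers an override locally first, then queues cloud sync (4.4).
    The override is effective from the local write alone, so network
    failure never blocks it."""

    def __init__(self) -> None:
        self.local_store: list[dict] = []   # stands in for durable on-device storage
        self.sync_queue: list[dict] = []    # replayed to the cloud when reachable

    def register(self, override: dict) -> bool:
        # Step 1: durable local write -- this alone makes the override effective.
        self.local_store.append(override)
        # Step 2: queue for eventual cloud sync; failure there is non-blocking.
        self.sync_queue.append(override)
        return True

    def sync(self, upload) -> None:
        """Drain the queue with a caller-supplied upload callable; items that
        fail to upload stay queued for the next attempt."""
        remaining = []
        for item in self.sync_queue:
            try:
                upload(item)
            except OSError:
                remaining.append(item)
        self.sync_queue = remaining
```

Note the inversion relative to Scenario C: there, the cloud round-trip was on the critical path of the override; here, it is removed from the critical path entirely.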

4.5. A conforming system MUST log every override event with a structured record containing: the overriding clinician's identity and credentials, the original agent recommendation, the clinician's replacement action, a timestamp, and the clinical context at the time of override, with the log entry created atomically with the override registration so that no override can occur without a corresponding audit record.
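The atomicity requirement in 4.5 maps naturally onto a database transaction: the override row and its audit record commit together or not at all. A sketch using SQLite (the schema and field names are illustrative, not part of the requirement):

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    conn.executescript("""
        CREATE TABLE overrides (id INTEGER PRIMARY KEY, replacement TEXT NOT NULL);
        CREATE TABLE audit_log (
            override_id INTEGER NOT NULL REFERENCES overrides(id),
            clinician_id TEXT NOT NULL,
            original_rec TEXT NOT NULL,
            replacement TEXT NOT NULL,
            ts TEXT NOT NULL,
            context TEXT NOT NULL
        );
    """)


def register_override(conn, clinician_id, original_rec, replacement, ts, context):
    # One transaction: the override row and its audit record commit together,
    # so no override can exist without a corresponding audit entry (4.5).
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "INSERT INTO overrides (replacement) VALUES (?)", (replacement,))
        conn.execute(
            "INSERT INTO audit_log VALUES (?, ?, ?, ?, ?, ?)",
            (cur.lastrowid, clinician_id, original_rec, replacement, ts, context))
        return cur.lastrowid
```

Crucially, per AG-444, this audit write must not add interaction friction: the clinical context can be captured automatically at commit time rather than typed by the clinician mid-emergency.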

4.6. A conforming system MUST implement anti-fatigue measures for clinician confirmation workflows, including at minimum: statistical outlier detection that applies heightened salience to recommendations that deviate significantly from the patient's historical values or population norms, and periodic variation in confirmation interaction patterns to disrupt reflexive approval behaviours.
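A simple instance of the outlier detection in 4.6 is a z-score against the patient's own dose history; the threshold below is a placeholder that would need calibration per therapeutic area (see the Industry Considerations), not a recommended value:

```python
from statistics import mean, stdev


def salience_level(proposed_dose: float, history: list[float],
                   z_threshold: float = 3.0) -> str:
    """Flag recommendations that deviate sharply from the patient's own
    historical values (one sketch of the outlier detection in 4.6)."""
    if len(history) < 3:
        return "heightened"  # too little history to trust: default to scrutiny
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return "routine" if proposed_dose == mu else "heightened"
    z = abs(proposed_dose - mu) / sigma
    return "heightened" if z >= z_threshold else "routine"
```

Applied to Scenario B, a 750 mg carboplatin recommendation against a history clustered around 280 mg scores far above any reasonable threshold, which is exactly the case that must not be rendered identically to dose 2,847.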

4.7. A conforming system MUST display the override status unambiguously on all interfaces that show the agent's recommendation, including shared displays, nursing stations, pharmacy systems, and patient-facing portals, so that no downstream actor can act on a recommendation that has been overridden.
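Requirement 4.7 implies a single source of truth for recommendation status, fanned out to every downstream interface, with actuation checks consulting the same authority. A minimal publish/subscribe sketch under that assumption (class and status names are illustrative):

```python
class RecommendationBoard:
    """Fans override status out to every downstream interface (4.7): shared
    displays, pharmacy systems, patient portals. Actuation checks consult
    the same single source of truth, so nobody acts on a stale view."""

    def __init__(self) -> None:
        self.status: dict[str, str] = {}  # recommendation id -> status
        self.subscribers: list = []       # callables notified on every change

    def subscribe(self, callback) -> None:
        self.subscribers.append(callback)

    def set_status(self, rec_id: str, status: str) -> None:
        self.status[rec_id] = status
        for notify in self.subscribers:
            notify(rec_id, status)

    def is_actionable(self, rec_id: str) -> bool:
        # Downstream actors must check here before acting on a recommendation.
        return self.status.get(rec_id) == "active"
```

In Scenario A, a check like `is_actionable` at the medication-preparation step would have stopped the nurse from acting while the physician's override was in flight.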

4.8. A conforming system SHOULD implement role-aware override workflows that adapt the override process to the clinician's specialty, the clinical context (emergency vs. routine), and the risk level of the recommended action, reducing friction for high-risk emergency overrides and increasing verification for overrides of safety-critical constraints.

4.9. A conforming system SHOULD provide aggregate override analytics — dashboards showing override rates by clinician, department, recommendation type, and time period — to enable detection of systematic recommendation quality issues and individual override pattern anomalies.
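The analytics in 4.9 reduce to grouped override rates over the audit log. A sketch of the aggregation (event field names are assumptions about the log schema, not mandated):

```python
from collections import defaultdict


def override_rates(events: list[dict], key: str) -> dict[str, float]:
    """Aggregate override rate (overrides / total decisions) grouped by an
    arbitrary key such as 'clinician', 'department', or 'rec_type' (4.9)."""
    totals: dict[str, int] = defaultdict(int)
    overrides: dict[str, int] = defaultdict(int)
    for e in events:
        totals[e[key]] += 1
        if e["overridden"]:
            overrides[e[key]] += 1
    return {k: overrides[k] / totals[k] for k in totals}
```

Both tails of the distribution are diagnostic: an unusually high rate for one recommendation type suggests a systematic model-quality issue, while a near-zero rate over thousands of decisions is itself a warning sign of the habituation described in Scenario B.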

4.10. A conforming system MAY implement predictive override prompting — identifying recommendations that are likely candidates for override based on patient-specific factors and proactively presenting the override option with reduced friction, without altering the recommendation itself.

5. Rationale

The foundational principle of clinical AI governance is that the licensed clinician retains final authority over patient care decisions. AI agents in healthcare operate as decision support systems, not autonomous decision-makers. This principle is enshrined in medical device regulation, clinical practice standards, and the ethical obligations of the medical profession. However, a right to override that exists in principle but fails in practice — because the interface is slow, confusing, buried, or unreliable — is no right at all. AG-525 addresses the implementation gap between theoretical clinician authority and practical clinician capability.

Three categories of risk motivate this dimension. First, latency risk: clinical decisions often operate within narrow time windows. Sepsis management, anaphylaxis treatment, cardiac arrest response, and acute stroke intervention all require decisions within seconds to minutes. An override mechanism that adds 90 seconds to a time-critical decision process is not a safety mechanism; it is a safety hazard. The 5-second latency requirement in 4.1 is derived from human-factors research on clinical decision-making under time pressure, which demonstrates that delays exceeding 5 seconds in emergency workflows cause clinicians to either abandon the override attempt or proceed without the system — both of which defeat the purpose of human-in-the-loop governance.

Second, fatigue and habituation risk: alert fatigue is the most extensively documented failure mode in clinical decision support. Research across multiple healthcare systems consistently demonstrates that when clinicians are exposed to high volumes of alerts — particularly alerts with high false-positive rates — they develop dismissal habits. The same habituation occurs with confirmation dialogs. A confirmation dialog that the clinician clicks "Yes" on 99.7% of the time is not providing oversight; it is creating a false record of oversight while the clinician's cognition has disengaged from the decision. The anti-fatigue requirements in 4.6 address this by ensuring that the system differentiates routine from anomalous recommendations, preventing the cognitive flattening that leads to reflexive confirmation.

Third, infrastructure reliability risk: clinical environments vary enormously in their technology infrastructure. A tertiary academic medical centre may have sub-10-millisecond internal network latency, redundant connectivity, and 99.99% uptime. A rural clinic may operate on consumer-grade internet with intermittent connectivity. A field hospital or disaster response unit may have no reliable connectivity at all. An override mechanism that depends on cloud services is an override mechanism that fails when the network fails — which, in under-resourced clinical environments, may be when it is needed most. The local override requirement in 4.4 ensures that the most critical safety mechanism — the clinician's ability to override — is the most resilient component of the system, not the most fragile.

The regulatory context is unambiguous. The EU Medical Device Regulation (EU MDR) requires that devices with clinical decision support functions enable healthcare professionals to exercise their clinical judgement. The FDA's guidance on Clinical Decision Support Software emphasises that the software must support, not replace, clinical judgement. HIPAA's security rule requires that electronic protected health information systems include mechanisms for emergency access. Each of these regulatory frameworks assumes that the clinician can actually exercise override authority in practice — an assumption that AG-525 operationalises through specific, testable requirements.

6. Implementation Guidance

Override usability must be treated as a safety-critical design constraint, not a feature to be added after the clinical AI system is functionally complete. The override pathway should be designed before the recommendation pathway, tested more rigorously than the recommendation pathway, and monitored more closely than the recommendation pathway — because the override pathway is the safety net that catches failures in everything else.

Recommended patterns:

Anti-patterns to avoid:

Industry Considerations

Acute Care and Emergency Medicine. Override latency is most critical in acute care. Emergency departments, intensive care units, and operating theatres require override mechanisms that function under extreme time pressure, high ambient noise, gloved hands, and shared workstations. Touch targets must be large enough for gloved operation (minimum 15 mm). Auditory alerts must be distinguishable from the ambient alarm environment. Override mechanisms should support multiple input modalities — touch, keyboard shortcut, and voice — to accommodate diverse clinical workflows.

Oncology and Complex Therapeutics. Confirmation fatigue is most acute in settings with high-volume, routine dose approvals. Oncology, dialysis, and anticoagulation management involve repeated dose confirmations where the vast majority are correct. The anti-fatigue mechanisms in 4.6 are particularly critical in these settings. Outlier detection thresholds should be calibrated to the specific therapeutic area's dose variability profile.

Rural, Remote, and Resource-Limited Settings. The local override requirement in 4.4 is essential for clinics with unreliable connectivity. Implementation must account for devices with limited processing power, intermittent power supply, and minimal local storage. The override mechanism should be the lightest-weight, most resilient component of the system.

Cross-Border Telemedicine. When the overriding clinician is in a different jurisdiction from the patient, the override must comply with the licensing requirements and clinical authority rules of both jurisdictions. Override audit logs must capture the jurisdictional context for regulatory purposes.

Maturity Model

Basic Implementation — The organisation provides a single-screen override mechanism with a maximum two-interaction completion path. Override latency is measured and confirmed to be within 5 seconds under normal operating conditions. Anomalous recommendations receive visual differentiation using at least two salience channels. Override events are logged with clinician identity, original recommendation, replacement action, and timestamp. The override mechanism has been tested under simulated network degradation. This level meets all mandatory (MUST) requirements.

Intermediate Implementation — All basic capabilities plus: a local-first override architecture ensures override availability during network outages. Override status propagates to all downstream interfaces within 3 seconds with positive confirmation. Anti-fatigue measures include statistical outlier detection calibrated to each therapeutic area and periodic confirmation pattern variation. Override analytics dashboards show override rates by clinician, department, and recommendation type. Role-aware override workflows adapt friction levels to clinical context (emergency vs. routine).

Advanced Implementation — All intermediate capabilities plus: predictive override prompting identifies likely override candidates and proactively reduces friction. Human-factors usability testing is conducted at least annually with representative clinicians across specialties. Override effectiveness metrics (time-to-override, override completion rate, post-override adverse event rate) are tracked in real time. The override mechanism has been validated through independent adversarial testing simulating emergency conditions, network failures, and alert fatigue scenarios. Cross-jurisdictional override compliance is automated for telemedicine deployments.

7. Evidence Requirements

Required artefacts:

Retention requirements:

Access requirements:

8. Test Specification

Test 8.1: Override Latency Under Normal Conditions

Test 8.2: Override Availability During Network Outage

Test 8.3: Anomalous Recommendation Salience Differentiation

Test 8.4: Confirmation Fatigue Resistance

Test 8.5: Override Status Propagation

Test 8.6: Override Audit Log Completeness

Test 8.7: Actuation Gate Verification

Conformance Scoring

9. Regulatory Mapping

Regulation | Provision | Relationship Type
EU AI Act | Article 14 (Human Oversight) | Direct requirement
EU AI Act | Article 9 (Risk Management System) | Supports compliance
EU MDR | Article 2(1) and Annex I, Chapter I, Section 3 (General Requirements) | Direct requirement
HIPAA | Security Rule § 164.312(a)(2)(ii) (Emergency Access Procedure) | Supports compliance
FDA 21 CFR Part 11 | § 11.10 (Controls for Closed Systems) | Supports compliance
NIST AI RMF | GOVERN 1.4, MANAGE 4.1, MAP 3.3 | Supports compliance
ISO 42001 | Clause 6.1 (Actions to Address Risks and Opportunities) | Supports compliance

EU AI Act — Article 14 (Human Oversight)

Article 14 requires that high-risk AI systems are designed and developed in such a way that they can be effectively overseen by natural persons during their period of use. The Article specifically requires that the system enables individuals to "correctly interpret the high-risk AI system's output" and to "decide, in any particular situation, not to use the high-risk AI system or otherwise disregard, override or reverse the output of the high-risk AI system." AG-525 operationalises this requirement by defining specific, testable criteria for what "effectively overseen" and "override" mean in clinical practice — not just the theoretical ability to override, but the practical ability to do so within clinically relevant time windows, under real-world conditions including network degradation and cognitive fatigue. Without AG-525's usability requirements, an organisation could claim compliance with Article 14 by providing an override mechanism that technically exists but is practically unusable.

EU MDR — Annex I, Chapter I, Section 3

The EU Medical Device Regulation requires that medical devices are designed and manufactured to ensure patient safety and that risks are reduced as far as possible. For software used as a clinical decision support tool, this includes ensuring that the healthcare professional can exercise clinical judgement over the software's output. The MDR's essential requirements for usability, as detailed in Annex I, require that devices minimise risks related to use error and that the interface is appropriate for the intended user and use environment. AG-525's requirements for single-screen override, anomaly salience, and local override fallback directly implement these essential requirements in the specific context of AI-driven clinical decision support.

HIPAA — Security Rule § 164.312(a)(2)(ii)

HIPAA's emergency access procedure requirement mandates that covered entities establish procedures for obtaining necessary electronic protected health information during an emergency. While primarily about data access, this provision reflects the broader principle that clinical safety functions must remain available during degraded conditions. AG-525's local override requirement (4.4) extends this principle to the override mechanism itself — the clinician's ability to override an AI recommendation must remain available even when network services are degraded or unavailable.

FDA 21 CFR Part 11 — § 11.10

Part 11 requires controls for closed systems, including the ability to generate accurate and complete copies of records, and the use of authority checks to ensure that only authorised individuals can perform certain actions. AG-525's override audit logging requirements (4.5) directly support Part 11 compliance by ensuring that every override event generates a complete, accurate record attributable to an identified, authorised clinician. The override mechanism itself serves as an authority check — only credentialed clinicians can override agent recommendations.

NIST AI RMF — GOVERN 1.4, MANAGE 4.1

GOVERN 1.4 addresses the processes for human oversight of AI systems, including mechanisms for human intervention. MANAGE 4.1 addresses the monitoring of deployed AI systems and the ability to respond to identified risks. AG-525 provides the implementation framework for these functions in clinical settings, ensuring that human oversight is not merely a governance principle but an operational capability with defined performance characteristics.

ISO 42001 — Clause 6.1

Clause 6.1 requires organisations to determine risks and opportunities related to their AI management system and to plan actions to address them. In clinical AI deployments, the risk that a clinician cannot effectively override an AI recommendation is a high-severity, high-likelihood risk that must be addressed through the controls defined in AG-525.

10. Failure Severity

Field | Value
Severity Rating | Critical
Blast Radius | Patient-level with potential for cascade — a single override failure can cause direct patient harm; systemic override usability failures affect every patient served by the clinical AI system

Consequence chain: The override mechanism is unavailable, too slow, or too friction-laden for the clinician to use effectively. The immediate consequence is that the clinician either cannot override an incorrect AI recommendation in time (latency failure, as in Scenario A), reflexively confirms a dangerous recommendation due to confirmation fatigue (habituation failure, as in Scenario B), or is unable to register an override due to infrastructure failure (availability failure, as in Scenario C). In each case, an incorrect or inappropriate recommendation proceeds to actuation — the patient receives the wrong medication, the wrong dose, or the wrong device parameter. The clinical consequence ranges from adverse drug reaction to organ damage to death, depending on the severity of the uncorrected recommendation.

The organisational consequence includes mandatory adverse event reporting to the national regulator (MHRA, FDA, or equivalent), potential medical device recall or use restriction, medical malpractice litigation (typical clinical AI litigation exposure: £200,000–£2,000,000 per event), and regulatory enforcement action.

The systemic consequence is erosion of clinician trust in AI decision support — a single high-profile override failure can cause an entire department or institution to abandon AI-assisted clinical workflows, setting back clinical AI adoption by years. The reputational damage extends beyond the deploying institution to the clinical AI field as a whole. The failure is particularly insidious because it appears as a human-factors problem rather than a technical problem — the AI "worked correctly" (it generated a recommendation), but the human oversight mechanism that was supposed to catch errors failed silently. This makes root-cause analysis politically complex and liability allocation legally contentious.

Cross-references:
AG-019 (Human Escalation & Override Triggers) defines when escalation and override should occur; AG-525 defines how the override mechanism must be designed for clinical usability.
AG-440 (Oversight Ergonomic Design Governance) provides general oversight ergonomic principles; AG-525 specialises them for the clinical domain.
AG-520 (Patient Consent and Override Governance) addresses patient-initiated overrides; AG-525 addresses clinician-initiated overrides.
AG-521 (Diagnostic Confidence Threshold Governance) determines when recommendations require heightened scrutiny; AG-525 ensures the override mechanism supports that scrutiny.
AG-522 (Medication Interaction Actuation Governance) governs medication-specific actuation; AG-525 ensures clinicians can override medication actions.
AG-526 (Device and Regimen Coordination Governance) governs multi-device coordination; AG-525 ensures clinicians can override coordinated actions.
AG-444 (Override Rationale Capture Governance) governs the capture of override justifications; AG-525 ensures that rationale capture does not impede override latency.
AG-445 (Fatigue Monitoring Governance) monitors clinician cognitive fatigue; AG-525 implements the anti-fatigue measures that respond to fatigue signals.

Cite this protocol
AgentGoverning. (2026). AG-525: Physician Override Usability Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-525