Degraded-Mode and Manual Fallback Governance requires that every AI agent operating in a safety-critical or critical infrastructure context have formally defined degraded operating modes and manual fallback procedures that can sustain essential functions when the agent is partially impaired, fully unavailable, or has been deliberately removed from the control loop. This dimension addresses the operational continuity gap between full autonomous operation and the safe state defined under AG-109. A safe state stops the hazard; a degraded mode continues essential operations at reduced capability with increased human involvement. Without this governance, organisations face a binary choice between full AI autonomy and full shutdown — and in critical infrastructure, full shutdown may itself be hazardous or unacceptable (e.g., shutting down a hospital's HVAC in summer, or disconnecting a generation source from a constrained grid).
Scenario A — Hospital HVAC Agent Fails With No Degraded Mode: A hospital deploys an AI agent to optimise heating, ventilation, and air conditioning across 14 operating theatres, maintaining precise temperature (20-22°C), humidity (40-60% RH), and positive pressure differentials to prevent airborne contamination. The agent's machine learning model encounters an out-of-distribution input (an unusual combination of external temperature and building occupancy) and begins producing erroneous setpoints. The governance system detects the anomaly and triggers AG-109 safe-state transition. The safe state is defined as "all HVAC outputs to fixed conservative setpoints." However, the fixed setpoints are a single configuration designed for average conditions — they do not account for the 38°C external temperature on this day. Within 2 hours, operating theatre temperatures exceed 26°C. Surgical schedules are cancelled for 47 patients. One theatre is maintained by a technician who manually adjusts dampers and chiller setpoints, but the remaining 13 theatres have no documented manual operating procedure.
What went wrong: The organisation implemented AG-109 (safe state) but not AG-110 (degraded mode). The safe state was a single fixed configuration that could not adapt to conditions. No degraded-mode profile existed (e.g., "agent provides advisory setpoints only; building technician confirms before execution"). No manual fallback procedure existed (documented steps for manual HVAC operation by qualified facilities staff). Consequence: 47 cancelled surgeries, estimated revenue loss of £235,000, patient safety risk from delayed procedures, CQC investigation into environmental control adequacy.
Scenario B — Automated Port Crane Agent Loses Model Integrity: A container port operates 8 automated gantry cranes controlled by AI agents for container placement optimisation. A software update introduces a regression in the path-planning model for cranes 3-8. Cranes 1-2, which received the update earlier and were validated, operate normally. The governance system detects path-planning anomalies on cranes 3-8 and removes the agents from the control loop. The cranes stop — but the port has no documented procedure for transitioning to manual crane operation. The crane operators were retrained on automated systems 3 years ago and have not maintained manual proficiency. Manual operation requires different console configurations that are not pre-staged. Cranes 3-8 are non-operational for 14 hours while manual procedures are reconstructed and operators re-familiarised.
What went wrong: The fallback from AI to manual control was assumed but not governed. Operator manual proficiency was not maintained. Manual console configurations were not pre-staged. No degraded mode existed (e.g., "agent generates plans, operator approves each move"). Consequence: 14 hours of partial port shutdown, 3 container ships diverted to alternative ports, contractual penalties of £1.7 million, supply chain disruption affecting downstream logistics.
Scenario C — Water Network Agent Degraded Mode Conflicts With Manual Operations: A water distribution network uses an AI agent to manage pressure across 23 pressure-reducing valves (PRVs) to minimise leakage while maintaining minimum service pressure. The agent's optimisation server fails, and the system enters a degraded mode: all PRVs revert to their pre-agent fixed setpoints from 2019. Simultaneously, the network operator dispatches technicians to manually adjust valves based on current demand patterns (which have changed significantly since 2019 due to new housing developments). The degraded mode and the manual adjustments conflict — the system reverts a technician's manual adjustment 12 seconds later because the degraded-mode controller is still enforcing the 2019 setpoints. Three district metering areas lose pressure below minimum service levels, affecting 8,200 properties.
What went wrong: The degraded mode and manual fallback were not coordinated. The system re-asserted degraded-mode setpoints over manual adjustments because the degraded-mode controller had authority priority. No handoff protocol existed to transfer authority from automated degraded mode to manual operation. Consequence: 8,200 properties with inadequate water pressure for 6 hours, potential public health risk (low pressure enables ingress), Ofwat investigation, estimated remediation and compensation cost of £340,000.
Scope: This dimension applies to all AI agents within the scope of AG-109 (Safe-State Transition Governance) — those operating in contexts where agent failure could result in physical harm, infrastructure damage, environmental harm, or disruption to essential services. Additionally, this dimension applies to any AI agent whose removal from operation would cause an unacceptable disruption to essential services, even where physical safety is not directly at risk. Examples include agents managing telecommunications routing, financial market infrastructure, emergency dispatch systems, and logistics coordination for essential supplies. The test is: if this agent becomes unavailable, can the essential function it supports continue at an acceptable level of service through alternative means? If the answer is "no" or "we don't know," the agent is within scope.
4.1. A conforming system MUST define, for every in-scope agent, at least one degraded operating mode that sustains essential functions at reduced capability without the agent's full autonomous operation, specifying which functions are maintained, which are suspended, and what human involvement is required.
4.2. A conforming system MUST define, for every in-scope agent, a manual fallback procedure that enables qualified human operators to perform the agent's essential functions without any AI assistance, including the specific steps, tools, console configurations, and data sources required.
4.3. A conforming system MUST implement a defined handoff protocol for transferring control authority between modes: from autonomous to degraded, from degraded to manual, and from manual back to autonomous. Each handoff MUST be an explicit, logged action — not an automatic transition.
4.4. A conforming system MUST ensure that degraded-mode configurations and manual fallback procedures are tested at defined intervals (at minimum annually) under realistic operational conditions, including with the personnel who would execute them in a real event.
4.5. A conforming system MUST ensure that manual fallback procedures can be initiated within a defined time bound appropriate to the operational context (e.g., 15 minutes for process control, 30 minutes for network operations, as determined by operational continuity analysis).
4.6. A conforming system MUST maintain operator proficiency for manual fallback through documented training and periodic exercises, with proficiency records retained as evidence.
4.7. A conforming system SHOULD define multiple degraded modes of increasing human involvement (e.g., "agent advises, operator confirms" before "agent offline, operator controls directly") to provide graduated fallback options.
4.8. A conforming system SHOULD pre-stage all tools, configurations, and access credentials required for manual fallback so that no setup or provisioning is required during a fallback event.
4.9. A conforming system SHOULD implement conflict resolution logic that prevents degraded-mode automation from overriding manual operator inputs, with clear authority priority rules documented and enforced.
4.10. A conforming system MAY implement automated degraded-mode selection based on the nature of the agent failure (e.g., model degradation triggers "advisory only" mode, while communication loss triggers safe-state transition per AG-109).
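The handoff protocol required by 4.3 can be sketched as a small state machine in which every mode transition is an explicit, logged action and any unpermitted transition is rejected. This is an illustrative sketch, not a normative design: the mode names, the `actor`/`reason` fields, and the permitted-transition set (taken from the three handoffs 4.3 lists) are assumptions for the example.

```python
from datetime import datetime, timezone
from enum import Enum


class Mode(Enum):
    AUTONOMOUS = "autonomous"
    DEGRADED = "degraded"    # e.g. agent advises, operator confirms
    MANUAL = "manual"        # agent offline, operator controls directly


# Only the explicit handoffs named in 4.3 are permitted; anything else
# (e.g. jumping straight from autonomous to manual) must be rejected.
ALLOWED_HANDOFFS = {
    (Mode.AUTONOMOUS, Mode.DEGRADED),
    (Mode.DEGRADED, Mode.MANUAL),
    (Mode.MANUAL, Mode.AUTONOMOUS),
}


class HandoffController:
    def __init__(self) -> None:
        self.mode = Mode.AUTONOMOUS
        self.log = []  # append-only record: every handoff is logged (4.3)

    def handoff(self, target: Mode, actor: str, reason: str) -> None:
        """Perform an explicit, logged mode transition; never automatic."""
        if (self.mode, target) not in ALLOWED_HANDOFFS:
            raise ValueError(
                f"handoff {self.mode.value} -> {target.value} not permitted")
        self.log.append({
            "ts": datetime.now(timezone.utc).isoformat(),
            "from": self.mode.value,
            "to": target.value,
            "actor": actor,
            "reason": reason,
        })
        self.mode = target
```

A caller would invoke, for example, `ctl.handoff(Mode.DEGRADED, actor="shift-supervisor", reason="path-planning anomaly")`; because the transition table is closed, an authority gap caused by a silent or implicit transition cannot occur in this sketch.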
Degraded-Mode and Manual Fallback Governance addresses a gap that safe-state transitions alone cannot fill. In many critical infrastructure contexts, the safe state — while preventing immediate hazard — creates its own operational problems. A power generation plant at safe state produces no electricity. A water treatment plant at safe state may treat water conservatively but cannot optimise for varying demand. A hospital HVAC system at fixed setpoints cannot respond to changing conditions. The organisation needs the ability to continue essential operations, at reduced performance, while the agent is being repaired, validated, or replaced.
The degraded-mode concept recognises that AI agent autonomy exists on a spectrum, not as a binary. Between "fully autonomous" and "fully manual" there are valuable intermediate positions: the agent can provide recommendations that a human approves before execution; the agent can control routine parameters while a human manages exceptions; the agent can monitor and alert while a human controls. These intermediate positions allow organisations to maintain operations while reducing the risk associated with a compromised or degraded agent.
The manual fallback requirement addresses an uncomfortable reality: many organisations deploying AI agents in critical infrastructure have allowed the human expertise required for manual operation to atrophy. Operators trained on manual systems retire or transfer. New operators are trained only on the AI-assisted workflow. Documentation for manual procedures is not maintained. Console configurations for manual operation are not preserved. When the agent fails, the organisation discovers that it cannot operate its own critical infrastructure manually. This is not a hypothetical risk — it mirrors documented incidents in aviation (automation dependency), maritime (GPS dependency), and manufacturing (automated-line manual restart failures).
The handoff protocol requirement prevents a specific class of failures where authority is ambiguous during transitions. If both the degraded-mode automation and the human operator believe they have authority, their actions can conflict. If neither believes they have authority, critical functions go unmanaged. The handoff must be explicit, logged, and unambiguous — at any point in time, exactly one entity (automated mode or human operator) has authority over each function.
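The "exactly one entity has authority over each function" invariant can be made machine-checkable with a per-function authority table, in which a transfer fails loudly unless the releasing party actually holds authority, and only the current holder may actuate. A minimal sketch (the function names and entity labels are illustrative assumptions):

```python
class AuthorityRegistry:
    """Tracks which single entity holds authority over each controlled function."""

    def __init__(self, functions):
        # Every function starts under automated control. A value is always
        # present, so an authority gap is impossible by construction.
        self._holder = {fn: "automated" for fn in functions}

    def transfer(self, fn: str, from_entity: str, to_entity: str) -> None:
        """Explicit transfer: rejected if the caller does not hold authority."""
        if self._holder[fn] != from_entity:
            raise PermissionError(
                f"{from_entity} cannot release {fn}: held by {self._holder[fn]}")
        self._holder[fn] = to_entity

    def may_actuate(self, fn: str, entity: str) -> bool:
        """Only the current authority holder may issue commands for a function."""
        return self._holder[fn] == entity


reg = AuthorityRegistry(["prv_setpoints", "pump_schedule"])
reg.transfer("prv_setpoints", "automated", "operator")
# Degraded-mode automation can no longer re-assert stale setpoints over the
# technician's manual adjustment (the Scenario C failure):
assert not reg.may_actuate("prv_setpoints", "automated")
assert reg.may_actuate("prv_setpoints", "operator")
```

The design choice worth noting is that authority is tracked per function, not per system: in Scenario C, authority over the PRV setpoints could have been transferred to the technicians while other functions remained under degraded-mode automation.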
This dimension builds on AG-109 (which defines what to do when governance fails — reach a safe state) by defining what happens next — how to continue essential operations while the agent is unavailable. It also builds on AG-008 (Governance Continuity Under Failure) by providing the specific operational continuity mechanisms for safety-critical and critical infrastructure contexts.
AG-110 establishes the degraded-mode profile and the manual fallback procedure as the operational continuity artefacts for AI agents in critical infrastructure. A degraded-mode profile specifies: which agent functions continue (at what level), which are suspended, what human role is introduced, and what authority boundaries apply in that mode. A manual fallback procedure specifies: the step-by-step process for human operation, the tools and console configurations required, the data sources available, the decision criteria for key judgements, and the communication protocols for coordination.
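A degraded-mode profile of the shape described above could be captured as a structured, version-controlled artefact rather than free text. The sketch below uses field names and example values that are illustrative assumptions, not a normative schema; the `advisory-only` example mirrors the hospital HVAC degraded mode from Scenario A.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DegradedModeProfile:
    """Structured form of the degraded-mode profile artefact."""
    mode_name: str
    functions_maintained: dict   # function -> service level maintained
    functions_suspended: list    # functions not available in this mode
    human_role: str              # human involvement introduced in this mode
    authority_boundaries: dict   # function -> "human" | "automated"
    entry_criteria: str
    exit_criteria: str


# Hypothetical profile for the Scenario A hospital HVAC agent.
advisory_only = DegradedModeProfile(
    mode_name="advisory-only",
    functions_maintained={"setpoint_recommendation": "full"},
    functions_suspended=["autonomous_actuation"],
    human_role="building technician confirms each setpoint before execution",
    authority_boundaries={"actuation": "human", "monitoring": "automated"},
    entry_criteria="model anomaly detected or out-of-distribution input flagged",
    exit_criteria="root cause analysis complete and revalidation authorised",
)
```

Keeping the profile as data rather than prose also lets the conformance tests in Section 8 assert against it directly (e.g. that every in-scope agent has at least one profile, per 4.1).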
Recommended patterns:
Anti-patterns to avoid:
Power Systems. Grid operators must comply with Grid Code requirements for maintaining generation and demand balance. Degraded modes must ensure that the agent's contribution to grid stability (frequency response, voltage regulation) continues at an acceptable level or is explicitly transferred to manual or alternative automated control. The National Grid ESO (UK) or equivalent system operator must be notified when an AI agent managing grid-connected assets enters degraded mode, as this may affect system-wide stability assessments.
Water and Wastewater. The Drinking Water Inspectorate (DWI) requires that treatment processes maintain minimum standards at all times. Degraded modes must ensure regulatory water quality standards are met, even if optimisation efficiency is reduced. Manual fallback must include water quality monitoring procedures independent of the agent's sensor network, as sensor reliability may be compromised by the same failure that affected the agent.
Healthcare. Clinical environments require that degraded modes and manual fallbacks preserve patient safety above operational efficiency. Manual fallback procedures must be developed in collaboration with clinical staff and approved by the responsible clinician. Operator proficiency requirements must align with clinical competency frameworks. CQC (UK) or equivalent regulators require evidence that environmental and equipment controls are maintained during system failures.
Transportation and Logistics. Degraded modes for autonomous vehicle fleets or logistics coordination agents must account for vehicles already in transit. The degraded mode must include a plan for safely managing in-progress operations, not just preventing new ones. Manual fallback for fleet management may require temporary increase in human coordination staff, which must be planned and available at short notice.
Basic Implementation — The organisation has documented at least one degraded mode and a manual fallback procedure for each safety-critical agent deployment. The degraded mode defines which functions continue and which are suspended. The manual fallback procedure exists as a document. Operator training on manual procedures has been conducted at least once. Handoff between modes is manual and procedural (not system-enforced). This level provides a minimum operational continuity capability but is vulnerable to procedure errors, operator proficiency decay, and authority conflicts during transitions.
Intermediate Implementation — Multiple graduated degraded modes are defined with explicit entry and exit criteria. The authority handoff is system-enforced through a state machine that prevents authority gaps or conflicts. Manual fallback procedures are tested annually under realistic conditions with the actual operators. Pre-staged manual operation kits are maintained and inspected quarterly. Operator proficiency is maintained through regular exercises (minimum semi-annually). Conflict resolution priority rules are implemented and prevent degraded-mode automation from overriding manual inputs. This level provides reliable operational continuity with controlled transitions and maintained human capability.
Advanced Implementation — All intermediate capabilities plus: degraded-mode selection is automated based on failure classification (the system selects the appropriate degraded mode based on the nature of the agent failure). Return to full autonomy requires documented root cause analysis and explicit authorisation. Manual fallback procedures are validated through unannounced drills (not just scheduled exercises). Operator proficiency is quantitatively assessed against defined competency thresholds. The organisation can demonstrate to regulators that it can sustain essential operations at defined service levels for a minimum of 72 hours without any AI agent assistance, using manual procedures alone.
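The automated degraded-mode selection described here (and permitted by 4.10) amounts to a mapping from failure classification to target mode. A hedged sketch, assuming three illustrative failure classes and the policy suggested in 4.10 (model degradation keeps the agent advising; loss of communication triggers safe-state transition per AG-109):

```python
from enum import Enum


class FailureClass(Enum):
    MODEL_DEGRADATION = "model_degradation"        # e.g. drift, OOD inputs
    COMMS_LOSS = "comms_loss"                      # agent unreachable
    INTEGRITY_COMPROMISE = "integrity_compromise"  # suspected tampering


# Illustrative policy table, reviewable and auditable as a single artefact.
SELECTION_POLICY = {
    FailureClass.MODEL_DEGRADATION: "advisory-only",
    FailureClass.COMMS_LOSS: "safe-state-then-manual",
    FailureClass.INTEGRITY_COMPROMISE: "safe-state-then-manual",
}


def select_degraded_mode(failure: FailureClass) -> str:
    # Any unclassified failure falls back to the most conservative option.
    return SELECTION_POLICY.get(failure, "safe-state-then-manual")
```

The virtue of expressing the policy as a table is that the mapping itself becomes an artefact that can be reviewed against the hazard analysis (AG-111) and exercised in the Section 8 tests.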
Required artefacts:
Retention requirements:
Access requirements:
Testing AG-110 compliance requires validation that degraded modes and manual fallbacks are functional, timely, and maintained. The following tests cover the mandatory requirements.
Test 8.1: Degraded-Mode Functional Continuity
Test 8.2: Manual Fallback Execution
Test 8.3: Handoff Protocol Integrity
Test 8.4: Annual Realistic Fallback Test
Test 8.5: Operator Proficiency Verification
Test 8.6: Conflict Resolution Validation
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 14 (Human Oversight) | Direct requirement |
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| IEC 61508 | Clause 7.4 (Operation and Maintenance) | Supports compliance |
| IEC 61511 | Clause 16 (SIS Operation and Maintenance) | Direct requirement |
| UK HSE | Management of Change / COMAH Regulations | Supports compliance |
| NIST AI RMF | MANAGE 2.3 (Fallback and Contingency) | Direct requirement |
| ISO 42001 | Clause 8.4 (Operation of AI System) | Supports compliance |
| NIS2 Directive | Article 21 (Cybersecurity Risk Management Measures) | Supports compliance |
| NERC CIP | CIP-008 (Incident Response) | Supports compliance |
Article 14 requires that high-risk AI systems be designed to allow effective human oversight, including the ability for humans to intervene in the system's operation or to stop it. AG-110 directly implements this requirement by mandating that degraded modes with human involvement and fully manual fallback procedures exist, are tested, and are maintained. The regulation's requirement that oversight measures enable the human "to correctly interpret the high-risk AI system's output" maps to the degraded-mode requirement that human operators have the information and tools needed to make decisions the agent would normally make. The requirement for "the ability to decide not to use the high-risk AI system" maps to the manual fallback requirement.
The risk management system must address residual risks that remain after mitigation. Degraded-mode and manual fallback governance mitigates the residual risk that the AI agent itself may fail — a risk that no amount of agent reliability improvement can eliminate entirely.
Clause 16 requires that safety instrumented systems have defined operating procedures for all modes of operation, including degraded operation. For AI agents integrated with or replacing traditional SIS functions, this directly requires degraded-mode definitions and manual fallback procedures. The clause also requires periodic proof testing of safety functions, mapping to the annual testing requirement.
MANAGE 2.3 addresses mechanisms for fallback and contingency when AI systems do not perform as intended. AG-110 provides the detailed implementation framework for these fallback mechanisms in safety-critical and critical infrastructure contexts.
The NIS2 Directive requires essential and important entities to adopt cybersecurity risk management measures, including business continuity and crisis management. For entities using AI agents in essential service delivery, AG-110 implements the operational continuity component — ensuring that essential services continue when the AI agent is compromised, degraded, or unavailable due to cyber incident or other failure.
| Field | Value |
|---|---|
| Severity Rating | Critical |
| Blast Radius | Essential service disruption — potentially affecting thousands to millions of service users; secondary safety impacts from service loss |
Consequence chain: Without degraded-mode and manual fallback governance, an AI agent failure in critical infrastructure creates a binary outcome: either the agent continues operating in a potentially hazardous degraded state (because no controlled degraded mode exists), or the system transitions to safe state and the essential service stops entirely. In the first case, the consequences are those of AG-109 failure — uncontrolled operation leading to physical harm.

In the second case, the consequences are service disruption at scale: loss of water treatment for a municipality, loss of power management for a grid region, loss of climate control for a hospital, loss of traffic management for a city. These service disruptions have their own safety consequences — a hospital without adequate climate control, a city without traffic management, a community without treated water. The duration of disruption depends entirely on how quickly human operators can restore manual operation, which without AG-110 governance is unpredictable and potentially measured in hours to days.

The financial consequences include regulatory penalties for service disruption (e.g., Ofwat, Ofgem, CQC enforcement), contractual penalties, emergency response costs, and reputational damage. For publicly owned or operated infrastructure, there are additional political and accountability consequences.
Cross-references: AG-109 (Safe-State Transition Governance) defines the safe state that precedes degraded-mode operation in emergency scenarios. AG-001 (Operational Boundary Enforcement) provides the mandate framework that defines what the agent is permitted to do in each mode. AG-008 (Governance Continuity Under Failure) establishes the general continuity principle. AG-038 (Human Control Responsiveness) governs the human override mechanisms that degraded modes depend on. AG-111 (Hazard Analysis Governance) provides the analytical basis for determining which functions are essential and which degraded modes are acceptable. AG-112 (Sector Safety Constraint Governance) defines sector-specific constraints that must be maintained in all operating modes.