AG-836

Physical-Action Reversibility and Fail-Safe-to-Stop

Embodied AI, Humanoids & Robot Fleets · AGS v2.1 · 2026-06-06
EU AI Act NIST AI RMF ISO 42001

AGS Embodied AI (Group L) | Embodied AI, Humanoids & Robot Fleets | Version 3.0

1. Definition

Physical-Action Reversibility and Fail-Safe-to-Stop governs the requirement that an embodied agent prefers reversible physical actions, obtains authorisation for consequential or irreversible actuator actions before executing them, and fails safe (to a controlled stop or safe state) on malfunction, low confidence, proximity breach, or loss of control.

Physical actions can be irreversible and immediately harmful; this dimension provides the pre-action gate and the fail-safe stop that bound the consequences of an embodied agent's mistakes.
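The pre-action gate described above can be illustrated with a minimal sketch. It assumes a hypothetical `ProposedAction` that already carries a reversibility label and a hypothetical `SystemStatus` snapshot; the `MIN_CONFIDENCE` threshold and all names are illustrative assumptions, not part of the protocol.

```python
from dataclasses import dataclass
from enum import Enum


class Reversibility(Enum):
    REVERSIBLE = "reversible"
    IRREVERSIBLE = "irreversible"


class Decision(Enum):
    EXECUTE = "execute"
    HOLD_FOR_AUTHORISATION = "hold_for_authorisation"
    SAFE_STOP = "safe_stop"


@dataclass(frozen=True)
class ProposedAction:
    name: str
    reversibility: Reversibility


@dataclass(frozen=True)
class SystemStatus:
    fault: bool = False
    confidence: float = 1.0
    proximity_breach: bool = False
    control_link_ok: bool = True


# Hypothetical threshold; a real deployment would derive it from validation.
MIN_CONFIDENCE = 0.9


def gate(action: ProposedAction, status: SystemStatus,
         authorised: bool = False) -> Decision:
    """Decide whether a proposed physical action may proceed."""
    # Fail-safe conditions take priority over any authorisation.
    unsafe = (
        status.fault
        or status.proximity_breach
        or not status.control_link_ok
        or status.confidence < MIN_CONFIDENCE
    )
    if unsafe:
        return Decision.SAFE_STOP
    # Irreversible actions wait until authorisation has been granted.
    if action.reversibility is Reversibility.IRREVERSIBLE and not authorised:
        return Decision.HOLD_FOR_AUTHORISATION
    return Decision.EXECUTE
```

In this sketch the fail-safe check is evaluated first, so an authorisation granted earlier cannot override a fault, a proximity breach, or a lost control link.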

2. Scope

In scope: pre-action authorisation for consequential/irreversible physical actions; preference for reversible actions; safety-rated emergency stop and safe-state behaviour on fault/uncertainty/proximity/control-loss; geofenced exclusion.
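As one possible illustration of geofenced exclusion, the sketch below checks planned waypoints against an operating envelope and a list of exclusion zones. It assumes 2D axis-aligned rectangles and checks waypoints only, not the segments between them; `Rect`, `waypoint_allowed`, and `first_violation` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Rect:
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def waypoint_allowed(x: float, y: float, envelope: Rect,
                     exclusions: Sequence[Rect]) -> bool:
    """A waypoint is allowed if it lies inside the operating envelope
    and outside every exclusion zone."""
    if not envelope.contains(x, y):
        return False
    return not any(zone.contains(x, y) for zone in exclusions)


def first_violation(path: Sequence[Tuple[float, float]], envelope: Rect,
                    exclusions: Sequence[Rect]) -> Optional[int]:
    """Return the index of the first disallowed waypoint, or None."""
    for index, (x, y) in enumerate(path):
        if not waypoint_allowed(x, y, envelope, exclusions):
            return index
    return None
```

A planner could call `first_violation` before motion begins and refuse the whole path if it returns an index, rather than discovering the breach mid-motion.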

Out of scope: force/speed limiting and safety classes (AG-835), and validation (AG-837). This dimension governs *reversibility, pre-action authorisation, and fail-safe-to-stop*.

3. Why This Matters

An embodied agent that takes an irreversible physical action on a wrong inference — dropping a load on a person, making an incorrect surgical motion, driving into a hazard — can cause immediate, unrecoverable harm. Preferring reversible actions, gating irreversible ones, and defaulting to a safe stop on any fault or uncertainty bounds the worst case: the agent's errors halt safely rather than completing into harm.
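The preference for reversible actions can be sketched as a selection rule over candidate actions. This is one possible interpretation, assuming each candidate carries a hypothetical `task_score` and a `reversible` flag, and that any candidate meeting `min_score` is adequate for the task; none of these names come from the protocol.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    task_score: float  # hypothetical measure of how well the action serves the task
    reversible: bool


def select_action(candidates: Sequence[Candidate],
                  min_score: float = 0.5) -> Optional[Candidate]:
    """Pick the best adequate candidate, preferring reversible ones.

    An irreversible candidate is chosen only when no reversible
    candidate reaches min_score. Returns None if nothing is adequate.
    """
    adequate = [c for c in candidates if c.task_score >= min_score]
    if not adequate:
        return None
    # Tuples compare element by element: reversible (True) outranks
    # irreversible (False); task_score breaks ties within each group.
    return max(adequate, key=lambda c: (c.reversible, c.task_score))
```

With this rule, a reversible action with a lower but adequate score is chosen over a higher-scoring irreversible one; an irreversible selection would still pass through pre-action authorisation.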

4. Requirements

5. Maturity Model

6. Test Criteria

Test 6.1: Fail-Safe on Fault

Test 6.2: Irreversible-Action Gate

Test 6.3: Policy-Independent Stop

7. Scoring

| Score | Criteria |
|---|---|
| 0 | No fail-safe stop; irreversible physical actions execute without authorisation |
| 1 | Emergency stop exists but no pre-action gating or fault-triggered safe state |
| 2 | Pre-action authorisation, fail-safe on fault/uncertainty/proximity, geofencing, policy-independent stop |
| 3 | Reversibility-preferring selection, tested non-disable-able stop, logged activations, checked recovery |
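One way to read the scoring levels is as cumulative: each score requires everything the previous one does. The sketch below encodes that reading; the control identifiers are hypothetical labels for the criteria listed above, and treating partial level-2 coverage as score 1 is an assumption rather than something the protocol states.

```python
# Hypothetical identifiers for the controls named in the scoring criteria.
LEVEL_2 = frozenset({
    "pre_action_authorisation",
    "fail_safe_on_fault",
    "geofencing",
    "policy_independent_stop",
})
LEVEL_3 = frozenset({
    "reversibility_preferring_selection",
    "tested_non_disableable_stop",
    "logged_activations",
    "checked_recovery",
})


def score(controls) -> int:
    """Map a set of implemented controls to a 0-3 score (cumulative reading)."""
    implemented = set(controls)
    if "emergency_stop" not in implemented:
        return 0
    if not LEVEL_2 <= implemented:
        return 1
    if not LEVEL_3 <= implemented:
        return 2
    return 3
```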

8. Failure Scenarios

Scenario A — Completed Mistake: A robot arm, acting on a misperception, continues an irreversible motion that injures a worker. A pre-action safety gate and fail-safe-to-stop on the perception anomaly would have halted it before contact.
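The fail-safe-to-stop behaviour in Scenario A can be sketched as a motion loop that re-checks perception before every increment and stops at the first anomaly. The callables `perception_ok`, `move_to`, and `stop` are hypothetical stand-ins for a real perception monitor and motion controller.

```python
from typing import Callable, Sequence, Tuple, TypeVar

Waypoint = TypeVar("Waypoint")


def run_motion(
    waypoints: Sequence[Waypoint],
    perception_ok: Callable[[], bool],
    move_to: Callable[[Waypoint], None],
    stop: Callable[[], None],
) -> Tuple[int, bool]:
    """Execute a motion in small increments.

    Perception is checked before each increment. On the first anomaly
    the stop callback is invoked and no further waypoint is commanded.
    Returns (waypoints completed, whether the motion finished).
    """
    completed = 0
    for waypoint in waypoints:
        if not perception_ok():
            stop()
            return completed, False
        move_to(waypoint)
        completed += 1
    return completed, True
```

Splitting a motion into short increments bounds how far the arm can travel on stale or anomalous perception; the increment size is a design choice this sketch leaves open.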

Scenario B — Stop Defeated by Software: The emergency stop is implemented in the same software stack as the AI policy; when that stack hangs, the stop is unavailable. A safety-rated, independent stop path would have remained effective.
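The independence argument in Scenario B is often realised with a heartbeat watchdog: the policy stack must report in regularly, and a separate monitor trips the stop when it goes quiet. The sketch below shows only the timing logic, with time passed in explicitly so it can be tested; the `Watchdog` class is hypothetical, and a Python object in the same process is not itself an independent or safety-rated stop path.

```python
class Watchdog:
    """Latching heartbeat monitor (logic only, illustrative)."""

    def __init__(self, timeout_s: float, now: float) -> None:
        self._timeout_s = timeout_s
        self._last_kick = now
        self.tripped = False

    def kick(self, now: float) -> None:
        """Heartbeat from the policy stack."""
        # Once tripped, late heartbeats are ignored: the stop latches.
        if not self.tripped:
            self._last_kick = now

    def poll(self, now: float) -> bool:
        """Called by the independent monitor; True means stop."""
        if now - self._last_kick > self._timeout_s:
            self.tripped = True
        return self.tripped
```

If the policy stack hangs, it stops calling `kick`, and the next `poll` after the timeout trips the stop without needing any cooperation from the hung software. Because the trip latches, a stack that recovers on its own cannot silently clear it.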

Scenario C — Auto-Resume Into Fault: After a safe stop, the agent automatically resumes into the same unresolved fault and repeats the unsafe action. Checked recovery requiring confirmation would have prevented the loop.
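Checked recovery, as in Scenario C, can be sketched as a small state machine in which resuming requires both a cleared fault and a fresh operator confirmation. The `RecoveryController` class and its method names are hypothetical; how a fault is judged cleared is left outside the sketch.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    SAFE_STOPPED = "safe_stopped"


class RecoveryController:
    def __init__(self) -> None:
        self.state = State.RUNNING
        self._fault_active = False
        self._operator_confirmed = False

    def on_fault(self) -> None:
        """Enter the safe state and discard any earlier confirmation."""
        self.state = State.SAFE_STOPPED
        self._fault_active = True
        self._operator_confirmed = False

    def clear_fault(self) -> None:
        self._fault_active = False

    def confirm(self) -> bool:
        """Operator confirmation; accepted only after the fault is cleared."""
        if not self._fault_active:
            self._operator_confirmed = True
        return self._operator_confirmed

    def try_resume(self) -> bool:
        """Resume only from a safe stop with the fault cleared and confirmed."""
        if (self.state is State.SAFE_STOPPED
                and not self._fault_active
                and self._operator_confirmed):
            self.state = State.RUNNING
            self._operator_confirmed = False  # each resume needs a new confirmation
        return self.state is State.RUNNING
```

Because `on_fault` resets the confirmation and `confirm` is rejected while the fault is active, the controller cannot loop back into the same unresolved fault on its own.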

9. Regulatory Mapping

| Requirement | EU AI Act | NIST AI RMF | ISO 42001 |
|---|---|---|---|
| R1: Pre-action authorisation for irreversible acts | Art. 14: Human oversight | MAP 3.5: Human oversight | Clause 8.1: Operational control |
| R2: Prefer reversible actions | Art. 9: Risk management | MAP 5.1: Impact magnitude | Clause 6.1: Actions to address risk |
| R3: Safety-rated independent e-stop | Art. 15: Fail-safe | MANAGE 2.4: Deactivation | Clause 8.1: Operational control |
| R4: Fail-safe on fault/uncertainty | Art. 15: Robustness, fail-safe | MANAGE 2.4: Fail-safe | Clause 8.1: Operational control |
| R5: Geofenced operating envelope | Art. 9: Risk management | MAP 3.3: Application scope | Clause 8.1: Operational control |
| R6: Non-disable-able, tested stop | Art. 15: Robustness | MEASURE 2.6: Safety evaluation | Clause 8.3: Verification |
| R7: Logged activations/authorisations | Art. 12: Record-keeping | MEASURE 2.4: Production monitoring | Clause 9.1: Monitoring and measurement |
| R8: Checked recovery | Art. 14: Human oversight | MANAGE 2.4: Deactivation | Clause 8.1: Operational control |

> Standards note: align fail-safe/e-stop design to ISO 10218-2:2025, ISO 3691-4, and IEC 61508 (functional safety / safe-stop performance levels); pre-action authorisation for consequential physical actions reflects emerging embodied-agent safety practice.

EU AI Act — Article 14 and Article 15

Article 14 (human oversight including the ability to stop) and Article 15 (robustness and fail-safe) require that an embodied agent can be brought to a safe state and that irreversible actions remain under control — the core of this dimension.

NIST AI RMF — MANAGE 2.4, MAP 3.5

MANAGE 2.4 (deactivation/safe-state) and MAP 3.5 (human oversight) require a reliable stop and human authorisation for consequential physical action.

ISO 42001 — Clause 8.1, A.6

Clause 8.1 (operational control) and Annex A.6 (lifecycle) require fail-safe operational controls for physical AI systems.

Cite this protocol
AgentGoverning. (2026). AG-836: Physical-Action Reversibility and Fail-Safe-to-Stop. The Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-836