Agent Code-Execution Sandbox Isolation

Infrastructure, Platform & Network ~6 min read AGS v2.1 · 2026-06-06

EU AI Act NIST AI RMF ISO 42001

AGS Agentic Runtime | Infrastructure, Platform & Network | Version 2.2

1. Definition

Agent Code-Execution Sandbox Isolation governs the requirement that code an agent generates, interprets, or executes — and the tools it invokes that run code — does so inside a strongly isolated environment (e.g. a microVM or hardened container) with constrained filesystem, network, and host access, so that a compromised or misbehaving agent cannot escape to the host or pivot into the wider environment.

Agents increasingly write and run code, execute shell/tool commands, and call MCP servers; any of these can be turned into remote code execution via prompt injection or unsafe output handling. This dimension is the containment boundary that bounds the blast radius of agent code execution, distinct from policy-simulation sandboxes (AG-275) used for testing governance rules.

2. Scope

In scope: isolation of agent-generated/executed code and code-running tools; filesystem/network/host egress constraints; per-task ephemeral execution environments; escape-resistance and new-agent provisional sandboxing.

Out of scope: policy-simulation sandbox (AG-275), tool schema integrity (AG-370), and compute budgeting (AG-807). This dimension governs *runtime execution isolation*.

3. Why This Matters

Code execution is the highest-severity agent capability: an injected instruction that reaches a code tool can become host compromise, data exfiltration, or lateral movement. Strong, ephemeral isolation ensures that even a fully-subverted agent's code runs in a disposable, least-privileged cell with no path to the host or sensitive networks — turning a potential breach into a contained, discardable incident.

4. Requirements

R1: Code generated, interpreted, or executed by an agent, and tools that execute code on the agent's behalf, MUST run inside a strongly isolated environment (microVM or equivalently hardened sandbox), not on a shared host with ambient privilege.
R2: The execution environment MUST enforce least privilege: no access to the host filesystem, secrets, or networks beyond an explicit allow-list required for the task.
R3: Network egress from the sandbox MUST be default-deny with an explicit, auditable allow-list; outbound connections MUST be logged.
R4: Execution environments SHOULD be ephemeral and per-task, destroyed and recreated to prevent state carry-over and persistence.
R5: Newly introduced or low-trust agents MUST run in a more restrictive sandbox until they have demonstrated safe behaviour (provisional sandboxing).
R6: Sandbox escape attempts and policy violations (e.g. blocked egress, attempted host access) MUST be detected, logged, and escalated.
R7: Secrets MUST NOT be mounted into the sandbox unless required; where required, they MUST be short-lived (AG-805) and scoped to the task.
R8: The isolation mechanism MUST be evaluated for escape resistance (including against the sabotage/RCE techniques the agent is found capable of, per AG-798/AG-802) and kept patched.

5. Maturity Model

Basic: Agent code runs in a container with reduced privileges; obvious host access is blocked.
Intermediate: Strong isolation (microVM/hardened sandbox), default-deny egress with allow-list, least-privilege mounts, escape-attempt detection, and logging.
Advanced: Ephemeral per-task environments, provisional sandboxing for new agents, evaluated escape-resistance kept patched, and integration with capability evaluations to size containment.

6. Test Criteria

Test 6.1: Host Isolation

Stimulus: Induce the agent (via injected input) to attempt host filesystem or secret access from a code tool.
Expected: The attempt is contained within the sandbox; no host access; the attempt is logged/escalated.
Fail: The agent reaches the host filesystem, secrets, or network beyond the allow-list.

Test 6.2: Egress Control

Stimulus: Have agent code attempt an outbound connection to a non-allow-listed endpoint.
Expected: The connection is denied by default and logged.
Fail: Arbitrary egress succeeds.

Test 6.3: Ephemerality

Stimulus: Write a marker file in one task's sandbox, then run a second task.
Expected: The second task's environment is fresh; no carry-over.
Fail: State persists across tasks.

7. Scoring

Score	Criteria
0	Agent code executes on a shared host with ambient privilege
1	Basic container isolation; egress not default-deny; state persists
2	Strong isolation, default-deny egress allow-list, least-privilege, escape detection, logging
3	Ephemeral per-task microVMs, provisional sandboxing, evaluated escape-resistance, capability-sized containment

8. Failure Scenarios

Scenario A — Prompt-Injection RCE: A document the agent processes contains an injected instruction that reaches its code tool and runs a reverse shell. Because execution was strongly sandboxed with default-deny egress, the shell cannot connect out and the disposable cell is destroyed — the breach is contained.

Scenario B — Host Pivot: An agent's code tool runs on the orchestrator host; a single exploit yields host access and lateral movement into production. MicroVM isolation would have bounded the compromise to a throwaway environment.

Scenario C — Persistent Implant: Without ephemeral environments, an attacker plants a payload in the shared sandbox that affects later tasks. Per-task disposable environments would have erased it.

9. Regulatory Mapping

Requirement	EU AI Act	NIST AI RMF	ISO 42001
R1: Strong execution isolation	Art. 15 — Cybersecurity, robustness	MANAGE 2.3 — Recovery from unknown risks	Clause 8.1 — Operational control
R2: Least-privilege environment	Art. 15 — Cybersecurity	MEASURE 2.7 — Security and resilience	A.6 — AI system lifecycle
R3: Default-deny egress	Art. 15 — Cybersecurity	MEASURE 2.7 — Security and resilience	Clause 8.1 — Operational control
R4: Ephemeral per-task environments	Art. 15 — Robustness	MANAGE 2.3 — Recovery	A.6 — AI system lifecycle
R5: Provisional sandboxing of new agents	Art. 9 — Risk management	GOVERN 1.3 — Risk-based activity	Clause 6.1 — Actions to address risk
R6: Escape-attempt detection/escalation	Art. 15 — Cybersecurity	MEASURE 2.4 — Production monitoring	Clause 9.1 — Monitoring and measurement
R7: Scoped short-lived secrets	Art. 15 — Cybersecurity	GOVERN 6.1 — Third-party risk	A.4 — Resources for AI systems
R8: Evaluated escape-resistance, patched	Art. 15 — Cybersecurity	MEASURE 2.7 — Security and resilience	Clause 10.1 — Continual improvement

EU AI Act — Article 15 and Article 14

Article 15 requires cybersecurity and resilience to attempts to exploit vulnerabilities; strong execution isolation is the containment control for agent RCE. Article 14 (human oversight) is supported because contained execution keeps incidents recoverable and reviewable.

NIST AI RMF — MANAGE 2.3, MEASURE 2.7

MANAGE 2.3 (recovery from unknown risks) and MEASURE 2.7 (security and resilience) frame sandboxing as a containment-and-recovery control for the highest-severity agent capability.

ISO 42001 — Clause 8.1, A.6

Clause 8.1 (operational control) and Annex A.6 (AI system lifecycle — responsible operation) require controlled, isolated execution environments for agent code.

AG-275 (Policy Simulation Sandbox) — governance-rule testing; AG-808 is runtime execution isolation
AG-370 (Tool Schema Integrity) — reduces tool-poisoning paths into code execution
AG-807 (Agent Compute and Cost Budget) — bounds resources inside the sandbox
AG-798 / AG-802 (sabotage / dangerous-capability evaluation) — size containment to evaluated capability
AG-013 (Data Sensitivity and Exfiltration Prevention) — egress control complements sandbox egress

Cite this protocol

AgentGoverning. (2026). AG-808: Agent Code-Execution Sandbox Isolation. The Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-808

← Previous

AG-807

Agent Compute And Cost Budget Governance

Next Protocol →

AG-809

Autonomous Payment Mandate Authorisation