AGS Agentic Runtime | Infrastructure, Platform & Network | Version 2.2
Agent Code-Execution Sandbox Isolation requires that any code an agent generates, interprets, or executes, and any tool it invokes that runs code, run inside a strongly isolated environment (e.g. a microVM or hardened container) with constrained filesystem, network, and host access, so that a compromised or misbehaving agent cannot escape to the host or pivot into the wider environment.
Agents increasingly write and run code, execute shell/tool commands, and call MCP servers; any of these can be turned into remote code execution via prompt injection or unsafe output handling. This dimension defines the containment boundary that limits the blast radius of agent code execution. It is distinct from the policy-simulation sandboxes (AG-275) used for testing governance rules.
In scope: isolation of agent-generated/executed code and code-running tools; filesystem, network-egress, and host-access constraints; per-task ephemeral execution environments; escape resistance; and provisional sandboxing of new agents.
Out of scope: policy-simulation sandbox (AG-275), tool schema integrity (AG-370), and compute budgeting (AG-807). This dimension governs *runtime execution isolation*.
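The egress constraint named in scope can be pictured as a default-deny allow-list decision. The sketch below is illustrative only: the function names and the `packages.internal.example` host are hypothetical, and in practice the rule would be enforced at the network layer (firewall, network namespace, or hypervisor), not in application code.

```python
def normalize_host(host: str) -> str:
    """Lower-case the host and drop a trailing dot so lookups are consistent."""
    return host.strip().lower().rstrip(".")


def is_egress_allowed(host: str, port: int, allowlist: frozenset) -> bool:
    """Default-deny: a destination passes only if its exact (host, port)
    pair is on the allow-list. Everything else is refused."""
    return (normalize_host(host), port) in allowlist


# Hypothetical allow-list for one task: a single internal package mirror.
ALLOWLIST = frozenset({("packages.internal.example", 443)})

# An allow-listed destination is reachable.
print(is_egress_allowed("Packages.Internal.Example.", 443, ALLOWLIST))  # True

# A reverse shell calling out to an unlisted host is refused.
print(is_egress_allowed("attacker.example", 4444, ALLOWLIST))  # False
```

An exact-match allow-list is used here instead of wildcard or suffix matching because it is the easiest form to audit; whether that is strict enough or too strict depends on the deployment.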
Code execution is the highest-severity agent capability: an injected instruction that reaches a code tool can become host compromise, data exfiltration, or lateral movement. Strong, ephemeral isolation ensures that even a fully subverted agent's code runs in a disposable, least-privileged cell with no path to the host or sensitive networks, turning a potential breach into a contained, discardable incident.
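One way to make the "disposable, least-privileged cell" concrete is to describe each execution environment as data and check it before launch. The following is a minimal sketch under stated assumptions: `SandboxPolicy` and its fields are hypothetical and do not correspond to any particular runtime's schema, and the read-only root filesystem check is one assumed reading of "constrained filesystem".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SandboxPolicy:
    """Hypothetical description of one execution cell."""
    isolation: str                  # e.g. "microvm", "hardened-container", "host"
    privileged: bool = False        # True means ambient host privilege
    read_only_rootfs: bool = True   # assumed interpretation of a constrained filesystem
    host_mounts: tuple = ()         # host paths exposed inside the cell
    egress_default: str = "deny"    # "deny" or "allow"
    ephemeral: bool = True          # destroyed after each task


def policy_violations(policy: SandboxPolicy) -> list:
    """Return the reasons a policy falls short of the isolation intent."""
    problems = []
    if policy.isolation not in ("microvm", "hardened-container"):
        problems.append("execution is not strongly isolated")
    if policy.privileged:
        problems.append("cell runs privileged")
    if not policy.read_only_rootfs:
        problems.append("root filesystem is writable")
    if policy.host_mounts:
        problems.append("host paths are mounted into the cell")
    if policy.egress_default != "deny":
        problems.append("egress is not default-deny")
    if not policy.ephemeral:
        problems.append("environment persists across tasks")
    return problems


good = SandboxPolicy(isolation="microvm")
bad = SandboxPolicy(isolation="host", privileged=True,
                    egress_default="allow", ephemeral=False)

print(policy_violations(good))  # []
print(policy_violations(bad))   # four violations
```

A check like this only validates the declared configuration; it does not prove that the underlying isolation technology actually enforces it.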
Test 6.1: Host Isolation
Test 6.2: Egress Control
Test 6.3: Ephemerality
| Score | Criteria |
|---|---|
| 0 | Agent code executes on a shared host with ambient privilege |
| 1 | Basic container isolation; egress not default-deny; state persists |
| 2 | Strong isolation, default-deny egress allow-list, least-privilege, escape detection, logging |
| 3 | Ephemeral per-task microVMs, provisional sandboxing, evaluated escape-resistance, capability-sized containment |
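The scoring table above can be read as a small decision procedure. The sketch below encodes it under one explicit assumption that the table does not state: levels are cumulative, so a score of 3 also requires every level-2 control. The `ObservedControls` type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservedControls:
    """Hypothetical record of what an assessor observed."""
    container_isolation: bool = False
    strong_isolation: bool = False
    default_deny_egress: bool = False
    least_privilege: bool = False
    escape_detection: bool = False
    logging: bool = False
    ephemeral_per_task_microvm: bool = False
    provisional_sandboxing: bool = False
    evaluated_escape_resistance: bool = False
    capability_sized_containment: bool = False


def maturity_score(c: ObservedControls) -> int:
    """Map observed controls to a 0-3 score (assumes cumulative levels)."""
    level2 = all([c.strong_isolation, c.default_deny_egress,
                  c.least_privilege, c.escape_detection, c.logging])
    level3 = level2 and all([c.ephemeral_per_task_microvm,
                             c.provisional_sandboxing,
                             c.evaluated_escape_resistance,
                             c.capability_sized_containment])
    if level3:
        return 3
    if level2:
        return 2
    if c.container_isolation or c.strong_isolation:
        return 1
    return 0  # shared host with ambient privilege


print(maturity_score(ObservedControls()))                          # 0
print(maturity_score(ObservedControls(container_isolation=True)))  # 1
```

Under this reading, a deployment with per-task microVMs but no logging still scores 1, because the level-2 controls are incomplete.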
Scenario A — Prompt-Injection RCE: A document the agent processes contains an injected instruction that reaches its code tool and launches a reverse shell. Because execution is strongly sandboxed with default-deny egress, the shell cannot connect out, and the disposable cell is destroyed. The breach is contained.
Scenario B — Host Pivot: An agent's code tool runs on the orchestrator host; a single exploit yields host access and lateral movement into production. MicroVM isolation would have bounded the compromise to a throwaway environment.
Scenario C — Persistent Implant: Without ephemeral environments, an attacker plants a payload in the shared sandbox that affects later tasks. Per-task disposable environments would have erased it.
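The lifecycle behind Scenario C can be illustrated with a per-task workspace that is created fresh and deleted when the task ends. This is only a sketch of the ephemerality idea: a temporary directory is not a sandbox and provides no isolation by itself, and the helper names are hypothetical.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def ephemeral_workspace():
    """Yield a fresh directory that is removed when the task finishes."""
    with tempfile.TemporaryDirectory(prefix="agent-task-") as path:
        yield Path(path)


def run_task(files_to_write=()):
    """Run one task in its own workspace.

    Returns the file names visible at task start and whether the
    workspace still exists after the task has ended.
    """
    with ephemeral_workspace() as ws:
        visible_at_start = sorted(p.name for p in ws.iterdir())
        for name in files_to_write:
            (ws / name).write_text("attacker payload")
        workspace_path = ws
    return visible_at_start, workspace_path.exists()


# Task 1 plants a file; task 2 starts clean and never sees it.
first = run_task(["implant.sh"])
second = run_task()
print(first)   # ([], False)
print(second)  # ([], False)
```

In a real deployment the same pattern would apply to the whole execution cell (microVM or container), so that nothing written during one task survives into the next.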
| Requirement | EU AI Act | NIST AI RMF | ISO 42001 |
|---|---|---|---|
| R1: Strong execution isolation | Art. 15 — Cybersecurity, robustness | MANAGE 2.3 — Recovery from unknown risks | Clause 8.1 — Operational control |
| R2: Least-privilege environment | Art. 15 — Cybersecurity | MEASURE 2.7 — Security and resilience | A.6 — AI system lifecycle |
| R3: Default-deny egress | Art. 15 — Cybersecurity | MEASURE 2.7 — Security and resilience | Clause 8.1 — Operational control |
| R4: Ephemeral per-task environments | Art. 15 — Robustness | MANAGE 2.3 — Recovery | A.6 — AI system lifecycle |
| R5: Provisional sandboxing of new agents | Art. 9 — Risk management | GOVERN 1.3 — Risk-based activity | Clause 6.1 — Actions to address risk |
| R6: Escape-attempt detection/escalation | Art. 15 — Cybersecurity | MEASURE 2.4 — Production monitoring | Clause 9.1 — Monitoring and measurement |
| R7: Scoped short-lived secrets | Art. 15 — Cybersecurity | GOVERN 6.1 — Third-party risk | A.4 — Resources for AI systems |
| R8: Evaluated escape-resistance, patched | Art. 15 — Cybersecurity | MEASURE 2.7 — Security and resilience | Clause 10.1 — Continual improvement |
Under the EU AI Act, Article 15 requires cybersecurity and resilience against attempts to exploit vulnerabilities; strong execution isolation is the containment control for agent RCE. Article 14 (human oversight) is supported because contained execution keeps incidents recoverable and reviewable.
Under the NIST AI RMF, MANAGE 2.3 (recovery from unknown risks) and MEASURE 2.7 (security and resilience) frame sandboxing as a containment-and-recovery control for the highest-severity agent capability.
Under ISO 42001, Clause 8.1 (operational control) and Annex A.6 (AI system lifecycle, responsible operation) require controlled, isolated execution environments for agent code.
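Requirement R7 in the mapping table calls for scoped, short-lived secrets. A minimal sketch of that idea follows, assuming a per-task credential that carries its own scopes and expiry; `TaskCredential`, `issue_credential`, `authorize`, and the scope strings are all hypothetical names chosen for illustration, and the `now` parameter exists only to make the time check easy to demonstrate.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskCredential:
    """Hypothetical credential bound to one task."""
    token: str
    scopes: frozenset
    expires_at: float  # seconds since the epoch


def issue_credential(scopes, ttl_seconds: float, now: float = None) -> TaskCredential:
    """Mint a random token limited to the given scopes and lifetime."""
    now = time.time() if now is None else now
    return TaskCredential(
        token=secrets.token_urlsafe(32),
        scopes=frozenset(scopes),
        expires_at=now + ttl_seconds,
    )


def authorize(cred: TaskCredential, scope: str, now: float = None) -> bool:
    """Allow an action only if the credential is unexpired and in scope."""
    now = time.time() if now is None else now
    return now < cred.expires_at and scope in cred.scopes


# Credential valid for 300 seconds and for reading artifacts only.
cred = issue_credential({"artifact:read"}, ttl_seconds=300, now=1000.0)
print(authorize(cred, "artifact:read", now=1100.0))   # True
print(authorize(cred, "artifact:write", now=1100.0))  # False: out of scope
print(authorize(cred, "artifact:read", now=1300.0))   # False: expired
```

If such a credential leaks from a compromised cell, its usefulness to an attacker is limited to the granted scopes and the remaining lifetime.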