AG-302

Production Write Isolation Governance

Access, Segmentation & Least Privilege · AGS v2.1 · April 2026
EU AI Act · SOX · FCA · NIST

2. Summary

Production Write Isolation Governance requires that all AI agent write operations in production — database writes, API calls with side effects, file system modifications, message dispatches, actuator commands, and any action that changes the state of a production system — are subject to heightened isolation controls, approval gates, and monitoring that are structurally separate from the controls applied to read operations. The principle is asymmetric security: reading production data is a confidentiality risk that is addressed by access controls, but writing to production systems is an integrity and availability risk that demands additional structural safeguards. An agent that can read a production database poses a data leakage risk; an agent that can write to a production database poses a data corruption, financial loss, and operational disruption risk. The write path must be narrower, more tightly controlled, and more heavily monitored than the read path.

3. Example

Scenario A — Unrestricted Production Write Path Enables Cascading Data Corruption: An enterprise AI agent is deployed to automate customer record updates. The agent has direct write access to the production CRM database through a service account with full CRUD permissions on the customer table. A reasoning error causes the agent to misinterpret a batch update instruction and execute an UPDATE statement that modifies the email address field for 47,000 customer records to the same value. The error is detected 4 hours later when customers report not receiving expected communications. Recovery requires restoring from backup, but 4 hours of legitimate updates made after the corruption must be manually reconciled. The reconciliation takes 3 weeks and costs £890,000 in staff time and lost revenue from disrupted customer communications.

What went wrong: The agent had unrestricted write access to the production database with no intermediate validation layer. The write operation was executed directly against the production table without a pre-write validation check, a write rate limiter, or a scope constraint on the number of records a single operation could affect. Consequence: 47,000 corrupted customer records, £890,000 remediation cost, 3-week customer communication disruption, ICO notification for personal data integrity breach.
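A scope constraint of the kind missing in Scenario A can be sketched as follows. This is a minimal illustration, not a prescribed design: it uses an in-memory SQLite table as a stand-in for the CRM customer table, and the names `guarded_update`, `ScopeExceeded`, and the 100-row limit are assumptions introduced here.

```python
import sqlite3

MAX_ROWS_PER_WRITE = 100  # hypothetical limit; set per table and risk appetite


class ScopeExceeded(Exception):
    """Raised when a single write would touch more rows than allowed."""


def guarded_update(conn, sql, params=(), max_rows=MAX_ROWS_PER_WRITE):
    """Run one UPDATE or DELETE and commit only if it stays within max_rows."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        affected = cur.rowcount
        if affected > max_rows:
            raise ScopeExceeded(f"{affected} rows affected, limit is {max_rows}")
        conn.commit()
        return affected
    except Exception:
        conn.rollback()  # leave the table exactly as it was
        raise


# In-memory table standing in for the production customer table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO customer (id, email) VALUES (?, ?)",
    [(i, f"user{i}@example.com") for i in range(1, 501)],
)
conn.commit()

# A targeted update stays within scope and is committed.
ok = guarded_update(
    conn, "UPDATE customer SET email = ? WHERE id = ?", ("new@example.com", 7)
)

# A runaway update with no WHERE clause is rolled back.
try:
    guarded_update(conn, "UPDATE customer SET email = ?", ("same@example.com",))
    blocked = False
except ScopeExceeded:
    blocked = True

distinct = conn.execute("SELECT COUNT(DISTINCT email) FROM customer").fetchone()[0]
```

The sketch executes the statement inside a transaction and rolls back when the affected-row count exceeds the limit. Counting the matching rows before executing is an alternative that avoids taking write locks for a statement that will be rejected.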

Scenario B — Agent Actuator Command Without Safety Interlock: A warehouse management AI agent controls inventory sorting actuators through a production API. The agent's production write path includes direct actuator commands with no safety interlock layer. A sensor data anomaly causes the agent to issue conflicting positioning commands to two adjacent actuators, creating a physical collision. The collision damages both actuators (replacement cost: £45,000 each), halts the sorting line for 12 hours, and creates a near-miss safety incident for a nearby warehouse worker.

What went wrong: The agent's production write path to physical actuators had no safety interlock layer between the agent's command output and the actuator execution. Production write isolation for actuator commands requires a safety validation layer that checks commands for physical feasibility, conflict detection, and safety envelope compliance before execution. Consequence: £90,000 in equipment damage, 12-hour operational disruption, near-miss safety incident, HSE investigation.
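The envelope and conflict checks described above could look roughly like this. It is a software-only sketch under simplifying assumptions: two actuators on a shared rail, positions in millimetres, and hypothetical values for `ENVELOPE_MM`, `ADJACENT`, and `MIN_CLEARANCE_MM`. Real limits would come from the hazard analysis, and a software check alone is not a certified interlock.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoveCommand:
    actuator_id: str
    target_mm: float  # target position along a shared rail, in millimetres


# Hypothetical safety envelope and clearance values for illustration only.
ENVELOPE_MM = {"A1": (0.0, 900.0), "A2": (600.0, 1500.0)}
ADJACENT = [("A1", "A2")]  # (left, right) pairs that share travel space
MIN_CLEARANCE_MM = 150.0


def check_commands(commands, current_mm):
    """Return a list of violations; an empty list means the batch may execute."""
    violations = []
    planned = dict(current_mm)  # positions after the batch would run
    for cmd in commands:
        if cmd.actuator_id not in ENVELOPE_MM:
            violations.append(f"{cmd.actuator_id}: unknown actuator")
            continue
        low, high = ENVELOPE_MM[cmd.actuator_id]
        if not low <= cmd.target_mm <= high:
            violations.append(f"{cmd.actuator_id}: target outside envelope")
        planned[cmd.actuator_id] = cmd.target_mm
    # Conflict detection: adjacent actuators must keep a minimum clearance.
    for left, right in ADJACENT:
        if left in planned and right in planned:
            if planned[right] - planned[left] < MIN_CLEARANCE_MM:
                violations.append(f"{left}/{right}: clearance below minimum")
    return violations


current = {"A1": 200.0, "A2": 1200.0}
safe = check_commands([MoveCommand("A1", 400.0)], current)
conflict = check_commands(
    [MoveCommand("A1", 850.0), MoveCommand("A2", 900.0)], current
)
```

Each command in the second batch is inside its own envelope, yet the batch is rejected because the two targets would leave only 50 mm between adjacent actuators. Evaluating the batch as a whole is what catches the conflicting-command case.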

Scenario C — Bulk Production Write Without Rate Limiting: A financial reconciliation agent is authorised to post journal entries to the production general ledger. The agent has per-entry write authority but no aggregate rate limit on production writes. A data feed error causes the agent to reprocess 30 days of already-reconciled transactions, posting 340,000 duplicate journal entries to the general ledger over 45 minutes. The duplicate entries distort the trial balance by £127 million. The month-end close is delayed by 2 weeks while the finance team identifies and reverses all duplicate entries.

What went wrong: The agent's production write path had no rate limiting, no duplicate detection, and no aggregate volume constraint. Individual writes were within mandate, but the aggregate volume of production writes was unconstrained. Consequence: £127 million trial balance distortion, 2-week month-end close delay, external auditor qualification concern, CFO escalation to board audit committee.
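One way to catch the replayed entries in Scenario C is to fingerprint each write and refuse repeats within a window. The sketch below is illustrative only: `DuplicateGuard`, the field names in the sample entry, and the in-memory store are assumptions, and a production version would need shared, durable storage and a window at least as long as the replay horizon (30 days in this scenario).

```python
import hashlib
import json
import time


class DuplicateGuard:
    """Rejects a write whose canonical content was already seen within window_s."""

    def __init__(self, window_s=86400.0, clock=time.monotonic):
        self.window_s = window_s
        self.clock = clock  # injectable so the window can be tested
        self._seen = {}  # fingerprint -> time the write was first admitted

    @staticmethod
    def fingerprint(write):
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(write, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def admit(self, write):
        """Return True if the write is new, False if it is a duplicate."""
        now = self.clock()
        # Drop fingerprints that have aged out of the window.
        self._seen = {k: t for k, t in self._seen.items() if now - t < self.window_s}
        key = self.fingerprint(write)
        if key in self._seen:
            return False
        self._seen[key] = now
        return True


fake_now = [0.0]
guard = DuplicateGuard(window_s=3600.0, clock=lambda: fake_now[0])
entry = {"source_txn": "TXN-0001", "account": "4000", "amount": "125.00"}

first = guard.admit(entry)
# Same content with a different key order is still recognised as a duplicate.
replay = guard.admit({"amount": "125.00", "account": "4000", "source_txn": "TXN-0001"})
fake_now[0] = 7200.0
after_window = guard.admit(entry)
```

Fingerprinting on a stable business key such as the source transaction identifier, rather than on the whole payload, is a design choice worth considering when payloads carry timestamps or other fields that differ between replays.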

4. Requirement Statement

Scope: This dimension applies to any AI agent with write access to production systems — any system where agent actions change the state of data, infrastructure, communications, financial instruments, physical actuators, or any other production resource. "Write" includes database INSERT, UPDATE, DELETE; API calls with side effects (POST, PUT, PATCH, DELETE); file system create, modify, delete; message queue publish; email or notification dispatch; actuator commands; configuration changes; and any other operation that modifies production state. Read-only agents are excluded. Agents with conditional write access (write capability that can be activated) are in scope because the write capability exists even if not currently active.

4.1. A conforming system MUST route all agent production write operations through a write isolation layer that is structurally separate from the agent's runtime process and evaluates each write operation before execution.

4.2. A conforming system MUST enforce write scope constraints that limit the number of records, value magnitude, and blast radius of any single write operation by an agent.

4.3. A conforming system MUST implement write rate limiting that constrains the volume of write operations an agent can execute within defined time periods (per-minute, per-hour, per-day).

4.4. A conforming system MUST log all production write operations with full attribution (agent identity, timestamp, operation type, affected resources, and pre-write state) in a tamper-evident log that the agent cannot modify.

4.5. A conforming system MUST implement pre-write validation that verifies the write operation's consistency with business rules, referential integrity, and safety constraints before execution.

4.6. A conforming system SHOULD implement reversibility mechanisms for agent production writes through soft-delete patterns, append-only data structures, or pre-write state snapshots that enable rollback.

4.7. A conforming system SHOULD require elevated approval for write operations that exceed defined thresholds (e.g., writes affecting more than 100 records, writes exceeding £10,000 in value, writes to safety-critical actuators).

4.8. A conforming system SHOULD implement duplicate detection that prevents an agent from executing the same write operation more than once within a configurable window.

4.9. A conforming system MAY implement shadow-write capabilities that execute write operations against a shadow production replica before the actual production write, enabling pre-production validation of write effects.
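Requirement 4.3 asks for limits over several time periods at once. A minimal sliding-window sketch is shown below, assuming a single agent and in-process state; the class name `WriteRateLimiter` and the example limits are hypothetical, and a real deployment would enforce the limits in the isolation layer rather than in the agent's own process.

```python
import time
from collections import deque


class WriteRateLimiter:
    """Sliding-window limiter that enforces several windows at once."""

    def __init__(self, limits, clock=time.monotonic):
        # limits maps window length in seconds to the maximum writes in that window.
        self.limits = dict(limits)
        self.clock = clock  # injectable so behaviour can be tested without waiting
        self._events = deque()  # timestamps of admitted writes, oldest first
        self._longest = max(self.limits)

    def try_acquire(self):
        """Return True and record the write if every window has headroom."""
        now = self.clock()
        # Forget writes older than the longest window.
        while self._events and now - self._events[0] >= self._longest:
            self._events.popleft()
        for window_s, max_writes in self.limits.items():
            recent = sum(1 for t in self._events if now - t < window_s)
            if recent >= max_writes:
                return False
        self._events.append(now)
        return True


fake_now = [0.0]
# Hypothetical limits: 3 writes per minute and 5 writes per hour.
limiter = WriteRateLimiter({60: 3, 3600: 5}, clock=lambda: fake_now[0])

burst = [limiter.try_acquire() for _ in range(4)]
fake_now[0] = 61.0
next_minute = [limiter.try_acquire() for _ in range(3)]
```

In the example, the fourth write of the initial burst is refused by the per-minute limit, and the third write of the next minute is refused by the per-hour limit even though the per-minute window has headroom. That second case is the aggregate constraint that per-operation authority alone does not provide.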

5. Rationale

The asymmetry between read and write risk is fundamental to production safety. A read operation that goes wrong exposes data or returns incorrect data to the agent, which is a confidentiality or accuracy problem. A write operation that goes wrong changes production state, which is an integrity, availability, and potentially safety problem. The write cannot be undone by reading correctly next time; the damage persists until actively remediated.

AI agents amplify write risk because of their speed, non-determinism, and susceptibility to reasoning errors. A human operator making manual database updates might execute 50 operations per hour and would notice a systematic error after a few iterations. An AI agent can execute 50,000 operations per hour, and its errors may be systematic — affecting every operation in a batch with the same reasoning error. The combination of speed and systematic error potential means that production write controls must be proportionally more robust than those applied to human operators.

The write isolation layer serves three purposes: it constrains the blast radius of any single write error, it provides a rollback point when errors are detected, and it generates the audit trail needed to understand what happened after an incident. Without write isolation, the first indication of a problem may be the downstream consequence — customers receiving incorrect invoices, actuators exceeding safe operating parameters, or financial reports showing unexplained variances.
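The audit-trail purpose depends on a log that reveals tampering. One common technique is a hash chain, sketched below with the attribution fields listed in requirement 4.4. The class `HashChainLog` and its field names are assumptions for illustration; keeping the log out of the agent's reach is a storage and permissions matter that this sketch does not address.

```python
import hashlib
import json


class HashChainLog:
    """Append-only log where each record commits to the hash of the previous one."""

    GENESIS = "0" * 64

    def __init__(self):
        self._records = []

    @staticmethod
    def _digest(body):
        canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def append(self, agent_id, timestamp, operation, resource, pre_state):
        prev = self._records[-1]["hash"] if self._records else self.GENESIS
        body = {
            "agent_id": agent_id,
            "timestamp": timestamp,
            "operation": operation,
            "resource": resource,
            "pre_state": pre_state,
            "prev_hash": prev,
        }
        record = dict(body, hash=self._digest(body))
        self._records.append(record)
        return record

    def verify(self):
        """Return True if every record matches its hash and links to its predecessor."""
        prev = self.GENESIS
        for record in self._records:
            body = {k: v for k, v in record.items() if k != "hash"}
            if body["prev_hash"] != prev or self._digest(body) != record["hash"]:
                return False
            prev = record["hash"]
        return True


log = HashChainLog()
log.append("agent-7", "2026-04-01T09:00:00Z", "UPDATE", "customer/42",
           {"email": "a@example.com"})
log.append("agent-7", "2026-04-01T09:00:05Z", "UPDATE", "customer/43",
           {"email": "b@example.com"})
intact = log.verify()

# Simulate someone editing a stored record after the fact.
log._records[0]["pre_state"]["email"] = "forged@example.com"
tampered = not log.verify()
```

A chain like this detects edits and reordering but not removal of the most recent records, so the latest hash would also need to be anchored somewhere the agent cannot write.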

6. Implementation Guidance

Production write isolation must be implemented as a structural layer between the agent and all production write targets. The agent submits write requests; the isolation layer validates, constrains, logs, and executes (or rejects) each request.
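The submit, validate, log, then execute-or-reject flow can be sketched as a small gateway that runs a list of checks before calling the real write. Everything named here (`WriteRequest`, `WriteGateway`, `max_records`, `allowed_operations`) is hypothetical, and the sketch runs in a single process purely for illustration; structural separation from the agent's runtime would be a deployment concern on top of this.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class WriteRequest:
    agent_id: str
    operation: str  # e.g. "UPDATE"
    resource: str   # e.g. "customer/42"
    payload: dict
    record_count: int = 1


@dataclass
class Decision:
    executed: bool
    reason: str


# A check returns None to pass, or a string explaining the rejection.
Check = Callable[[WriteRequest], Optional[str]]


@dataclass
class WriteGateway:
    checks: List[Check]
    execute: Callable[[WriteRequest], None]
    audit: list = field(default_factory=list)

    def submit(self, request):
        for check in self.checks:
            reason = check(request)
            if reason is not None:
                self.audit.append((request.agent_id, request.resource, "REJECTED", reason))
                return Decision(False, reason)
        # Log before executing so that an attempted write is never unrecorded.
        self.audit.append((request.agent_id, request.resource, "ACCEPTED", ""))
        self.execute(request)
        return Decision(True, "ok")


def max_records(limit):
    return lambda r: None if r.record_count <= limit else (
        f"record_count {r.record_count} exceeds {limit}")


def allowed_operations(*ops):
    return lambda r: None if r.operation in ops else (
        f"operation {r.operation} not permitted")


store = {}  # stand-in for the production write target
gateway = WriteGateway(
    checks=[allowed_operations("UPDATE"), max_records(100)],
    execute=lambda r: store.__setitem__(r.resource, r.payload),
)

accepted = gateway.submit(
    WriteRequest("agent-7", "UPDATE", "customer/42", {"email": "a@example.com"}))
rejected = gateway.submit(
    WriteRequest("agent-7", "UPDATE", "customer/*", {"email": "x@example.com"},
                 record_count=47000))
```

Keeping each control as an independent check makes it straightforward to add rate limiting, duplicate detection, or approval gates to the same pipeline without changing the agent.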

Recommended patterns:

Anti-patterns to avoid:

Industry Considerations

Financial Services. Production write isolation maps to existing trading system controls: pre-trade risk checks, order validation, and position limits. The FCA expects that AI agents executing financial transactions are subject to the same pre-execution controls as human traders. Write rate limits should align with existing order throttling controls. Write audit trails must meet MiFID II transaction reporting requirements.

Healthcare. Production writes to electronic health records require clinical validation. An AI agent recommending a medication change that triggers a write to the prescription system must pass through a clinical validation layer that checks drug interactions, contraindications, allergy alerts, and dosage ranges. The write isolation layer for healthcare systems is also a clinical safety layer.

Manufacturing and CPS. Production write isolation for actuator commands is a safety-critical control. IEC 61508 safety integrity levels (SIL) should inform the design of the safety interlock layer. The interlock must be independent of the AI agent's software — hardware interlocks are preferred for SIL 3 and SIL 4 applications.

Maturity Model

Basic Implementation — Agent production writes are logged with attribution and timestamp. A write validation layer checks basic business rules before execution. Write rate limits are implemented in the application layer. Pre-write state is not systematically captured. Limitations: rate limits in the application layer may be bypassed; no pre-write state capture for rollback; validation covers basic rules but not comprehensive scope constraints.

Intermediate Implementation — A dedicated write gateway mediates all agent production writes. Write scope constraints (max records, max value, max rate) are enforced at the gateway layer. Pre-write state is captured for every write operation. Write operations exceeding defined thresholds require elevated approval. Duplicate detection prevents repeated execution of the same write. The write audit log is tamper-evident. CQRS separates read and write paths architecturally.

Advanced Implementation — All intermediate capabilities plus: production write isolation has been verified through independent adversarial testing including write amplification attacks, rate limit bypass attempts, and scope constraint evasion. Shadow-write capability validates write effects against a production replica before actual execution. Safety interlock layers for actuator writes are independently certified to relevant safety standards. Write anomaly detection uses statistical baselines to flag unusual write patterns in real time. The organisation can demonstrate to regulators that no agent write operation can bypass isolation controls.
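As a rough illustration of the statistical baseline mentioned for the advanced level, the sketch below flags a write count that sits far above the historical mean. The function name `is_anomalous`, the z-score threshold of 3.0, and the sample counts are assumptions; real write traffic is often seasonal or bursty and may need a more robust baseline than a plain mean and standard deviation.

```python
import statistics


def is_anomalous(baseline_counts, observed_count, z_threshold=3.0):
    """Flag observed_count if it is more than z_threshold standard deviations
    above the mean of baseline_counts."""
    if len(baseline_counts) < 2:
        raise ValueError("need at least two baseline intervals")
    mean = statistics.fmean(baseline_counts)
    stdev = statistics.stdev(baseline_counts)  # sample standard deviation
    if stdev == 0:
        return observed_count > mean
    return (observed_count - mean) / stdev > z_threshold


# Hypothetical writes-per-minute history for one agent.
baseline = [40, 52, 47, 45, 50, 43, 49, 46]

normal = is_anomalous(baseline, 51)      # within the usual spread
runaway = is_anomalous(baseline, 7500)   # far above the baseline
```

Only upward deviations are flagged here because the risk under discussion is excessive write volume; an unexpected drop could be tracked separately as an availability signal.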

7. Evidence Requirements

Required artefacts:

Retention requirements:

Access requirements:

8. Test Specification

Test 8.1: Write Scope Constraint Enforcement

Test 8.2: Write Rate Limit Enforcement

Test 8.3: Pre-Write State Capture Verification

Test 8.4: Write Isolation Layer Independence

Test 8.5: Duplicate Write Detection

Test 8.6: Write Rollback Capability

Test 8.7: Safety Interlock Enforcement (Actuator Writes)

Conformance Scoring

9. Regulatory Mapping

Regulation | Provision | Relationship Type
EU AI Act | Article 9 (Risk Management System) | Supports compliance
EU AI Act | Article 14 (Human Oversight) | Supports compliance
SOX | Section 404 (Internal Controls Over Financial Reporting) | Direct requirement
FCA SYSC | 6.1.1R (Systems and Controls) | Direct requirement
MiFID II | Article 17 (Algorithmic Trading: Pre-Trade Controls) | Direct requirement
IEC 61508 | SIL Requirements for Safety Functions | Direct requirement
NIST SP 800-53 | AU-3 (Content of Audit Records), SI-7 (Software, Firmware, and Information Integrity) | Direct requirement
DORA | Article 9 (Protection and Prevention) | Supports compliance

SOX — Section 404 (Internal Controls Over Financial Reporting)

Section 404 requires management to assess the effectiveness of internal controls over financial reporting. For AI agents writing to financial systems — general ledger entries, accounts payable postings, accounts receivable adjustments — the write isolation layer is a key internal control. A SOX auditor will examine: Does the write isolation layer prevent the agent from posting entries exceeding authorised thresholds? Does the pre-write state capture enable detection and correction of erroneous entries? Does the tamper-evident log provide a complete audit trail for all agent-generated entries? A material weakness finding is likely if the agent has direct, unconstrained write access to financial systems.
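Pre-write state capture that supports correction of erroneous entries might be sketched as below. The `SnapshotStore` class and the journal-entry keys are hypothetical, and an in-memory dictionary stands in for the ledger; a real general ledger would normally correct errors with reversing entries rather than by restoring prior values, so this shows the capture-and-restore idea only.

```python
import copy


class SnapshotStore:
    """Key-value store that records the prior value of every write for rollback."""

    _MISSING = object()  # marks a key that did not exist before the write

    def __init__(self):
        self.data = {}
        self._undo = []  # (write_id, key, prior value or _MISSING), in write order

    def write(self, write_id, key, value):
        prior = copy.deepcopy(self.data[key]) if key in self.data else self._MISSING
        self._undo.append((write_id, key, prior))
        self.data[key] = value

    def rollback_to(self, write_id):
        """Undo write_id and every later write, newest first."""
        ids = [w for w, _, _ in self._undo]
        if write_id not in ids:
            raise KeyError(write_id)
        start = ids.index(write_id)
        for _, key, prior in reversed(self._undo[start:]):
            if prior is self._MISSING:
                self.data.pop(key, None)
            else:
                self.data[key] = prior
        del self._undo[start:]


ledger = SnapshotStore()
ledger.write("w1", "JE-1001", {"debit": "500.00"})
ledger.write("w2", "JE-1001", {"debit": "5000.00"})  # erroneous overwrite
ledger.write("w3", "JE-1002", {"debit": "75.00"})    # erroneous extra entry

ledger.rollback_to("w2")
restored = ledger.data
```

After the rollback, the store holds only the first, legitimate entry, which is the state that existed immediately before the erroneous writes began.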

MiFID II — Article 17 (Algorithmic Trading — Pre-Trade Controls)

Article 17 requires investment firms engaged in algorithmic trading to have effective systems and risk controls, including appropriate trading thresholds and limits and measures that prevent the sending of erroneous orders. The supporting regulatory technical standard (RTS 6) specifies pre-trade controls including price collars, maximum order values, and maximum order volumes. For AI trading agents, the write isolation layer implements these pre-trade controls by validating each order against configured limits before submission to the venue. The write rate limit prevents excessive order flow. The pre-write state capture provides the audit trail required for order reconstruction.
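Validating an order against a price collar, a maximum order value, and a maximum order volume could be sketched as follows. The types `Order` and `PreTradeLimits`, the symbol, and all limit values are hypothetical and are not taken from any regulation or venue rulebook.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Order:
    symbol: str
    quantity: int
    limit_price: Decimal


@dataclass(frozen=True)
class PreTradeLimits:
    reference_price: Decimal
    collar_pct: Decimal       # allowed deviation from reference, e.g. Decimal("0.05")
    max_order_value: Decimal
    max_order_volume: int


def pre_trade_violations(order, limits):
    """Return the list of limits the order breaches; empty means it may be sent."""
    violations = []
    band = limits.reference_price * limits.collar_pct
    if abs(order.limit_price - limits.reference_price) > band:
        violations.append("price outside collar")
    if order.quantity > limits.max_order_volume:
        violations.append("volume above maximum")
    if order.limit_price * order.quantity > limits.max_order_value:
        violations.append("value above maximum")
    return violations


# Hypothetical limits: 5% collar around 100.00, 250,000 max value, 5,000 max volume.
limits = PreTradeLimits(Decimal("100.00"), Decimal("0.05"), Decimal("250000"), 5000)

ok = pre_trade_violations(Order("XYZ", 1000, Decimal("101.50")), limits)
bad = pre_trade_violations(Order("XYZ", 6000, Decimal("110.00")), limits)
```

`Decimal` is used instead of floating point so that price and value comparisons are exact. Returning every breached limit, rather than stopping at the first, gives the audit log a complete reason for the rejection.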

IEC 61508 — SIL Requirements for Safety Functions

For AI agents controlling safety-critical actuators in industrial, manufacturing, or CPS environments, the safety interlock layer within production write isolation constitutes a safety function. IEC 61508 defines safety integrity levels (SIL 1-4) that determine the rigour of design, testing, and verification for safety functions. The safety interlock layer's SIL rating should be determined by the hazard and risk analysis for the specific application. SIL 2 or higher typically requires hardware-diverse redundancy in the interlock — the interlock cannot rely solely on software.

NIST SP 800-53 — AU-3, SI-7

AU-3 requires audit records to contain sufficient information to establish what type of event occurred, when and where it occurred, its source, and its outcome. The write audit log, with pre-write state capture and full attribution, supplies this content for agent production writes. SI-7 requires detecting unauthorised changes to software, firmware, and information; the write isolation layer's pre-write validation and tamper-evident logging support this control.

10. Failure Severity

Field | Value
Severity Rating | Critical
Blast Radius | Production systems, potentially cascading to downstream systems, external parties, and physical safety where actuators are involved

Consequence chain: Failure of production write isolation allows an agent to make unconstrained modifications to production state. The immediate failure is data corruption, erroneous transactions, or unsafe actuator commands. Because writes change state, the damage persists and may propagate: a corrupted customer record feeds into billing systems generating incorrect invoices; an erroneous journal entry distorts financial reports that inform business decisions; an unsafe actuator command creates a physical hazard. The speed of AI agent operations means that by the time the error is detected, thousands or millions of records may be affected. Recovery requires pre-write state data — without it, restoration depends on backups that may not capture the exact pre-error state. The financial consequence includes direct loss from erroneous transactions, remediation cost from data recovery, regulatory penalties for control failures, and liability from downstream harm. For safety-critical applications, the consequence extends to physical injury or death, making production write isolation a life-safety control.

Cross-references: AG-301 (Environment Segregation Governance) ensures that production write paths are not accessible from non-production environments. AG-303 (Data Egress Route Governance) controls outbound data flows that may include write-equivalent operations (e.g., API calls to external systems). AG-304 (Just-in-Time Secrets Issuance Governance) ensures that production write credentials are issued only for the minimum required duration. AG-305 (Privileged Session Recording Governance) provides monitoring for high-risk write sessions. AG-162 (Least-Agency Provisioning) ensures agents receive only the minimum write permissions required.

Cite this protocol
AgentGoverning. (2026). AG-302: Production Write Isolation Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-302