Environment Segregation Governance requires that AI agents operating in development, test, staging, production, and disaster recovery environments be structurally isolated from each other at the infrastructure layer. An agent running in the development environment cannot access production data, production APIs, or production infrastructure — and this separation is enforced by network boundaries, credential scoping, and access controls, not by configuration conventions or naming standards alone. The environments are distinct infrastructure domains with no shared credentials, no shared data stores, and no network paths between them unless explicitly authorised through a controlled promotion process. Without structural environment segregation, a single misconfiguration in a development or test environment can expose production data, corrupt production state, or introduce untested agent behaviour into live operations.
Scenario A — Development Agent Accesses Production Database: An engineering team develops an AI agent for automated invoice processing. During development, the team needs realistic test data and copies the production database connection string into the development environment configuration "temporarily." The developer intends to replace it with a test database connection after initial testing. Six weeks later, the production connection string remains in the development configuration. A junior developer runs a load test that submits 5,000 synthetic invoices to what they believe is the test database. The invoices are created in the production database. The organisation's ERP system picks up the invoices and begins generating payment runs. Accounts payable processes £2.3 million in fraudulent payments before the error is detected the following Monday.
What went wrong: No structural barrier prevented the development environment from accessing the production database. The connection was possible because development and production shared network access to the database server and the development service account had production database credentials. Consequence: £2.3 million in erroneous payments requiring recall, 3 payments unrecoverable (£127,000), regulatory notification to the FCA, external audit of all AI development practices, 4-month remediation programme, 2 staff disciplinary proceedings.
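One structural safeguard that would have caught Scenario A is a startup guard that refuses to run when a connection string resolves to a host outside the current environment. The sketch below is illustrative only: the hostnames, suffix allow-lists, and configuration shape are assumptions, and a real deployment would enforce the same property at the network layer rather than relying on application code alone.

```python
import re

# Hypothetical per-environment host allow-lists; a real deployment would
# derive these from environment-scoped DNS zones rather than source code.
ALLOWED_DB_HOST_SUFFIXES = {
    "development": (".dev.internal",),
    "test": (".test.internal",),
    "staging": (".staging.internal",),
    "production": (".prod.internal",),
}

def validate_connection_string(conn_str: str, environment: str) -> None:
    """Fail fast at startup if a connection string points outside this environment."""
    match = re.search(r"@([^:/@]+)", conn_str)  # host part of user:pass@host:port/db
    if not match:
        raise ValueError("cannot parse host from connection string")
    host = match.group(1)
    if not host.endswith(ALLOWED_DB_HOST_SUFFIXES[environment]):
        raise RuntimeError(
            f"refusing to start: host {host!r} is not permitted in the "
            f"{environment} environment"
        )

if __name__ == "__main__":
    # A development host passes in development; a production host would raise.
    validate_connection_string(
        "postgresql://svc:secret@invoices.dev.internal:5432/erp", "development"
    )
```

Had such a guard existed, the "temporary" production connection string would have failed the very first development start-up rather than surviving six weeks.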
Scenario B — Test Environment Leaks Production Customer Data: An organisation creates a test environment by cloning the production environment, including a full copy of the production database with 890,000 customer records. The test database is not anonymised — it contains real names, addresses, financial data, and government identifiers. The test environment has relaxed security controls to facilitate development. A contractor with test environment access exports the database for local development. The export is stored on a personal laptop that is later compromised in a phishing attack. 890,000 customer records are exfiltrated.
What went wrong: Production data was present in the test environment. Environment segregation should prevent production data from existing in non-production environments without anonymisation. The relaxed security controls in the test environment were appropriate for test data but catastrophic for production data. Consequence: Data breach affecting 890,000 customers, ICO investigation and £4.4 million fine, class action lawsuit, mandatory credit monitoring for all affected customers (cost: £8.9 million), CEO resignation.
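An automated verification of the kind Scenario B lacked can be sketched as a scan of non-production data stores for patterns of real identifiers. The detector patterns and row schema below are assumptions for illustration — a production scanner would use far richer detection and run against every non-production store as part of the verification pipeline.

```python
import re

# Illustrative detectors for identifiers that should never appear in
# non-production data stores; these patterns are assumptions, not exhaustive.
PII_PATTERNS = {
    "uk_nino": re.compile(r"\b[A-CEGHJ-PR-TW-Z]{2}\d{6}[A-D]\b"),  # National Insurance no.
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "uk_postcode": re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2}\b"),
}

def scan_rows_for_pii(rows):
    """Return (row_index, field, pattern_name) for every suspected hit."""
    findings = []
    for i, row in enumerate(rows):
        for field, value in row.items():
            for name, pattern in PII_PATTERNS.items():
                if isinstance(value, str) and pattern.search(value):
                    findings.append((i, field, name))
    return findings
```

A non-empty findings list would fail the pipeline that populates the test environment, preventing un-anonymised production records from ever being present there.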
Scenario C — Staging Agent Promoted to Production Without Gate: An AI customer service agent is deployed in the staging environment for UAT. The staging environment shares the same Kubernetes cluster as production, separated by namespaces. A deployment automation error promotes the staging agent to the production namespace before UAT is complete. The staging agent, which has a known bug that causes it to hallucinate product specifications, begins serving live customer queries. Over 8 hours, the agent provides 1,247 customers with incorrect product specifications, 89 of whom make purchasing decisions based on the incorrect information.
What went wrong: Staging and production shared infrastructure (same Kubernetes cluster). The promotion process was automated without a structural gate. The namespace separation was a logical boundary, not a structural one — a deployment manifest change was sufficient to cross it. Consequence: Product liability claims from 89 customers, Trading Standards investigation for misleading product information, mandatory recall of AI-assisted customer service pending remediation, estimated cost £670,000.
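The missing structural gate in Scenario C can be illustrated as a promotion check that refuses to proceed until every precondition is recorded. The record shape, field names, and approval threshold below are assumptions; the essential property is that the deployment tooling physically cannot write to production except through this gate — a check the automation can bypass is a convention, not a structure.

```python
from dataclasses import dataclass, field

@dataclass
class PromotionRequest:
    """Hypothetical record accompanying any staging-to-production promotion."""
    agent_version: str
    uat_signed_off: bool = False
    test_suite_passed: bool = False
    approvers: list = field(default_factory=list)

REQUIRED_APPROVERS = 2  # multi-party approval; an assumed policy value

def gate_promotion(req: PromotionRequest) -> None:
    """Raise unless every promotion precondition is satisfied."""
    failures = []
    if not req.uat_signed_off:
        failures.append("UAT sign-off missing")
    if not req.test_suite_passed:
        failures.append("test suite has not passed")
    if len(set(req.approvers)) < REQUIRED_APPROVERS:
        failures.append(f"need {REQUIRED_APPROVERS} distinct approvers")
    if failures:
        raise PermissionError("promotion blocked: " + "; ".join(failures))
```

In Scenario C, the incomplete UAT would have left `uat_signed_off` false, and the automation error could not have promoted the buggy agent.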
Scope: This dimension applies to any organisation that develops, tests, and deploys AI agents across multiple environments. The canonical environment set is development, test, staging, production, and disaster recovery, but organisations may define additional environments (e.g., sandbox, pre-production, canary, blue/green). The dimension applies regardless of whether environments are hosted on-premises, in cloud infrastructure, or in hybrid configurations. It applies to all components of the agent stack: the agent runtime, the data stores the agent accesses, the APIs the agent calls, the infrastructure the agent runs on, and the configuration that defines the agent's behaviour. Single-environment deployments (where only production exists) are excluded, though such deployments should consider whether the absence of non-production environments creates other risks (e.g., inability to test safely).
4.1. A conforming system MUST maintain structurally separate infrastructure for each environment (development, test, staging, production, and any additional defined environments), with no shared compute, storage, database, or networking resources between environments.
4.2. A conforming system MUST ensure that credentials (service accounts, API keys, tokens, certificates) are environment-specific — no credential valid in the production environment is valid in any non-production environment, and vice versa.
4.3. A conforming system MUST prevent production data from being present in non-production environments unless the data has been anonymised, pseudonymised, or replaced with synthetic data that cannot be reversed to identify production entities.
4.4. A conforming system MUST enforce network-level separation between environments such that no network path exists between a non-production environment and a production data store, API, or service without traversing an explicit, logged, and auditable gateway.
4.5. A conforming system MUST implement a controlled promotion process for moving agent configurations, code, and models from non-production to production environments, with approval gates, automated testing, and audit trails.
4.6. A conforming system SHOULD implement environment-specific DNS or service discovery such that environment-specific resource names prevent accidental cross-environment connections (e.g., db.prod.internal vs. db.dev.internal with no DNS resolution of db.prod.internal from the development network).
4.7. A conforming system SHOULD tag all infrastructure resources with their environment identifier and implement automated drift detection that alerts when resources are misconfigured across environment boundaries.
4.8. A conforming system SHOULD implement data masking or synthetic data generation for non-production environments, with automated verification that no production PII or sensitive data exists in non-production data stores.
4.9. A conforming system MAY implement ephemeral non-production environments that are created on demand from environment templates and destroyed after use, eliminating persistent non-production environments where configuration drift can accumulate.
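Requirement 4.2's credential scoping can be made structural rather than conventional by binding each credential to one environment with an environment-specific signing key. The sketch below is a minimal illustration — the key names, token format, and subject scheme are all assumptions, and real keys would live in environment-specific vaults, never in source code.

```python
import hashlib
import hmac

# Assumed per-environment signing keys held in environment-specific vaults.
# Because a production service only holds the production key, a development
# token fails verification there structurally, not by convention.
ENV_KEYS = {
    "development": b"dev-key",
    "production": b"prod-key",
}

def issue_token(environment: str, subject: str) -> str:
    """Mint a token bound to one environment via that environment's key."""
    sig = hmac.new(ENV_KEYS[environment], subject.encode(), hashlib.sha256).hexdigest()
    return f"{environment}.{subject}.{sig}"  # subject assumed to contain no dots

def verify_token(token: str, environment: str) -> bool:
    """Verify using only the local environment's key."""
    env, subject, sig = token.split(".")
    expected = hmac.new(ENV_KEYS[environment], subject.encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)
```

The same property holds for cloud-native credentials (separate accounts, separate identity providers); the point is that cross-environment validity is impossible, not merely prohibited.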
Environment segregation is a foundational principle of software engineering that predates AI systems by decades. Its importance is amplified for AI agents because of three factors unique to AI deployments.
First, AI agents interact with external systems — they call APIs, write to databases, send messages, and trigger workflows. A traditional software bug in a development environment might produce incorrect log output. An AI agent bug in a development environment with production access might send 5,000 emails to real customers, place real financial transactions, or modify real database records. The blast radius of a development-environment error is determined by the production access available from that environment.
Second, AI agents require realistic data for effective testing, creating pressure to use production data in non-production environments. This pressure is structurally different from traditional software testing because AI agents' behaviour depends on data distributions, not just data structures. A test database with 100 synthetic records does not exercise the same agent behaviour as a production database with 890,000 real records. This legitimate need for realistic data must be met with anonymisation and synthetic data generation, not with production data copies.
Third, AI agent behaviour is non-deterministic and difficult to predict from code inspection alone. A traditional software promotion from staging to production carries the risk that the code behaves differently in production — but the behaviour is deterministic and reproducible. An AI agent promotion carries the additional risk that the agent's behaviour in production diverges from staging because the production data distribution, user interaction patterns, or API response patterns differ. This makes the promotion gate between environments more critical, not less.
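The need for realistic data without production copies (the second factor above) can be partly met with deterministic, keyed pseudonymisation: the same production value always maps to the same token, so joins and relationships survive, but the token cannot be reversed without the key. The key and naming below are assumptions; this sketch covers identifier linkage only, and matching production data distributions would additionally require fitted synthetic generators.

```python
import hashlib
import hmac

# Assumed masking secret for the anonymisation pipeline; it must never be
# derivable from, or stored alongside, any production credential.
PSEUDO_KEY = b"non-production-masking-key"

def pseudonymise(value: str, namespace: str) -> str:
    """Deterministic one-way replacement. The same production value always
    maps to the same token, preserving referential integrity across tables,
    while the token cannot be reversed to the production entity."""
    digest = hmac.new(PSEUDO_KEY, f"{namespace}:{value}".encode(), hashlib.sha256)
    return f"{namespace}-{digest.hexdigest()[:12]}"

# Example: a customer_id pseudonymised identically in the invoices table and
# the customers table keeps the invoice-to-customer join intact for testing.
```

Rotating the key breaks linkage to prior extracts, which is usually desirable between test cycles.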
Environment segregation for AI agents requires structural separation across all infrastructure layers — compute, network, storage, identity, and configuration.
Recommended patterns:
- Deploy each environment in its own cloud account or fully isolated network segment, with no shared compute, storage, database, or networking resources.
- Manage credentials in environment-specific vaults so that no credential is valid across an environment boundary.
- Use environment-scoped DNS or service discovery (e.g., db.prod.internal resolvable only from the production network).
- Anonymise or synthesise data through an automated pipeline, with verification, before it enters any non-production environment.
- Route all promotion of code, configuration, and models through controlled, logged gates with automated testing and multi-party approval.
- Tag every infrastructure resource with its environment identifier and run automated drift detection against environment baselines.
- Prefer ephemeral non-production environments created from templates and destroyed after use.
Anti-patterns to avoid:
- Separating environments by namespace, naming convention, or configuration flag on shared infrastructure (Scenario C).
- Copying production connection strings or credentials into non-production configuration "temporarily" (Scenario A).
- Cloning production databases into non-production environments without anonymisation (Scenario B).
- Maintaining standing network paths, such as troubleshooting VPN access, between non-production networks and production systems.
- Automating promotion without a structural gate, so that a deployment manifest change alone can cross the environment boundary.
Financial Services. The PRA and FCA expect firms to maintain separate environments for development, testing, and production, with controls preventing production data from being used in non-production environments without appropriate protection. MiFID II Article 48 requires that trading systems (including AI trading agents) be tested in environments that do not interact with production markets.
Healthcare. HIPAA requires that ePHI in test environments be de-identified per the Safe Harbor or Expert Determination methods. Using production ePHI in test environments without de-identification is a HIPAA violation regardless of the test environment's security controls.
Government. NIST SP 800-53 CM-2 (Baseline Configuration) and SA-11 (Developer Testing) require environment segregation. FedRAMP requires separate environments with documented boundaries.
Basic Implementation — Separate environment configurations exist for development, test, and production. Credentials are environment-specific. Network access between environments is restricted but not fully isolated (e.g., VPN access to production from development networks exists for troubleshooting). Production data in test environments is partially anonymised. Promotion between environments follows a documented process with manual approval. Limitations: network paths between environments exist; data anonymisation may be incomplete; promotion process relies on manual discipline.
Intermediate Implementation — Environments are deployed in separate cloud accounts or fully isolated network segments. No network path exists between non-production and production environments except through controlled gateways. Credentials are managed in environment-specific vaults with no cross-environment access. Production data is anonymised through an automated pipeline with verification. Promotion gates are automated with mandatory test suite pass, security scan, and multi-party approval. Environment drift detection runs daily.
Advanced Implementation — All intermediate capabilities plus: environment segregation has been verified through independent adversarial testing including cross-environment network scanning, credential scope analysis, and data leakage detection. Non-production environments are ephemeral — created from templates and destroyed after use. Canary deployments provide a controlled intermediate step between staging and full production. The promotion pipeline includes automated behavioural comparison between the agent's staging behaviour and expected production behaviour. The organisation can demonstrate to regulators that no path exists for production data to reach non-production environments or for non-production agent configurations to reach production without passing all promotion gates.
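The drift detection referenced in the intermediate and advanced tiers can be sketched as a comparison of a live resource inventory against an environment baseline. The resource schema, names, and baseline source below are illustrative assumptions; real inventories would come from cloud provider APIs or infrastructure-as-code state.

```python
# Assumed environment baseline; in practice generated from IaC state, not
# hand-maintained. Each resource carries its environment tag and placement.
BASELINE = {
    "db-invoices": {"environment": "test", "network": "test-vpc"},
    "agent-runtime": {"environment": "test", "network": "test-vpc"},
}

def detect_drift(live_inventory: dict, expected_env: str) -> list:
    """Flag resources whose tags or placement cross an environment boundary."""
    alerts = []
    for name, attrs in live_inventory.items():
        baseline = BASELINE.get(name)
        if baseline is None:
            alerts.append((name, "untracked resource"))
        elif attrs.get("environment") != expected_env:
            alerts.append(
                (name, f"tagged {attrs.get('environment')!r}, expected {expected_env!r}")
            )
        elif baseline != attrs:
            alerts.append((name, "configuration differs from baseline"))
    return alerts
```

Run daily (or continuously), a non-empty alert list pages the owning team; resources tagged for the wrong environment are exactly the misconfigurations that let Scenario A persist for six weeks.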
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Cross-Environment Network Isolation
Test 8.2: Cross-Environment Credential Invalidity
Test 8.3: Production Data Absence in Non-Production Environments
Test 8.4: Promotion Gate Enforcement
Test 8.5: Default Environment Deny
Test 8.6: Environment Configuration Drift Detection
Test 8.7: Agent Behaviour Isolation Between Environments
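As one example of how these tests might be automated, Test 8.1 can be run from inside each non-production network as a connectivity probe against known production endpoints, asserting that every attempt fails. The endpoint names below are assumptions taken from the DNS convention in requirement 4.6; a full implementation would enumerate endpoints from the production resource inventory rather than a hard-coded list.

```python
import socket

# Assumed production endpoints, following the db.prod.internal convention.
PRODUCTION_ENDPOINTS = [("db.prod.internal", 5432), ("api.prod.internal", 443)]

def endpoint_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """True only if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refusals, timeouts, and DNS resolution failures
        return False

def test_cross_environment_network_isolation():
    """Run from a non-production network: no production endpoint may answer."""
    reachable = [(h, p) for h, p in PRODUCTION_ENDPOINTS if endpoint_reachable(h, p)]
    assert not reachable, f"production endpoints reachable from this network: {reachable}"
```

A passing run demonstrates the structural property of requirement 4.4 for the sampled endpoints; Tests 8.2 and 8.3 would similarly attempt production credentials and scan non-production stores, asserting failure and absence respectively.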
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| PRA SS1/21 | Model Risk Management — Development & Testing Controls | Direct requirement |
| MiFID II | Article 48 (Testing of Algorithms) | Direct requirement |
| HIPAA | §164.514 (De-identification of PHI) | Direct requirement |
| NIST SP 800-53 | CM-2 (Baseline Configuration), SA-11 (Developer Testing) | Direct requirement |
| ISO 27001 | A.8.31 (Separation of Development, Test, Production) | Direct requirement |
| DORA | Article 8 (ICT Systems, Protocols, Tools) | Supports compliance |
| FCA SYSC | 6.1.1R (Systems and Controls) | Supports compliance |
PRA SS1/21 expects firms to maintain separate environments for model development, validation, and production deployment. The supervisory statement specifically addresses the risk that development practices may inadvertently affect production systems. For AI agents, this expectation translates to structural environment segregation with controlled promotion gates. The PRA will examine whether a firm's development environment could, through any configuration error or deliberate action, affect the production agent's behaviour or access production data.
Article 48 requires investment firms to test algorithmic trading systems in environments that do not interact with production trading venues. For AI trading agents, this means the testing environment must be structurally isolated from production market connections. The testing environment may connect to exchange-provided test environments (simulators) but must have no network path to production order routing systems. Compliance requires demonstrating that no test order could reach a production trading venue.
The HIPAA Privacy Rule permits the use of de-identified health information without restriction. For AI agent development, this means production PHI can only be used in test environments if it has been de-identified per the Safe Harbor method (removal of 18 specified identifiers) or the Expert Determination method. AG-301's requirement for data anonymisation in non-production environments implements this obligation at the infrastructure level.
ISO 27001 Annex A control A.8.31 directly requires the separation of development, testing, and operational environments to reduce the risks of unauthorised access or changes to the operational environment. AG-301 operationalises this control for AI agent deployments, extending the traditional interpretation to cover the AI-specific risks of non-deterministic behaviour, data distribution sensitivity, and cross-environment agent action execution.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Environment-crossing — production systems affected by non-production activities, or production data exposed in non-production environments |
Consequence chain: Environment segregation failure creates two distinct damage paths.
Path 1: Non-production to production contamination — a development or test agent accesses production systems and executes actions against real data, real customers, or real financial instruments. The impact is immediate and potentially irreversible: payments made, emails sent, records modified, orders placed. Recovery requires identifying all cross-environment actions and reversing them, which may be impossible for external-facing actions.
Path 2: Production to non-production data leakage — production data is present in non-production environments with weaker security controls, leading to a data breach through the non-production environment. The impact scales with the volume of production data present and the number of individuals whose data is exposed.
Both paths can result in regulatory enforcement action, mandatory breach notification, financial penalties, and reputational damage. For financial services firms, cross-environment contamination of trading systems may constitute a market integrity incident requiring regulatory disclosure.
Cross-references: AG-302 (Production Write Isolation Governance) provides additional controls specifically for production write paths. AG-299 (Workspace Segmentation Governance) addresses within-environment segmentation. AG-300 (Client-Tenant Segregation Governance) addresses cross-tenant isolation that must be maintained within each environment. AG-015 (Organisational Namespace Isolation) establishes the namespace isolation principle. AG-034 (Cross-Domain Boundary Enforcement) covers domain-level boundary controls. AG-013 (Data Sensitivity and Exfiltration Prevention) addresses data protection controls relevant to production data in non-production environments.