Environment Segregation Governance requires that AI agents operating in development, test, staging, production, and disaster recovery environments be structurally isolated from each other at the infrastructure layer. An agent running in the development environment cannot access production data, production APIs, or production infrastructure — and this separation is enforced by network boundaries, credential scoping, and access controls, not by configuration conventions or naming standards alone. The environments are distinct infrastructure domains with no shared credentials, no shared data stores, and no network paths between them unless explicitly authorised through a controlled promotion process. Without structural environment segregation, a single misconfiguration in a development or test environment can expose production data, corrupt production state, or introduce untested agent behaviour into live operations.
Scenario A — Development Agent Accesses Production Database: An engineering team develops an AI agent for automated invoice processing. During development, the team needs realistic test data and copies the production database connection string into the development environment configuration "temporarily." The developer intends to replace it with a test database connection after initial testing. Six weeks later, the production connection string remains in the development configuration. A junior developer runs a load test that submits 5,000 synthetic invoices to what they believe is the test database. The invoices are created in the production database. The organisation's ERP system picks up the invoices and begins generating payment runs. Accounts payable processes £2.3 million in fraudulent payments before the error is detected the following Monday.
What went wrong: No structural barrier prevented the development environment from accessing the production database. The connection was possible because development and production shared network access to the database server and the development service account had production database credentials. Consequence: £2.3 million in erroneous payments requiring recall, 3 payments unrecoverable (£127,000), regulatory notification to the FCA, external audit of all AI development practices, 4-month remediation programme, 2 staff disciplinary proceedings.
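One structural safeguard that would have caught Scenario A is a startup guard that refuses to run when a connection string resolves to a host outside the current environment. The sketch below is illustrative only: the hostnames, suffix allow-lists, and configuration shape are assumptions, and a real deployment would enforce the same property at the network layer rather than relying on application code alone.

```python
import re

# Hypothetical per-environment host allow-lists; a real deployment would
# derive these from environment-scoped DNS zones rather than source code.
ALLOWED_DB_HOST_SUFFIXES = {
    "development": (".dev.internal",),
    "test": (".test.internal",),
    "staging": (".staging.internal",),
    "production": (".prod.internal",),
}

def validate_connection_string(conn_str: str, environment: str) -> None:
    """Fail fast at startup if a connection string points outside this environment."""
    match = re.search(r"@([^:/@]+)", conn_str)  # host part of user:pass@host:port/db
    if not match:
        raise ValueError("cannot parse host from connection string")
    host = match.group(1)
    if not host.endswith(ALLOWED_DB_HOST_SUFFIXES[environment]):
        raise RuntimeError(
            f"refusing to start: host {host!r} is not permitted in the "
            f"{environment} environment"
        )

if __name__ == "__main__":
    # A development host passes in development; a production host would raise.
    validate_connection_string(
        "postgresql://svc:secret@invoices.dev.internal:5432/erp", "development"
    )
```

Had such a guard existed, the "temporary" production connection string would have failed the very first development start-up rather than surviving six weeks.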
Scenario B — Test Environment Leaks Production Customer Data: An organisation creates a test environment by cloning the production environment, including a full copy of the production database with 890,000 customer records. The test database is not anonymised — it contains real names, addresses, financial data, and government identifiers. The test environment has relaxed security controls to facilitate development. A contractor with test environment access exports the database for local development. The export is stored on a personal laptop that is later compromised in a phishing attack. 890,000 customer records are exfiltrated.
What went wrong: Production data was present in the test environment. Environment segregation should prevent production data from existing in non-production environments without anonymisation. The relaxed security controls in the test environment were appropriate for test data but catastrophic for production data. Consequence: Data breach affecting 890,000 customers, ICO investigation and £4.4 million fine, class action lawsuit, mandatory credit monitoring for all affected customers (cost: £8.9 million), CEO resignation.
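An automated verification of the kind Scenario B lacked can be sketched as a scan of non-production data stores for patterns of real identifiers. The detector patterns and row schema below are assumptions for illustration — a production scanner would use far richer detection and run against every non-production store as part of the verification pipeline.

```python
import re

# Illustrative detectors for identifiers that should never appear in
# non-production data stores; these patterns are assumptions, not exhaustive.
PII_PATTERNS = {
    "uk_nino": re.compile(r"\b[A-CEGHJ-PR-TW-Z]{2}\d{6}[A-D]\b"),  # National Insurance no.
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "uk_postcode": re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2}\b"),
}

def scan_rows_for_pii(rows):
    """Return (row_index, field, pattern_name) for every suspected hit."""
    findings = []
    for i, row in enumerate(rows):
        for field, value in row.items():
            for name, pattern in PII_PATTERNS.items():
                if isinstance(value, str) and pattern.search(value):
                    findings.append((i, field, name))
    return findings
```

A non-empty findings list would fail the pipeline that populates the test environment, preventing un-anonymised production records from ever being present there.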
Scenario C — Staging Agent Promoted to Production Without Gate: An AI customer service agent is deployed in the staging environment for UAT. The staging environment shares the same Kubernetes cluster as production, separated by namespaces. A deployment automation error promotes the staging agent to the production namespace before UAT is complete. The staging agent, which has a known bug that causes it to hallucinate product specifications, begins serving live customer queries. Over 8 hours, the agent provides 1,247 customers with incorrect product specifications, 89 of whom make purchasing decisions based on the incorrect information.
What went wrong: Staging and production shared infrastructure (same Kubernetes cluster). The promotion process was automated without a structural gate. The namespace separation was a logical boundary, not a structural one — a deployment manifest change was sufficient to cross it. Consequence: Product liability claims from 89 customers, Trading Standards investigation for misleading product information, mandatory recall of AI-assisted customer service pending remediation, estimated cost £670,000.
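The missing structural gate in Scenario C can be illustrated as a promotion check that refuses to proceed until every precondition is recorded. The record shape, field names, and approval threshold below are assumptions; the essential property is that the deployment tooling physically cannot write to production except through this gate — a check the automation can bypass is a convention, not a structure.

```python
from dataclasses import dataclass, field

@dataclass
class PromotionRequest:
    """Hypothetical record accompanying any staging-to-production promotion."""
    agent_version: str
    uat_signed_off: bool = False
    test_suite_passed: bool = False
    approvers: list = field(default_factory=list)

REQUIRED_APPROVERS = 2  # multi-party approval; an assumed policy value

def gate_promotion(req: PromotionRequest) -> None:
    """Raise unless every promotion precondition is satisfied."""
    failures = []
    if not req.uat_signed_off:
        failures.append("UAT sign-off missing")
    if not req.test_suite_passed:
        failures.append("test suite has not passed")
    if len(set(req.approvers)) < REQUIRED_APPROVERS:
        failures.append(f"need {REQUIRED_APPROVERS} distinct approvers")
    if failures:
        raise PermissionError("promotion blocked: " + "; ".join(failures))
```

In Scenario C, the incomplete UAT would have left `uat_signed_off` false, and the automation error could not have promoted the buggy agent.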
Scope: This dimension applies to any organisation that develops, tests, and deploys AI agents across multiple environments. The canonical environment set is development, test, staging, production, and disaster recovery, but organisations may define additional environments (e.g., sandbox, pre-production, canary, blue/green). The dimension applies regardless of whether environments are hosted on-premises, in cloud infrastructure, or in hybrid configurations. It applies to all components of the agent stack: the agent runtime, the data stores the agent accesses, the APIs the agent calls, the infrastructure the agent runs on, and the configuration that defines the agent's behaviour. Single-environment deployments (where only production exists) are excluded, though such deployments should consider whether the absence of non-production environments creates other risks (e.g., inability to test safely).
4.1. A conforming system MUST maintain structurally separate infrastructure for each environment (development, test, staging, production, and any additional defined environments), with no shared compute, storage, database, or networking resources between environments.
4.2. A conforming system MUST ensure that credentials (service accounts, API keys, tokens, certificates) are environment-specific — no credential valid in the production environment is valid in any non-production environment, and vice versa.
4.3. A conforming system MUST prevent production data from being present in non-production environments unless the data has been anonymised, pseudonymised, or replaced with synthetic data that cannot be reversed to identify production entities.
4.4. A conforming system MUST enforce network-level separation between environments such that no network path exists between a non-production environment and a production data store, API, or service without traversing an explicit, logged, and auditable gateway.
4.5. A conforming system MUST implement a controlled promotion process for moving agent configurations, code, and models from non-production to production environments, with approval gates, automated testing, and audit trails.
4.6. A conforming system SHOULD implement environment-specific DNS or service discovery such that environment-specific resource names prevent accidental cross-environment connections (e.g., db.prod.internal vs. db.dev.internal with no DNS resolution of db.prod.internal from the development network).
4.7. A conforming system SHOULD tag all infrastructure resources with their environment identifier and implement automated drift detection that alerts when resources are misconfigured across environment boundaries.
4.8. A conforming system SHOULD implement data masking or synthetic data generation for non-production environments, with automated verification that no production PII or sensitive data exists in non-production data stores.
4.9. A conforming system MAY implement ephemeral non-production environments that are created on demand from environment templates and destroyed after use, eliminating persistent non-production environments where configuration drift can accumulate.
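Requirement 4.2's credential scoping can be made structural rather than conventional by binding each credential to one environment with an environment-specific signing key. The sketch below is a minimal illustration — the key names, token format, and subject scheme are all assumptions, and real keys would live in environment-specific vaults, never in source code.

```python
import hashlib
import hmac

# Assumed per-environment signing keys held in environment-specific vaults.
# Because a production service only holds the production key, a development
# token fails verification there structurally, not by convention.
ENV_KEYS = {
    "development": b"dev-key",
    "production": b"prod-key",
}

def issue_token(environment: str, subject: str) -> str:
    """Mint a token bound to one environment via that environment's key."""
    sig = hmac.new(ENV_KEYS[environment], subject.encode(), hashlib.sha256).hexdigest()
    return f"{environment}.{subject}.{sig}"  # subject assumed to contain no dots

def verify_token(token: str, environment: str) -> bool:
    """Verify using only the local environment's key."""
    env, subject, sig = token.split(".")
    expected = hmac.new(ENV_KEYS[environment], subject.encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)
```

The same property holds for cloud-native credentials (separate accounts, separate identity providers); the point is that cross-environment validity is impossible, not merely prohibited.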
Environment segregation is a foundational principle of software engineering that predates AI systems by decades. Its importance is amplified for AI agents because of three factors unique to AI deployments.
First, AI agents interact with external systems — they call APIs, write to databases, send messages, and trigger workflows. A traditional software bug in a development environment might produce incorrect log output. An AI agent bug in a development environment with production access might send 5,000 emails to real customers, place real financial transactions, or modify real database records. The blast radius of a development-environment error is determined by the production access available from that environment.
Second, AI agents require realistic data for effective testing, creating pressure to use production data in non-production environments. This pressure is structurally different from traditional software testing because AI agents' behaviour depends on data distributions, not just data structures. A test database with 100 synthetic records does not exercise the same agent behaviour as a production database with 890,000 real records. This legitimate need for realistic data must be met with anonymisation and synthetic data generation, not with production data copies.
Third, AI agent behaviour is non-deterministic and difficult to predict from code inspection alone. A traditional software promotion from staging to production carries the risk that the code behaves differently in production — but the behaviour is deterministic and reproducible. An AI agent promotion carries the additional risk that the agent's behaviour in production diverges from staging because the production data distribution, user interaction patterns, or API response patterns differ. This makes the promotion gate between environments more critical, not less.
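The need for realistic data without production copies (the second factor above) can be partly met with deterministic, keyed pseudonymisation: the same production value always maps to the same token, so joins and relationships survive, but the token cannot be reversed without the key. The key and naming below are assumptions; this sketch covers identifier linkage only, and matching production data distributions would additionally require fitted synthetic generators.

```python
import hashlib
import hmac

# Assumed masking secret for the anonymisation pipeline; it must never be
# derivable from, or stored alongside, any production credential.
PSEUDO_KEY = b"non-production-masking-key"

def pseudonymise(value: str, namespace: str) -> str:
    """Deterministic one-way replacement. The same production value always
    maps to the same token, preserving referential integrity across tables,
    while the token cannot be reversed to the production entity."""
    digest = hmac.new(PSEUDO_KEY, f"{namespace}:{value}".encode(), hashlib.sha256)
    return f"{namespace}-{digest.hexdigest()[:12]}"

# Example: a customer_id pseudonymised identically in the invoices table and
# the customers table keeps the invoice-to-customer join intact for testing.
```

Rotating the key breaks linkage to prior extracts, which is usually desirable between test cycles.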
Environment segregation for AI agents requires structural separation across all infrastructure layers — compute, network, storage, identity, and configuration.
Recommended patterns:
- Deploy each environment in its own cloud account or fully isolated network segment, with no shared compute, storage, database, or networking resources.
- Manage credentials in environment-specific vaults so that no credential is valid across an environment boundary.
- Use environment-scoped DNS or service discovery (e.g., db.prod.internal resolvable only from the production network).
- Anonymise or synthesise data through an automated pipeline, with verification, before it enters any non-production environment.
- Route all promotion of code, configuration, and models through controlled, logged gates with automated testing and multi-party approval.
- Tag every infrastructure resource with its environment identifier and run automated drift detection against environment baselines.
- Prefer ephemeral non-production environments created from templates and destroyed after use.
Anti-patterns to avoid:
- Separating environments by namespace, naming convention, or configuration flag on shared infrastructure (Scenario C).
- Copying production connection strings or credentials into non-production configuration "temporarily" (Scenario A).
- Cloning production databases into non-production environments without anonymisation (Scenario B).
- Maintaining standing network paths, such as troubleshooting VPN access, between non-production networks and production systems.
- Automating promotion without a structural gate, so that a deployment manifest change alone can cross the environment boundary.
Financial Services. The PRA and FCA expect firms to maintain separate environments for development, testing, and production, with controls preventing production data from being used in non-production environments without appropriate protection. MiFID II Article 48 requires that trading systems (including AI trading agents) be tested in environments that do not interact with production markets.
Healthcare. HIPAA requires that ePHI in test environments be de-identified per the Safe Harbor or Expert Determination methods. Using production ePHI in test environments without de-identification is a HIPAA violation regardless of the test environment's security controls.
Government. NIST SP 800-53 CM-2 (Baseline Configuration) and SA-11 (Developer Testing) require environment segregation. FedRAMP requires separate environments with documented boundaries.
Basic Implementation — Separate environment configurations exist for development, test, and production. Credentials are environment-specific. Network access between environments is restricted but not fully isolated (e.g., VPN access to production from development networks exists for troubleshooting). Production data in test environments is partially anonymised. Promotion between environments follows a documented process with manual approval. Limitations: network paths between environments exist; data anonymisation may be incomplete; promotion process relies on manual discipline.
Intermediate Implementation — Environments are deployed in separate cloud accounts or fully isolated network segments. No network path exists between non-production and production environments except through controlled gateways. Credentials are managed in environment-specific vaults with no cross-environment access. Production data is anonymised through an automated pipeline with verification. Promotion gates are automated with mandatory test suite pass, security scan, and multi-party approval. Environment drift detection runs daily.
Advanced Implementation — All intermediate capabilities plus: environment segregation has been verified through independent adversarial testing including cross-environment network scanning, credential scope analysis, and data leakage detection. Non-production environments are ephemeral — created from templates and destroyed after use. Canary deployments provide a controlled intermediate step between staging and full production. The promotion pipeline includes automated behavioural comparison between the agent's staging behaviour and expected production behaviour. The organisation can demonstrate to regulators that no path exists for production data to reach non-production environments or for non-production agent configurations to reach production without passing all promotion gates.
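The drift detection referenced in the intermediate and advanced tiers can be sketched as a comparison of a live resource inventory against an environment baseline. The resource schema, names, and baseline source below are illustrative assumptions; real inventories would come from cloud provider APIs or infrastructure-as-code state.

```python
# Assumed environment baseline; in practice generated from IaC state, not
# hand-maintained. Each resource carries its environment tag and placement.
BASELINE = {
    "db-invoices": {"environment": "test", "network": "test-vpc"},
    "agent-runtime": {"environment": "test", "network": "test-vpc"},
}

def detect_drift(live_inventory: dict, expected_env: str) -> list:
    """Flag resources whose tags or placement cross an environment boundary."""
    alerts = []
    for name, attrs in live_inventory.items():
        baseline = BASELINE.get(name)
        if baseline is None:
            alerts.append((name, "untracked resource"))
        elif attrs.get("environment") != expected_env:
            alerts.append(
                (name, f"tagged {attrs.get('environment')!r}, expected {expected_env!r}")
            )
        elif baseline != attrs:
            alerts.append((name, "configuration differs from baseline"))
    return alerts
```

Run daily (or continuously), a non-empty alert list pages the owning team; resources tagged for the wrong environment are exactly the misconfigurations that let Scenario A persist for six weeks.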
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Cross-Environment Network Isolation
Test 8.2: Cross-Environment Credential Invalidity
Test 8.3: Production Data Absence in Non-Production Environments
Test 8.4: Promotion Gate Enforcement
Test 8.5: Default Environment Deny
Test 8.6: Environment Configuration Drift Detection
Test 8.7: Agent Behaviour Isolation Between Environments
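As one example of how these tests might be automated, Test 8.1 can be run from inside each non-production network as a connectivity probe against known production endpoints, asserting that every attempt fails. The endpoint names below are assumptions taken from the DNS convention in requirement 4.6; a full implementation would enumerate endpoints from the production resource inventory rather than a hard-coded list.

```python
import socket

# Assumed production endpoints, following the db.prod.internal convention.
PRODUCTION_ENDPOINTS = [("db.prod.internal", 5432), ("api.prod.internal", 443)]

def endpoint_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """True only if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refusals, timeouts, and DNS resolution failures
        return False

def test_cross_environment_network_isolation():
    """Run from a non-production network: no production endpoint may answer."""
    reachable = [(h, p) for h, p in PRODUCTION_ENDPOINTS if endpoint_reachable(h, p)]
    assert not reachable, f"production endpoints reachable from this network: {reachable}"
```

A passing run demonstrates the structural property of requirement 4.4 for the sampled endpoints; Tests 8.2 and 8.3 would similarly attempt production credentials and scan non-production stores, asserting failure and absence respectively.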
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| PRA SS1/21 | Model Risk Management — Development & Testing Controls | Direct requirement |
| MiFID II | Article 48 (Testing of Algorithms) | Direct requirement |
| HIPAA | §164.514 (De-identification of PHI) | Direct requirement |
| NIST SP 800-53 | CM-2 (Baseline Configuration), SA-11 (Developer Testing) | Direct requirement |
| ISO 27001 | A.8.31 (Separation of Development, Test, Production) | Direct requirement |
| DORA | Article 8 (ICT Systems, Protocols, Tools) | Supports compliance |
| FCA SYSC | 6.1.1R (Systems and Controls) | Supports compliance |
PRA SS1/21 expects firms to maintain separate environments for model development, validation, and production deployment. The supervisory statement specifically addresses the risk that development practices may inadvertently affect production systems. For AI agents, this expectation translates to structural environment segregation with controlled promotion gates. The PRA will examine whether a firm's development environment could, through any configuration error or deliberate action, affect the production agent's behaviour or access production data.
Article 48 requires investment firms to test algorithmic trading systems in environments that do not interact with production trading venues. For AI trading agents, this means the testing environment must be structurally isolated from production market connections. The testing environment may connect to exchange-provided test environments (simulators) but must have no network path to production order routing systems. Compliance requires demonstrating that no test order could reach a production trading venue.
The HIPAA Privacy Rule permits the use of de-identified health information without restriction. For AI agent development, this means production PHI can only be used in test environments if it has been de-identified per the Safe Harbor method (removal of 18 specified identifiers) or the Expert Determination method. AG-301's requirement for data anonymisation in non-production environments implements this obligation at the infrastructure level.
ISO 27001 Annex A control A.8.31 directly requires the separation of development, testing, and operational environments to reduce the risks of unauthorised access or changes to the operational environment. AG-301 operationalises this control for AI agent deployments, extending the traditional interpretation to cover the AI-specific risks of non-deterministic behaviour, data distribution sensitivity, and cross-environment agent action execution.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Environment-crossing — production systems affected by non-production activities, or production data exposed in non-production environments |
Consequence chain: Environment segregation failure creates two distinct damage paths.
Path 1: Non-production to production contamination — a development or test agent accesses production systems and executes actions against real data, real customers, or real financial instruments. The impact is immediate and potentially irreversible: payments made, emails sent, records modified, orders placed. Recovery requires identifying all cross-environment actions and reversing them, which may be impossible for external-facing actions.
Path 2: Production to non-production data leakage — production data is present in non-production environments with weaker security controls, leading to a data breach through the non-production environment. The impact scales with the volume of production data present and the number of individuals whose data is exposed.
Both paths can result in regulatory enforcement action, mandatory breach notification, financial penalties, and reputational damage. For financial services firms, cross-environment contamination of trading systems may constitute a market integrity incident requiring regulatory disclosure.
Cross-references: AG-302 (Production Write Isolation Governance) provides additional controls specifically for production write paths. AG-299 (Workspace Segmentation Governance) addresses within-environment segmentation. AG-300 (Client-Tenant Segregation Governance) addresses cross-tenant isolation that must be maintained within each environment. AG-015 (Organisational Namespace Isolation) establishes the namespace isolation principle. AG-034 (Cross-Domain Boundary Enforcement) covers domain-level boundary controls. AG-013 (Data Sensitivity and Exfiltration Prevention) addresses data protection controls relevant to production data in non-production environments.