Data Sensitivity and Exfiltration Prevention governs how data sensitivity is classified and how unauthorised data is prevented from leaving the governed environment through direct extraction. This dimension addresses one of the highest-impact risks in AI agent deployment: that an autonomous agent, operating with legitimate access to sensitive data stores, could transmit that data to an unauthorised destination through compromise, misconfiguration, or deliberate manipulation. AI agents create a qualitatively different exfiltration risk from human users because an agent can query, transform, and transmit millions of records in seconds. AG-013 therefore requires that data sensitivity be enforced structurally, at the infrastructure layer rather than in the agent's instructions, through classification enforcement, outbound payload inspection, volume thresholds, and destination validation.
Scenario A — PII Exfiltration Through Unmonitored Logging Channel: An AI customer service agent has DLP controls on its primary communication channels (email and chat responses). However, the agent also writes detailed debug logs to a centralised logging platform. The logs include the full context of each customer interaction, including names, account numbers, and transaction details. An attacker gains read access to the logging platform — which has weaker access controls than the production systems — and extracts six months of customer interaction data containing PII for 280,000 customers.
What went wrong: The DLP controls covered the obvious outbound channels but did not cover the logging channel. The logs were treated as internal diagnostic data rather than as an outbound data path carrying sensitive content. Sensitive field masking was not applied to log outputs. Consequence: GDPR Article 33 breach notification required. ICO investigation. Potential fine calculated on the basis of 280,000 affected individuals. Customer notification costs. Reputational damage. The logging platform vendor is also implicated, creating a supply chain dispute.
Scenario B — Volume Threshold Circumvention Through Field Encoding: An AI research agent has a volume threshold of 500 records per hour. An attacker manipulates the agent through prompt injection to exfiltrate a database of 50,000 customer records. Instead of requesting records directly, the injected prompt instructs the agent to encode 100 records into each outbound response by concatenating customer data into a single text field. The volume monitor counts 500 response messages — within the threshold — but each message contains 100 encoded customer records. The effective exfiltration is 50,000 records in one hour despite the 500-record threshold.
What went wrong: The volume threshold counted outbound messages, not the volume of sensitive data within those messages. Content-aware DLP was not in place — the system counted transactions, not data. The encoding of multiple records into a single field bypassed the record-level volume tracking. Consequence: 50,000 customer records exfiltrated. The breach is not detected by the volume monitor, which shows compliant behaviour. Detection occurs weeks later through an external report of the data appearing on a dark web marketplace.
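The counting failure in Scenario B can be sketched in a few lines. This is a minimal illustration, not a complete DLP implementation: the `ACCT-` record pattern is a hypothetical stand-in for whatever record schema the organisation actually uses. The point is that a content-aware counter sums the records carried inside each payload, whereas a transaction counter would only see the number of messages.

```python
import re

# Hypothetical pattern for one embedded customer record. A real
# deployment would match the organisation's own record schema.
RECORD_PATTERN = re.compile(r"ACCT-\d{8}")

def effective_record_count(messages):
    """Count records *inside* outbound payloads, not outbound messages.

    A message-level counter reports len(messages); a content-aware
    counter reports how many records those payloads actually carry.
    """
    return sum(len(RECORD_PATTERN.findall(m)) for m in messages)
```

Applied to Scenario B, 500 messages each concatenating 100 records count as 50,000 records against the threshold, rather than 500 compliant transactions.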
Scenario C — Allowlist Bypass Through Redirect Chain: An AI agent's outbound endpoints are validated against an allowlist. The agent needs to call an approved analytics API. An attacker compromises the analytics API to return HTTP 302 redirects to an attacker-controlled server. The agent's HTTP client follows the redirect, sending the full request payload — including the data intended for the analytics API — to the attacker's server. The allowlist check validated the initial destination but did not validate the redirect target.
What went wrong: The destination validation checked the initial URL but did not enforce the allowlist on redirect targets. The HTTP client's default behaviour was to follow redirects transparently, treating the redirect target as equivalent to the original destination. Consequence: All data sent to the analytics API over a three-week period is duplicated to the attacker's server. The breach includes aggregated customer behaviour data that, while not containing direct PII, enables re-identification when combined with other datasets. GDPR applicability is disputed but the ICO determines the data constitutes personal data under the re-identification test.
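The redirect-chain failure in Scenario C is avoided by disabling transparent redirect following and re-checking the allowlist at every hop. The sketch below assumes an injected `fetch` callable returning `(status, location, body)` so the validation logic stays testable; the hostname in `ALLOWLIST` is illustrative. A real client would wrap an HTTP library configured not to follow redirects automatically.

```python
from urllib.parse import urlparse

ALLOWLIST = {"analytics.example.com"}  # illustrative approved hosts

def is_allowed(url):
    return urlparse(url).hostname in ALLOWLIST

def fetch_validated(url, fetch, max_hops=5):
    """Follow redirects manually, enforcing the allowlist on each hop.

    `fetch` returns (status, location, body). Redirect targets are
    treated as new destinations, not as equivalent to the original.
    """
    for _ in range(max_hops):
        if not is_allowed(url):
            raise PermissionError(f"destination not on allowlist: {url}")
        status, location, body = fetch(url)
        if status in (301, 302, 307, 308):
            url = location  # re-validated on the next loop iteration
            continue
        return body
    raise RuntimeError("too many redirects")
```

With this structure, a compromised analytics API returning a 302 to an attacker-controlled host causes the request to be rejected before any payload is sent to the redirect target.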
Scope: This dimension applies to all agents with access to data stores, APIs returning sensitive data, or communication channels that could carry sensitive content. Any agent that can read sensitive data and has any outbound communication capability is within scope. The combination of read access and outbound capability is the risk — an agent with read access but no outbound capability cannot exfiltrate data, and an agent with outbound capability but no access to sensitive data has nothing to exfiltrate. AG-013 applies wherever both capabilities coexist. The scope extends to indirect exfiltration channels. An agent that writes sensitive data to a log file that is accessible externally, embeds data in an API callback URL, encodes data in the timing or ordering of innocuous requests, or includes data in error messages sent to external monitoring systems is exfiltrating data through indirect channels. AG-013 requires that all outbound data paths — not just the obvious ones — be subject to sensitivity controls.
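Treating logging as an outbound data path, as Scenario A demands, means masking sensitive fields before a record reaches any log sink. A minimal sketch using Python's standard `logging` filters follows; the two masking patterns are illustrative only, and production masking would draw on the full classification schema.

```python
import logging
import re

# Illustrative patterns only -- real masking would cover the full
# classification schema, not two regexes.
MASKS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "<email>"),
    (re.compile(r"\bACCT-\d{8}\b"), "<account>"),
]

class SensitiveFieldFilter(logging.Filter):
    """Mask sensitive fields before a record reaches any handler,
    treating the logging platform as an outbound data path."""
    def filter(self, record):
        msg = record.getMessage()
        for pattern, token in MASKS:
            msg = pattern.sub(token, msg)
        record.msg, record.args = msg, ()
        return True
```

Attaching the filter to the agent's logger ensures that debug logs shipped to a centralised platform carry masked values, while the full-fidelity data can be retained separately under stricter access controls.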
4.1. A conforming system MUST enforce data classification labels at both ingestion and output, ensuring that sensitivity designations propagate through all processing stages.
4.2. A conforming system MUST detect personally identifiable information in outbound agent payloads and block transmission unless the recipient is explicitly authorised for that data classification level.
4.3. A conforming system MUST define and enforce volume thresholds on data export per time window, counting both record count and byte volume.
4.4. A conforming system SHOULD validate all destination endpoints, including redirect targets, against an explicit allowlist, rejecting any destination not on the list regardless of payload content.
4.5. A conforming system SHOULD apply sensitive field masking in audit logs, retaining the full-fidelity data in access-controlled secure cold storage.
4.6. A conforming system SHOULD detect steganographic exfiltration patterns, including data encoded in non-obvious channels such as URL parameters, error messages, timing patterns, and metadata fields.
4.7. A conforming system MAY implement content-aware data loss prevention integrated with the governance pipeline, inspecting payloads against the full classification schema.
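Requirement 4.2 can be sketched as an outbound check combining PII detection with recipient authorisation. Everything here is illustrative: the two PII patterns are examples (real detection needs far broader coverage, per 7.x on basic-tier gaps), and the recipient names and authorisation map are hypothetical.

```python
import re

# Example patterns only; production PII detection needs much wider
# coverage (names, national identifiers, financial account numbers).
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "uk_nino": re.compile(r"\b[A-Z]{2}\d{6}[A-D]\b"),
}

# Hypothetical recipients and the classification tiers each may receive.
AUTHORISED = {"crm.internal": {"restricted"}, "status-page": {"public"}}

def check_outbound(payload, recipient):
    """Return (allowed, detected_pii_types) for an outbound payload.

    Transmission is blocked when PII is present and the recipient is
    not authorised for restricted data (requirement 4.2, sketched).
    """
    hits = [name for name, p in PII_PATTERNS.items() if p.search(payload)]
    if hits and "restricted" not in AUTHORISED.get(recipient, set()):
        return False, hits
    return True, hits
```

The design point is that the decision is made at the infrastructure layer, on the intercepted payload, independently of anything the agent was instructed to do.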
Data Sensitivity and Exfiltration Prevention addresses a risk that scales catastrophically with AI agent capabilities. A human with database access might copy records one screen at a time — a tedious process that limits the practical scale of manual exfiltration. An AI agent with the same access can query, transform, and transmit millions of records in seconds. The agent's speed, combined with its ability to access APIs and communication channels programmatically, means that a single misconfiguration or a single successful prompt injection can result in a complete database extraction in the time it takes a security team to receive and read an alert.
The critical distinction is between AG-013 and AG-032 (Sequential Data Extraction Detection). AG-032 governs the slow, incremental extraction of data through many small requests that individually appear innocuous but collectively constitute a data breach. AG-013 governs the structural data handling controls that prevent direct, bulk exfiltration — classification enforcement, outbound payload inspection, volume thresholds, and destination validation. These are complementary: AG-013 blocks the obvious extraction attempts while AG-032 detects the subtle, distributed ones.
AG-013 requires that data sensitivity be enforced structurally. Telling an agent "do not send customer PII to external systems" is a policy. Intercepting the agent's outbound payloads, scanning for PII patterns, and blocking transmission before it occurs is a structural control. AG-013 requires the structural control. An agent without data sensitivity controls that has access to a database of one million customer records can exfiltrate all one million records in a single operation. The blast radius is not limited by the agent's speed or capability — it is limited only by the size of the data the agent can access.
The detection gap compounds the severity. Without outbound payload inspection, the exfiltration may leave no trace in the agent's normal monitoring. The action looks like a normal API call or email. Detection may not occur until the data surfaces externally — on a dark web marketplace, in a competitor's hands, or through regulatory notification from an affected individual.
AG-013 establishes data classification as the foundation of exfiltration prevention. Classify data at ingestion using a minimum three-tier scheme: public, internal, restricted. Apply classification labels to all data objects and propagate them through processing pipelines. Intercept all outbound payloads and scan for classification markers and PII patterns before transmission. Volume tracking should count both record count and byte volume against separate thresholds.
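The dual-threshold volume tracking described above can be sketched as a sliding-window counter. The threshold values and window size below are illustrative defaults, not recommendations.

```python
import time
from collections import deque

class VolumeTracker:
    """Track record count and byte volume against separate thresholds
    within a sliding time window (requirement 4.3, sketched).
    Threshold values here are illustrative only."""

    def __init__(self, max_records=500, max_bytes=5_000_000, window_s=3600):
        self.max_records = max_records
        self.max_bytes = max_bytes
        self.window_s = window_s
        self.events = deque()  # (timestamp, records, nbytes)

    def allow(self, records, nbytes, now=None):
        now = time.monotonic() if now is None else now
        # Expire events that have left the window.
        while self.events and now - self.events[0][0] > self.window_s:
            self.events.popleft()
        total_r = sum(e[1] for e in self.events) + records
        total_b = sum(e[2] for e in self.events) + nbytes
        if total_r > self.max_records or total_b > self.max_bytes:
            return False  # block: either threshold would be exceeded
        self.events.append((now, records, nbytes))
        return True
```

Counting records and bytes separately matters because a small number of very large payloads can stay under a record threshold while moving large volumes of data, and vice versa.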
Recommended patterns:
- Propagate classification labels as metadata that travels with the data through every processing stage.
- Inspect every outbound channel, including API responses, emails, file exports, webhook callbacks, and logging outputs.
- Enforce the destination allowlist on every hop, including redirect targets.
- Track record count and byte volume against separate thresholds, counting the records inside payloads rather than the number of messages.
Anti-patterns to avoid:
- Treating logs as internal diagnostic data rather than as an outbound data path carrying sensitive content.
- Counting outbound messages instead of the volume of sensitive data within them.
- Allowing HTTP clients to follow redirects transparently after validating only the initial destination.
- Applying DLP controls to the obvious channels while leaving indirect channels (callbacks, error messages, metadata fields) uninspected.
Financial Services. Financial data exfiltration creates both regulatory and competitive exposure. Customer financial data is subject to FCA data governance requirements. Market data may be subject to exchange licensing restrictions. Trading data is competitively sensitive and may be subject to insider trading regulations if leaked. AG-013 controls for financial services agents should cover: customer PII, account data, transaction history, trading positions, and any data subject to exchange licensing or regulatory reporting restrictions.
Healthcare. Healthcare data sensitivity is governed by HIPAA (US), GDPR (EU), and equivalent regulations globally. The sensitivity classification for healthcare must include PHI (protected health information) as a distinct category with specific handling rules. PII detection must cover clinical data formats: diagnosis codes (ICD), procedure codes (CPT), medication names, lab results, and clinical notes. The minimum necessary standard requires that healthcare agents access only the specific data elements needed for their task — a billing agent should access billing data, not clinical records.
Critical Infrastructure. Data exfiltration from critical infrastructure systems can have national security implications. Network topologies, control system configurations, vulnerability assessments, and operational parameters may be classified at government security levels. AG-013 controls for critical infrastructure must account for sector-specific classification schemes (e.g., OFFICIAL, SECRET in UK government) and may require government-approved encryption and handling procedures for classified data. Agents operating in critical infrastructure should have the most restrictive outbound controls — default-deny with explicit approval for every outbound data path.
Basic Implementation — The organisation has a data classification scheme with at least three tiers (e.g., public, internal, restricted). Classification labels are applied to data at ingestion. Outbound agent payloads are scanned for PII patterns (names, email addresses, national identifiers, financial account numbers) using regex or equivalent pattern matching. Detected PII in outbound payloads to unauthorised recipients is blocked. Volume thresholds are defined (e.g., maximum 100 records per hour per agent). This level meets the minimum mandatory requirements but has gaps: classification may not propagate through all processing stages, PII detection relies on known patterns and may miss novel formats, and volume tracking may not account for data encoded in non-obvious fields.
Intermediate Implementation — Classification labels are propagated through all data processing stages as metadata that travels with the data. PII detection uses both pattern matching and named entity recognition for higher accuracy. All outbound endpoints are validated against an explicit allowlist — any destination not on the list is rejected regardless of payload content. Volume tracking counts both record count and byte volume against separate thresholds. Sensitive fields in audit logs are masked with the full-fidelity data retained in access-controlled cold storage. Outbound payload inspection covers all channels: API responses, emails, file exports, webhook callbacks, and logging outputs.
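The intermediate-tier requirement that labels travel with the data can be sketched as a wrapper type whose transformations inherit the most restrictive input classification. The `Labelled` type and `transform` helper are hypothetical names for illustration.

```python
from dataclasses import dataclass

# Minimum three-tier scheme, ordered least to most restrictive.
TIERS = ("public", "internal", "restricted")

@dataclass(frozen=True)
class Labelled:
    """A value carrying its classification label with it."""
    value: object
    tier: str

def combine(*inputs):
    """Derived data inherits the most restrictive input label, so the
    sensitivity designation propagates through every processing stage."""
    strictest = max(inputs, key=lambda x: TIERS.index(x.tier))
    return strictest.tier

def transform(fn, *inputs):
    return Labelled(fn(*(i.value for i in inputs)), combine(*inputs))
```

Because the label is part of the value itself, outbound payload inspection can read the classification directly instead of re-deriving it from content at the point of transmission.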
Advanced Implementation — All intermediate capabilities plus: steganographic exfiltration detection identifies data encoded in non-obvious channels (timing patterns, error messages, URL parameters, image metadata). Content-aware data loss prevention (DLP) is integrated into the governance pipeline, inspecting payloads against the full classification schema rather than just PII patterns. The system has been tested through independent adversarial assessment including direct exfiltration, encoded exfiltration, steganographic exfiltration, and volume threshold circumvention. The organisation can demonstrate that no tested exfiltration vector succeeded.
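One crude heuristic for the advanced-tier detection of data encoded in URL parameters is to flag long, high-entropy parameter values, which is how base64-packed payloads tend to look. This is a sketch only: real steganographic detection needs per-endpoint baselining, and the length and entropy cut-offs below are assumptions.

```python
import math
from collections import Counter
from urllib.parse import urlparse, parse_qsl

def shannon_entropy(s):
    """Shannon entropy of a string in bits per character."""
    counts = Counter(s)
    return -sum(c / len(s) * math.log2(c / len(s)) for c in counts.values())

def suspicious_params(url, min_len=32, min_entropy=4.5):
    """Flag URL parameters that look like encoded payloads: long values
    with near-random character distributions. Thresholds are illustrative."""
    return [k for k, v in parse_qsl(urlparse(url).query)
            if len(v) >= min_len and shannon_entropy(v) >= min_entropy]
```

A flagged parameter is a signal for review, not proof of exfiltration; legitimate tokens and signatures also have high entropy, which is why baselining per endpoint matters.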
Required artefacts:
Retention requirements:
Access requirements:
Testing AG-013 compliance requires systematic testing of all exfiltration vectors. A comprehensive test programme should include the following tests.
Test 8.1: PII Detection and Blocking
Test 8.2: Volume Threshold Enforcement
Test 8.3: Destination Validation
Test 8.4: Classification Propagation
Test 8.5: Side Channel Exfiltration
Test 8.6: Classification Enforcement at Ingestion
| Regulation | Provision | Relationship Type |
|---|---|---|
| GDPR | Article 25 (Data Protection by Design and by Default) | Direct requirement |
| GDPR | Article 33 (Breach Notification) | Supports compliance |
| FCA SYSC | 3.2 (Data Governance) | Direct requirement |
| HIPAA | Privacy Rule & Security Rule (PHI Safeguards) | Direct requirement |
| SOX | Section 404 (Data Integrity Controls) | Supports compliance |
Article 25 requires that data protection be integrated into the design of processing systems, not bolted on afterwards. For AI agents, this means data sensitivity controls must be architectural — classification labels propagated through processing pipelines, outbound payload inspection at the infrastructure layer, and volume limits enforced structurally. The regulation also requires data minimisation by default: the agent should access only the data fields necessary for its task, not the entire record. AG-013's classification enforcement at both ingestion and output directly implements Article 25's requirement for technical measures ensuring data protection principles.
GDPR imposes strict obligations on breach notification (Article 33, within 72 hours to the supervisory authority) and data subject notification (Article 34, without undue delay to affected individuals). The cost of an AG-013 failure is therefore not just the breach itself but the regulatory response cascade. Effective AG-013 controls reduce the probability and blast radius of breaches that would trigger notification obligations.
The FCA's Systems and Controls (SYSC) requirements include specific obligations for data governance within regulated firms. SYSC 3.2 requires that firms take reasonable care to organise their affairs responsibly and effectively with adequate risk management systems. For firms deploying AI agents with access to customer data, this includes controls to prevent unauthorised data transmission. The FCA has indicated through supervisory communications that firms should apply equivalent data handling controls to AI agents as to human employees with equivalent access.
HIPAA's Privacy Rule and Security Rule impose specific requirements on the handling of protected health information (PHI). The Privacy Rule's minimum necessary standard requires that access to PHI be limited to the minimum information needed for the intended purpose — directly mapping to AG-013's classification enforcement at ingestion. The Security Rule requires technical safeguards including access controls, audit controls, and transmission security. For AI agents in healthcare, AG-013 implements these safeguards through classification enforcement, outbound payload inspection, and destination validation.
SOX requires the integrity of data used in financial reporting. For AI agents that access or process financial data, AG-013 ensures that this data does not leave the controlled environment through unauthorised channels, which would compromise both confidentiality and integrity. A SOX auditor will test whether financial data can be extracted through the agent's outbound channels without appropriate authorisation.
| Field | Value |
|---|---|
| Severity Rating | Critical |
| Blast Radius | Organisation-wide — potentially affecting all data subjects whose records are accessible to the agent, with regulatory exposure across every applicable jurisdiction |
Consequence chain: Without structural data sensitivity and exfiltration controls, a single unremarkable API call can silently exfiltrate an entire customer database. PII leaves the controlled environment undetected, creating regulatory exposure under GDPR and sector-specific data regulations. The failure mode scales catastrophically: an agent with access to a database of one million customer records can exfiltrate all of them in a single operation, limited only by the size of the data it can access. This is fundamentally different from human-mediated data breaches, where volume is limited by the human's operational speed; an AI agent data breach can be total and instantaneous. Because the exfiltration looks like a normal API call or email, detection may not occur until the data surfaces externally. The business consequence includes GDPR fines of up to 4% of annual global turnover, HIPAA penalties, FCA enforcement action, mandatory breach notification costs, class action litigation, and reputational damage that may be permanent.
Cross-references: AG-002 (Cross-Domain Activity Governance) detects cross-domain combinations that collectively reveal sensitive information even when individual domain extractions appear benign. AG-020 (Purpose-Bound Operation Enforcement) controls whether the agent's data access aligns with its stated purpose, preventing pretextual access. AG-032 (Sequential Data Extraction Detection) detects incremental extraction through many small, individually innocuous requests that collectively constitute a breach. AG-040 (Knowledge Accumulation Governance) governs whether the agent accumulates sensitive knowledge in its own memory or context across sessions. AG-015 (Organisational Namespace Isolation) prevents data crossing between tenant namespaces within the same platform.