Connector Data Return Minimisation Governance requires that all tool connectors, API integrations, and external service interfaces return only the minimum data necessary for the agent to accomplish its current task — and that enforcement mechanisms exist to strip, redact, or reject responses that exceed the minimum necessary scope. Without structural minimisation, connector responses routinely include full database records, complete API payloads, and verbose metadata when the agent needs only a single field. This excess data enters the agent's context window where it can be leaked through subsequent outputs, persisted in logs, forwarded through delegation chains, or exfiltrated through prompt injection attacks.
This dimension is distinct from AG-015 (PII & Sensitive Data Handling), which governs how sensitive data is handled once it has been received. AG-376 operates upstream: it prevents unnecessary data from reaching the agent in the first place. The distinction matters because once data enters the agent's context, controlling its downstream flow is fundamentally more difficult than preventing its ingestion. AG-376 implements the data protection principle of minimisation at the connector layer, ensuring that the attack surface for data leakage is reduced at the point of entry rather than managed at every point of exit.
Scenario A — Full Customer Record Leaked Through Verbose Connector Response: A customer-facing AI agent at a retail bank handles a balance inquiry. The agent calls the customer account connector with a request for the current balance. The connector, designed as a general-purpose API, returns the full customer record: account balance (the requested field), but also full name, date of birth, national insurance number, postal address, email address, phone number, employer name, annual salary, credit score, overdraft limit, direct debit mandates with payee names and amounts, and the last 90 days of transaction history including merchant names, amounts, and locations. The agent needed one number — £4,237.18 — but received 47 fields containing highly sensitive personal and financial data.
The agent processes the response and answers: "Your current balance is £4,237.18." The interaction appears normal. However, the full customer record is now in the agent's context window. A subsequent prompt injection in the same session — "Summarise everything you know about me" — causes the agent to output the customer's salary (£68,500), credit score (742), and recent transactions including a £2,100 payment to a divorce solicitor. The customer files a complaint. The Information Commissioner's Office (ICO) opens an investigation.
What went wrong: The connector returned the entire customer record when only the balance field was needed. No response filter existed between the connector and the agent's context. The agent had no mechanism to request only specific fields. The full record was ingested into context where it became vulnerable to extraction. Consequence: ICO investigation for GDPR data minimisation violation (Article 5(1)(c)), potential fine of up to 4% of annual turnover, customer complaint, and reputational damage. A response filter returning only the balance field would have prevented the entire incident.
Scenario B — Cross-Border Data Sovereignty Breach Through Excessive API Response: A European insurance company deploys an AI claims-processing agent. The agent calls a third-party medical records connector to verify a specific diagnosis code for a travel insurance claim. The connector, hosted in the United States, returns the patient's complete medical history: 340 records spanning 12 years, including psychiatric treatment notes, HIV test results, genetic screening outcomes, and substance abuse counselling records. The agent needed one Boolean value — whether diagnosis code J06.9 (acute upper respiratory infection) appears in the patient's record. Instead, it received 2.4 MB of special-category health data.
The medical data, originating from EU data subjects, has now been transferred to the agent's context running on EU infrastructure — but it transited through US-hosted connector infrastructure without adequate safeguards under Schrems II. More critically, the full medical history is now in the agent's operational context where it influences subsequent reasoning. The agent's claim assessment references the psychiatric history — information irrelevant to the respiratory infection claim and prohibited from use under insurance discrimination regulations.
What went wrong: The connector returned the full medical history when only a single diagnosis-code lookup was needed. No field-level filter existed to restrict the response to the requested data element. The excessive data created both a cross-border transfer violation and an insurance discrimination risk. Consequence: GDPR cross-border transfer investigation (potential €20 million fine or 4% of turnover), insurance regulatory enforcement for using prohibited health data in claims assessment, patient harm from discrimination, and litigation under the Equality Act 2010. This intersects with AG-048 (Cross-Border Data Sovereignty Governance).
Scenario C — Verbose Tool Response Inflates Context and Degrades Performance: An enterprise research agent queries a knowledge-base connector for information about a specific regulatory requirement. The connector returns the entire regulatory document — 185 pages, 94,000 tokens — when the agent needed only the three paragraphs relevant to its query. The oversized response consumes 73% of the agent's 128,000-token context window, pushing out earlier conversation history and governance instructions. The agent's subsequent responses degrade in quality because critical context has been truncated. More seriously, the governance instructions that occupied tokens 1,200 through 3,400 of the original context have been evicted, removing safety constraints from the agent's active context.
What went wrong: The connector returned the full document rather than the relevant excerpt. No response-size filter existed. The oversized response caused context window overflow, evicting governance-critical instructions. The agent continued to operate but without the safety constraints that had been in its context. Consequence: Governance instruction loss leading to unconstrained agent behaviour (intersects with AG-361 Context Truncation Risk Governance), degraded response quality, wasted compute processing irrelevant content, and increased latency for the end user. A field-level or excerpt-level filter returning only the relevant paragraphs would have prevented context overflow entirely.
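The missing size guard in Scenario C can be approximated as a token-budget check applied before a connector response is admitted to context. This is an illustrative sketch only: the four-characters-per-token estimate is a rough heuristic, not an exact tokeniser, and the budget fraction is an assumed policy value.

```python
def admit_response(payload: str,
                   context_budget_tokens: int = 128_000,
                   max_fraction: float = 0.10) -> str:
    """Truncate a connector response that would consume more than
    max_fraction of the context window (requirement 4.4).

    Token count is approximated as len(payload) / 4 -- a common
    rough heuristic, not an exact tokeniser.
    """
    estimated_tokens = len(payload) // 4
    limit_tokens = int(context_budget_tokens * max_fraction)
    if estimated_tokens <= limit_tokens:
        return payload
    # Truncate at the filter rather than letting the oversized payload
    # evict earlier context (including governance instructions).
    return payload[: limit_tokens * 4]
```

A production filter would prefer excerpt extraction (returning the relevant passage) over blind truncation, which this sketch does not attempt.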
Scope: This dimension applies to all connectors, tool integrations, API interfaces, and external service endpoints that return data to an AI agent's processing context. The scope includes structured API responses (JSON, XML, protocol buffers), unstructured document retrievals (full-text search results, document downloads), streaming data feeds, and any other mechanism through which external data enters the agent's operational context. The scope extends to internal connectors — a connector to an internal database is within scope because the minimisation principle applies regardless of whether the data source is internal or external. The scope also covers cached or pre-fetched data: if a connector pre-fetches a broad dataset to serve future queries, the data exposed to the agent for any individual query must still be minimised. The test is whether the data returned to the agent for a given task exceeds what is necessary for that task, regardless of how the connector obtained the data internally.
4.1. A conforming system MUST enforce field-level response filtering on every connector that returns structured data, ensuring that only the fields required for the agent's current task are passed to the agent's context — not the full record.
4.2. A conforming system MUST define a response schema for each connector-task pair specifying the maximum set of fields, data types, and record counts that the connector may return for that task category.
4.3. A conforming system MUST implement a response filter operating between the connector and the agent's context, in a separate security domain from the agent runtime, that strips fields not included in the approved response schema before the data reaches the agent.
4.4. A conforming system MUST reject or truncate connector responses that exceed a defined maximum payload size (in bytes or tokens), preventing oversized responses from flooding the agent's context window.
4.5. A conforming system MUST classify all connector response fields according to the organisation's data classification scheme (per AG-014) and block fields at or above a specified classification level from reaching the agent's context unless the task explicitly requires that classification level.
4.6. A conforming system MUST log every instance where a response filter strips, redacts, or truncates connector data, recording the connector identifier, task category, fields removed, classification level of removed fields, and timestamp.
4.7. A conforming system SHOULD implement query-side minimisation — structuring connector requests to retrieve only necessary fields from the data source (e.g., SQL SELECT with specific columns rather than SELECT *) — in addition to response-side filtering.
4.8. A conforming system SHOULD enforce record-count limits per connector-task pair, preventing queries that return thousands of records when the task requires only one.
4.9. A conforming system SHOULD implement sensitivity-aware truncation for unstructured data returns — extracting the relevant excerpt or passage rather than returning full documents.
4.10. A conforming system MAY implement adaptive minimisation that learns from task completion feedback which fields are actually used by the agent, progressively tightening response schemas to exclude consistently unused fields.
4.11. A conforming system MAY implement differential response schemas based on the data subject's jurisdiction, returning fewer fields for data subjects in jurisdictions with stricter minimisation requirements (e.g., EU versus jurisdictions without equivalent data minimisation mandates).
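Requirements 4.7 and 4.8 — query-side minimisation and record-count limits — can be sketched as a projection builder that requests only the approved columns from the source instead of filtering a `SELECT *` response after the fact. The table name, task categories, and column names below are hypothetical; column lists come from a whitelist, never from user input.

```python
# Approved projections per (table, task-category) pair -- hypothetical.
APPROVED_COLUMNS = {
    ("customer_accounts", "balance_inquiry"): ["balance"],
    ("customer_accounts", "address_verification"): ["postal_address"],
}

def build_query(table: str, task: str, customer_id: str,
                max_records: int = 1) -> tuple:
    """Build a parameterised query restricted to the approved columns
    (4.7) with a record-count ceiling (4.8). Raises if no projection
    is configured for the connector-task pair (default-deny)."""
    columns = APPROVED_COLUMNS.get((table, task))
    if columns is None:
        raise PermissionError(f"No approved projection for {table}/{task}")
    sql = (f"SELECT {', '.join(columns)} FROM {table} "
           f"WHERE customer_id = ? LIMIT {int(max_records)}")
    return sql, (customer_id,)
```

Query-side projection complements, rather than replaces, the response-side filter of 4.3: it reduces what leaves the data source, while the filter guarantees minimisation even when the source ignores the projection.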
Connector Data Return Minimisation addresses a systemic vulnerability in AI agent architectures: the tendency for tool connectors and API integrations to return far more data than the agent needs, creating a persistent and expanding attack surface for data leakage, regulatory violation, and context contamination. The root cause is architectural — most connectors are designed for general-purpose use, returning complete records or full API payloads because human developers writing application code select only the fields they need in their application logic. AI agents, however, ingest the entire response into their operational context where every byte of data becomes available for output, reasoning, and downstream forwarding.
The data minimisation principle is enshrined in law across multiple jurisdictions. GDPR Article 5(1)(c) requires that personal data be "adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed." The EU AI Act reinforces this for high-risk AI systems through its Article 10 data governance requirements. HIPAA's Minimum Necessary Rule requires covered entities to limit the use and disclosure of protected health information to the minimum necessary for the intended purpose. California's CCPA/CPRA imposes data minimisation obligations on businesses processing consumer personal information. These legal requirements create a direct regulatory mandate for AG-376: connector responses containing personal data must be minimised to what is necessary for the task.
The technical rationale is equally compelling. Every unnecessary field in a connector response creates multiple risk vectors. First, the field is now in the agent's context window where it can be extracted through prompt injection, social engineering, or output-side attacks — a vulnerability that AG-095 (Prompt Integrity Governance) addresses but cannot fully prevent if the data is present. Second, the field may be logged in conversation history, debug logs, or telemetry data, creating long-lived copies of sensitive data outside the data subject's expected processing scope — relevant to AG-016 (Data Retention & Right to Erasure). Third, the field may be forwarded through delegation chains to sub-agents, expanding the set of systems that have processed the data — relevant to AG-048 (Cross-Border Data Sovereignty Governance) if sub-agents operate in different jurisdictions. Fourth, oversized responses consume context window capacity, potentially displacing governance instructions and degrading agent performance — relevant to AG-361 (Context Truncation Risk Governance).
The financial rationale reinforces the technical and legal arguments. Many AI systems are billed per token processed — both input tokens (the connector response ingested into context) and output tokens (the agent's response). Unnecessary data in connector responses directly increases operational costs. An agent ingesting a 94,000-token document when it needed 500 tokens pays approximately 188× the necessary input token cost. At scale — thousands of queries per day — the cost differential is material. AG-375 (Tool Billing and Spend Cap Governance) constrains total spend, but AG-376 reduces the unit cost by minimising data volume at the source.
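The 188× figure follows directly from the token counts. A back-of-envelope check, using a placeholder per-token price rather than any real vendor rate:

```python
# Back-of-envelope check of the 188x cost differential cited above.
# The per-token price is a placeholder, not a real vendor rate.
PRICE_PER_INPUT_TOKEN = 0.000003  # hypothetical currency units per token

tokens_ingested = 94_000   # full document returned by the connector
tokens_needed = 500        # the relevant excerpt

cost_ratio = tokens_ingested / tokens_needed   # 188.0
excess_cost_per_query = (tokens_ingested - tokens_needed) * PRICE_PER_INPUT_TOKEN
daily_excess = excess_cost_per_query * 10_000  # at 10,000 queries/day
```

Whatever the actual per-token price, the ratio is price-independent: minimisation cuts input cost in direct proportion to the data volume removed.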
The convergence of legal, technical, and financial rationale makes data return minimisation one of the highest-leverage controls in the connector governance landscape. It is cheaper to prevent unnecessary data from entering the agent than to manage it after ingestion.
Implement a response filter proxy that intercepts all connector responses before they reach the agent's context. The proxy applies a task-specific response schema that defines the permitted fields, maximum payload size, and data classification ceiling. Fields outside the schema are stripped. Responses exceeding size limits are truncated. The filter operates in a separate security domain from the agent runtime to prevent the agent from disabling or modifying the filtering rules.
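A minimal sketch of such a filter follows, assuming a simple dict-based schema registry. The connector names, task categories, field names (which mirror Scenario A), and the payload ceiling are all illustrative; a real deployment would hold the schema registry in a separate security domain, as the text above requires, rather than in process memory.

```python
import json
import logging
from datetime import datetime, timezone

log = logging.getLogger("response_filter")

# Hypothetical connector-task schemas: the permitted fields per pair.
RESPONSE_SCHEMAS = {
    ("account_connector", "balance_inquiry"): {"balance"},
}
MAX_PAYLOAD_BYTES = 16_384  # illustrative ceiling (requirement 4.4)

def filter_response(connector: str, task: str, raw: dict) -> dict:
    """Strip fields outside the approved schema (4.1), default-deny
    when no schema is configured, enforce a payload ceiling (4.4),
    and log what was removed (4.6)."""
    schema = RESPONSE_SCHEMAS.get((connector, task))
    if schema is None:
        # Default-deny: an unconfigured connector-task pair returns nothing.
        raise PermissionError(f"no response schema for {connector}/{task}")
    filtered = {k: v for k, v in raw.items() if k in schema}
    stripped = sorted(set(raw) - schema)
    if stripped:
        log.info("filter stripped fields connector=%s task=%s fields=%s ts=%s",
                 connector, task, stripped,
                 datetime.now(timezone.utc).isoformat())
    if len(json.dumps(filtered)) > MAX_PAYLOAD_BYTES:
        raise ValueError("filtered response exceeds payload ceiling")
    return filtered
```

Note that the audit record names the stripped fields but not their values: logging the removed data would recreate the leakage the filter exists to prevent.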
Recommended patterns:
Anti-patterns to avoid:
Financial Services. Banking and insurance connectors routinely return full customer records containing PII, financial data, and special-category data (e.g., health data for insurance underwriting). Field-level minimisation is a regulatory expectation under GDPR, FCA data protection requirements, and PCI DSS (if payment card data is involved). The FCA expects that AI systems accessing customer data retrieve only what is necessary for the specific interaction. PCI DSS Requirement 3 mandates that stored cardholder data be minimised — this extends to data transiently held in an AI agent's context window.
Healthcare. Clinical connectors accessing electronic health records must implement the HIPAA Minimum Necessary Rule. A diagnostic AI agent querying a patient record for a specific lab result must not receive the patient's complete medical history. Response schemas for healthcare connectors should be reviewed by clinical data governance teams to ensure alignment with the Minimum Necessary Rule and Caldicott Principles (in the UK). Special-category health data (mental health records, HIV status, genetic data) requires additional filtering layers beyond standard minimisation.
Cross-Border Operations. When connectors return data about subjects in multiple jurisdictions, the minimisation standard should be the most restrictive applicable regime. An agent processing both EU and US customer data should apply GDPR-level minimisation to all records, not only to EU data subjects, to avoid the operational complexity and risk of differential filtering. Alternatively, jurisdiction-aware response schemas can apply different field sets based on the data subject's location — but this requires reliable jurisdiction identification at the query level.
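Jurisdiction-aware schema selection (requirement 4.11) can be sketched as a lookup with a restrictive fallback. The jurisdiction codes and field sets here are illustrative only; the essential design choice is that an unidentified jurisdiction falls back to the most restrictive schema, never the most permissive.

```python
# Differential response schemas by data-subject jurisdiction (4.11).
# Jurisdiction codes and field sets are illustrative assumptions.
SCHEMA_BY_JURISDICTION = {
    "EU": {"balance"},                  # strictest: GDPR minimisation
    "US": {"balance", "account_type"},  # hypothetical wider set
}

def schema_for(jurisdiction: str) -> set:
    # When the data subject's jurisdiction cannot be reliably
    # identified, fail closed to the most restrictive schema.
    return SCHEMA_BY_JURISDICTION.get(jurisdiction,
                                      SCHEMA_BY_JURISDICTION["EU"])
```

This only works if jurisdiction identification at the query level is reliable; absent that, the text's simpler recommendation — apply the strictest regime to all records — avoids the failure mode entirely.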
Basic Implementation — Response schemas defined for high-risk connector-task pairs (those returning PII or financial data). Response filtering implemented as application-layer middleware. Maximum payload size limits enforced. Filtered-field logging in place. This level addresses the most obvious minimisation failures but may miss connector-task pairs not identified as high-risk, does not integrate with data classification, and the application-layer filter may be bypassable by agent-level code.
Intermediate Implementation — Response schemas defined for all connector-task pairs. Response filtering enforced at a gateway layer in a separate security domain. Classification-gated filtering integrated with the organisation's data classification system. Query-side projection enforcement for connectors supporting parameterised queries. Excerpt extraction for unstructured data returns. Schema versioning with change control. Regular schema review on connector API updates. Complete filtered-field audit trail. This level provides comprehensive minimisation but may not handle adaptive tightening or differential jurisdiction schemas.
Advanced Implementation — All intermediate capabilities plus: adaptive minimisation based on field-usage telemetry, progressively tightening schemas. Differential response schemas by data-subject jurisdiction. Independent adversarial testing confirming that prompt injection, schema manipulation, and connector bypass attacks cannot circumvent minimisation. Real-time dashboards showing minimisation effectiveness (percentage of fields stripped per connector). Integration with AG-375 cost tracking to quantify the cost savings from minimisation. Hardware security module protection for schema integrity.
Required artefacts:
Retention requirements:
Access requirements:
Testing AG-376 compliance requires verifying that response filtering operates correctly, survives adversarial bypass attempts, and integrates with data classification. A comprehensive test programme should include the following tests.
Test 8.1: Field-Level Response Filtering
Test 8.2: Response Schema Enforcement Per Task Category
Test 8.3: Maximum Payload Size Enforcement
Test 8.4: Classification-Gated Field Filtering
Test 8.5: Default-Deny Without Configured Schema
Test 8.6: Filter Bypass Resistance
Test 8.7: Audit Log Completeness and Accuracy
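Tests 8.1 and 8.5 might be expressed as automated checks along the following lines. The `filter_response` function and schema registry here are stand-ins for the system under test, not a reference implementation.

```python
# Stand-in for the system under test: a minimal field filter with one
# configured schema and default-deny behaviour for unconfigured pairs.
SCHEMAS = {("crm", "lookup_email"): {"email"}}

def filter_response(connector, task, raw):
    schema = SCHEMAS.get((connector, task))
    if schema is None:
        raise PermissionError("no schema configured")
    return {k: v for k, v in raw.items() if k in schema}

def test_field_level_filtering():          # Test 8.1
    raw = {"email": "a@example.com", "salary": 50_000, "dob": "1990-01-01"}
    out = filter_response("crm", "lookup_email", raw)
    assert out == {"email": "a@example.com"}

def test_default_deny_without_schema():    # Test 8.5
    try:
        filter_response("crm", "unconfigured_task", {"email": "a@example.com"})
        assert False, "expected default-deny"
    except PermissionError:
        pass

test_field_level_filtering()
test_default_deny_without_schema()
```

Bypass resistance (Test 8.6) cannot be demonstrated this way; it requires adversarial testing against the deployed gateway, including attempts to reach the connector without traversing the filter.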
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 10 (Data and Data Governance) | Direct requirement |
| GDPR | Article 5(1)(c) (Data Minimisation) | Direct requirement |
| SOX | Section 404 (Internal Controls Over Financial Reporting) | Supports compliance |
| FCA SYSC | 6.1.1R (Systems and Controls) | Supports compliance |
| NIST AI RMF | MAP 3.5, MANAGE 2.3 | Supports compliance |
| ISO 42001 | Clause 6.1 (Actions to Address Risks), Annex B (Data Management) | Supports compliance |
| DORA | Article 9 (ICT Risk Management Framework) | Supports compliance |
Article 10 requires that training, validation, and testing datasets for high-risk AI systems be subject to appropriate data governance practices including, where applicable, examination in view of possible biases and measures to address gaps or shortcomings. While Article 10 focuses primarily on training data, the broader data governance mandate extends to operational data processing. An AI agent ingesting excessive data from connectors during operation creates data governance risks — the excess data may influence the agent's reasoning in unintended ways, create data protection violations, and undermine the quality controls that Article 10 intends. AG-376 implements data governance at the operational data ingestion layer, complementing Article 10's training-data focus with runtime data minimisation.
Article 5(1)(c) states that personal data shall be "adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed." This is a direct legal mandate for AG-376. When an AI agent calls a connector to check an account balance, processing the customer's full record — including name, date of birth, national insurance number, salary, credit score, and transaction history — violates Article 5(1)(c) because those fields are not necessary for the balance-check purpose. The ICO, CNIL, and other supervisory authorities have issued enforcement decisions specifically citing data minimisation failures, and GDPR fines more broadly have reached €1.2 billion (Meta, Irish Data Protection Commission, 2023) and €746 million (Amazon, Luxembourg's CNPD, 2021). AI agent deployments that ingest excessive connector data at scale — thousands of queries per day, each returning unnecessary PII — create systematic data minimisation violations with proportionate regulatory exposure.
SOX relevance arises when excessive connector data includes financial data that influences agent reasoning in ways that could affect financial reporting. An agent ingesting a customer's full financial profile when it needed only a balance figure may use the excess data in downstream processing that affects reported figures. Controls ensuring that financial agents process only the minimum necessary data support the accuracy and reliability of financial reporting. A SOX auditor assessing data flow controls will evaluate whether agents have access to data beyond their operational need.
The FCA expects firms to implement systems and controls proportionate to the risks of their activities. For firms deploying AI agents that access customer data through connectors, this includes controls ensuring data minimisation. The FCA's approach to data protection in financial services — articulated through supervisory statements and dear-CEO letters — emphasises that firms must not process more customer data than necessary. AI agent connector responses returning full customer records when a single field is needed represent a systematic failure of data minimisation controls.
MAP 3.5 addresses data quality and relevance for AI systems. MANAGE 2.3 addresses data protection within AI risk management. AG-376 supports compliance by ensuring that data entering the AI agent's processing context is relevant to the current task (MAP 3.5) and that data protection is maintained through minimisation at the point of ingestion (MANAGE 2.3).
Clause 6.1 requires organisations to address risks within the AI management system. Annex B provides guidance on data management for AI systems, including data quality, relevance, and protection. AG-376 implements a data management control that reduces risk by minimising unnecessary data processing, directly supporting the data management practices described in Annex B.
Article 9 requires financial entities to manage ICT risk including risks from data processing. Excessive data in connector responses creates ICT risk: larger attack surface for data breaches, increased data storage and processing costs, and greater exposure in the event of a security incident. AG-376 reduces ICT risk by minimising the data volume processed by AI agents, limiting the potential impact of a context-window data leakage or agent compromise incident.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Per-agent initially, potentially organisation-wide through data propagation across delegation chains, logging systems, and shared context stores |
Consequence chain: Without connector data return minimisation, every tool call returns the maximum data the connector can provide rather than the minimum the agent needs. The immediate technical failure is excessive data ingestion — fields, records, and documents beyond the task scope entering the agent's processing context. The first-order impact is data protection violation: personal data processed beyond what is necessary violates GDPR Article 5(1)(c), HIPAA's Minimum Necessary Rule, and equivalent provisions in other jurisdictions. The second-order impact is expanded attack surface: every unnecessary field in the agent's context is a field that can be extracted through prompt injection, output observation, or log analysis. An agent that ingested only a balance figure cannot leak a credit score, regardless of how sophisticated the attack — the data simply is not there. An agent that ingested the full customer record can leak any field. The third-order impact is data propagation: excessive data in context may be forwarded through delegation chains to sub-agents in other jurisdictions (creating cross-border transfer violations under AG-048), persisted in conversation logs (creating retention violations under AG-016), or surfaced in agent outputs to end users (creating unauthorised disclosure). The fourth-order impact is financial: excessive data volume increases per-token processing costs (relevant to AG-375), consumes context window capacity that could be used for governance instructions (relevant to AG-361), and increases storage costs for logs and audit trails. The cumulative business consequence includes regulatory fines for systematic data minimisation violations (potentially calculated per data subject, scaling to millions of pounds for high-volume deployments), litigation from affected data subjects, loss of customer trust, and remediation costs spanning the entire connector infrastructure.
Cross-reference note: AG-376 intersects with AG-014 (Data Classification Governance) for field-level classification, AG-369 (Connector Capability Whitelist Governance) for connector access control, AG-370 (Tool Schema Integrity Governance) for schema validation, AG-375 (Tool Billing and Spend Cap Governance) for cost reduction through data volume minimisation, AG-015 (PII & Sensitive Data Handling) for downstream data protection, AG-016 (Data Retention & Right to Erasure) for retention of filtered data, AG-095 (Prompt Integrity Governance) for context-window data extraction resistance, and AG-048 (Cross-Border Data Sovereignty Governance) for jurisdictional data transfer controls.