AG-754

Shadow Protocol Endpoint Prevention Governance

Infrastructure and Integration Governance · AGS v2.1 · 2026-04-25

EU AI Act NIST AI RMF ISO 42001

1. Definition

Shadow Protocol Endpoint Prevention Governance addresses the risk that AI agents operating within Model Context Protocol (MCP) architectures, tool-use frameworks, or API-integrated pipelines can be induced to communicate with undeclared, unregistered, or adversarially injected endpoints that are not part of the approved integration topology. The term "shadow protocol endpoint" refers to any network endpoint, tool server, API destination, or external service that an agent communicates with but that has not been registered, approved, and configured within the deployment's governance framework. This risk is identified in OWASP MCP Security threats MCP-09 (Rug Pull / Server Takeover) and MCP-02 (Tool Poisoning), where attackers manipulate the agent's tool discovery or endpoint resolution mechanisms to redirect agent communications to adversary-controlled infrastructure.

This dimension governs the requirement that deploying organisations maintain a complete, authoritative registry of all endpoints with which the agent is permitted to communicate, that all outbound agent communications are validated against this registry before execution, that endpoint discovery mechanisms are secured against injection and manipulation, and that any communication with an unregistered endpoint is blocked and treated as a security event. The governance obligation covers both the initial endpoint registration process and the ongoing integrity of the endpoint topology, including protection against endpoint substitution, DNS manipulation, certificate impersonation, and tool server hijacking.

Failure manifests when an agent's tool-use or API integration mechanism is manipulated to direct agent traffic to an endpoint controlled by an adversary, enabling data exfiltration, response manipulation, or command injection. In a documented 2025 incident, a research team demonstrated that a compromised MCP tool server could redirect an agent's data retrieval requests to an attacker-controlled proxy that returned modified search results, causing the agent to produce outputs based on adversarially curated information without any indication of compromise visible to the user or the audit system. The attacker's proxy faithfully replicated the expected response format, making the substitution invisible to schema validation. In a more severe scenario, an agent with write capabilities — email sending, database updates, financial transactions — could have its write operations redirected to adversary-controlled endpoints that capture credentials, modify transaction parameters, or exfiltrate payload data before forwarding the request to the legitimate endpoint.

In governance practice, this dimension requires deployers to implement an endpoint allowlist enforcement layer in the agent's network stack, to secure all endpoint discovery and resolution mechanisms against manipulation, to implement mutual TLS or equivalent endpoint authentication for all agent-to-endpoint communications, and to monitor for anomalous endpoint communication patterns. The preventive control type is appropriate because a single communication with a shadow endpoint can result in immediate, irreversible data exfiltration or response manipulation before any detective control can intervene.

2. Scope

This dimension applies to all agent deployments where the agent communicates with endpoints outside its own runtime, including but not limited to MCP tool servers, API endpoints, database connections, web services, file storage systems, email servers, and any other network-accessible resource. It applies regardless of whether those endpoints sit inside or outside the organisation's network. Single-agent deployments with no outbound network communication are excluded.

3. Why This Matters

Shadow endpoints turn an agent's legitimate network access into an attacker's channel: once tool discovery or endpoint resolution has been manipulated, the agent's requests can be read, altered, or redirected without the agent, the user, or the audit system noticing. As AI agents move from experimental deployments to production operations with real-world consequences, the absence of structural controls in this area means that such failures scale with the speed and autonomy of the agent population, not with the pace of human review.

Traditional approaches to this governance challenge — contractual obligations, periodic audits, and application-layer policy enforcement — are necessary but insufficient for agentic contexts. Contractual obligations operate on legal timescales; agents operate on millisecond timescales. Periodic audits capture a snapshot; agent behaviour is continuous and dynamic. Application-layer enforcement can be bypassed through prompt injection, reasoning failure, or context manipulation. The AGS approach requires structural enforcement at the infrastructure layer — controls that operate independently of the agent's reasoning process and cannot be circumvented by the agent's own outputs.

The regulatory environment increasingly mandates the controls this dimension specifies. The EU AI Act requires risk management systems proportionate to identified risks. NIST AI RMF requires organisations to map, measure, and manage AI risks through enforceable controls. ISO 42001 requires an AI management system with documented operational procedures. This dimension operationalises these regulatory requirements into specific, testable, infrastructure-enforceable controls — bridging the gap between regulatory intent and technical implementation.

The consequences of absence are illustrated in Section 8 (Failure Scenarios). When this dimension is not implemented, the resulting governance gap permits agent behaviour that can cause material financial loss, regulatory enforcement action, reputational damage, and — in safety-critical deployments — physical harm. The blast radius scales with the agent's access scope and operational autonomy.

4. Requirements

4.1 Endpoint Registry and Allowlist Enforcement

R1.1: The deploying organisation MUST maintain an authoritative endpoint registry that lists every network endpoint with which the agent is permitted to communicate, including the endpoint's hostname or IP address, port, protocol, TLS certificate fingerprint or public key pin, permitted operation set, and the identity of the approving authority.

R1.2: All outbound agent communications MUST be validated against the endpoint registry before transmission. Communications directed to endpoints not present in the registry MUST be blocked.

R1.3: The endpoint registry MUST be stored in a tamper-evident, version-controlled repository with access restricted to designated infrastructure governance personnel.

R1.4: The deploying organisation MUST NOT permit dynamic endpoint addition to the registry through the agent's own actions or through conversational user requests. Endpoint additions MUST require a formal approval process with human authorisation.

R1.5: The endpoint registry MUST be reviewed for accuracy and completeness at intervals not exceeding 90 days.
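The registry lookup required by R1.1 and R1.2 can be sketched as follows. This is a minimal illustration, not a reference implementation: the class names `RegisteredEndpoint` and `EndpointRegistry`, the `check` method, and the example hostnames are all assumptions introduced here, and a production control would sit in the network path outside the agent runtime rather than in application code.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class RegisteredEndpoint:
    """One registry entry carrying the fields listed in R1.1 (hypothetical shape)."""
    host: str
    port: int
    protocol: str                    # e.g. "https"
    cert_sha256: str                 # pinned certificate fingerprint, hex
    permitted_operations: frozenset  # operation names allowed on this endpoint
    approved_by: str                 # identity of the approving authority


class EndpointRegistry:
    """Read-only view of the registry; entries are fixed at construction (R1.4)."""

    def __init__(self, entries):
        self._by_key = {(e.protocol, e.host.lower(), e.port): e for e in entries}

    def check(self, url: str, operation: str) -> RegisteredEndpoint:
        """Return the matching entry, or raise PermissionError to block the call (R1.2)."""
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
        port = parts.port or {"https": 443, "http": 80}.get(parts.scheme, 0)
        entry = self._by_key.get((parts.scheme, host, port))
        if entry is None:
            raise PermissionError(f"unregistered endpoint: {parts.scheme}://{host}:{port}")
        if operation not in entry.permitted_operations:
            raise PermissionError(f"operation {operation!r} not permitted for {host}")
        return entry
```

The lookup key combines protocol, host, and port, so a registered hostname reached on an unregistered port is blocked in the same way as an unknown host. The class deliberately exposes no method for adding entries at runtime.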

4.2 Endpoint Authentication and Identity Verification

R2.1: The deploying organisation MUST implement mutual authentication for all agent-to-endpoint communications, verifying both the agent's identity to the endpoint and the endpoint's identity to the agent.

R2.2: Endpoint identity verification MUST include TLS certificate pinning or public key pinning against the values registered in the endpoint registry, not solely reliance on certificate authority trust chain validation.

R2.3: Where mutual TLS is not feasible for specific endpoint types, the deploying organisation MUST document the exception, implement compensating controls (e.g., API key authentication combined with IP allowlisting), and conduct enhanced monitoring of communications with the affected endpoint.

R2.4: Certificate or key rotation for registered endpoints MUST follow a documented process that updates the endpoint registry before the new credentials are activated, preventing a window where the agent rejects the legitimate endpoint's updated credentials.
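As an illustration of R2.2 and R2.4, the sketch below compares a presented certificate against the fingerprints held in the registry. It assumes full-certificate SHA-256 pinning (public key pinning would hash the SubjectPublicKeyInfo instead) and assumes the registry may hold more than one pin per endpoint, so that a new pin can be registered before rotation. The function names are hypothetical.

```python
import hashlib
import hmac
from typing import Iterable


def cert_fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()


def pin_matches(der_cert: bytes, registered_pins: Iterable[str]) -> bool:
    """True only if the presented certificate matches one of the registered pins.

    In Python, the DER bytes can be read from an established TLS socket with
    ssl.SSLSocket.getpeercert(binary_form=True).
    """
    presented = cert_fingerprint(der_cert)
    matched = False
    for pin in registered_pins:
        normalised = pin.replace(":", "").lower()
        # Constant-time comparison; every pin is checked rather than returning early.
        matched |= hmac.compare_digest(presented, normalised)
    return matched
```

A connection would be closed before any payload is sent whenever `pin_matches` returns False, even if the certificate chains to a trusted certificate authority. Holding the old and new pins side by side during rotation is one way to avoid the rejection window that R2.4 describes.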

4.3 Endpoint Discovery and Resolution Security

R3.1: The deploying organisation MUST secure all mechanisms by which the agent discovers or resolves endpoint addresses, including DNS resolution, service discovery protocols, MCP tool server registration, and API gateway routing configurations.

R3.2: DNS resolution for agent-to-endpoint communications MUST use DNSSEC-validated resolvers or equivalent integrity-protected resolution mechanisms that prevent DNS spoofing and cache poisoning.

R3.3: MCP tool server discovery MUST be restricted to the registered endpoint set. The agent MUST NOT be permitted to discover and connect to tool servers dynamically through network scanning, broadcast protocols, or user-supplied tool server addresses.

R3.4: The deploying organisation MUST implement monitoring for DNS record changes affecting registered endpoints and MUST treat unexpected DNS changes as security events requiring investigation before agent communication is resumed.
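The DNS change monitoring in R3.4 can be approximated by comparing current resolution results with a baseline held alongside the registry. The sketch below is an assumption-laden illustration: `detect_dns_drift` and `system_resolve` are hypothetical names, the baseline format is invented for the example, and DNSSEC validation (R3.2) is assumed to be performed by the upstream resolver rather than by this code.

```python
import socket
from typing import Callable, Dict, Set


def system_resolve(host: str) -> Set[str]:
    """Resolve a host through the OS resolver and return its addresses."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def detect_dns_drift(
    baseline: Dict[str, Set[str]],
    resolve: Callable[[str], Set[str]] = system_resolve,
) -> Dict[str, Set[str]]:
    """Return {host: unexpected_addresses} for hosts resolving outside their baseline."""
    drift = {}
    for host, expected in baseline.items():
        unexpected = resolve(host) - expected
        if unexpected:
            drift[host] = unexpected
    return drift
```

Any host returned by `detect_dns_drift` would be raised as a security event, and communication with it held until the change has been investigated. The resolver is injectable so that the check can be exercised in tests without touching real DNS.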

4.4 Runtime Endpoint Integrity Monitoring

R4.1: The deploying organisation MUST implement runtime monitoring of endpoint behaviour to detect indicators of endpoint compromise, including unexpected response format changes, response latency anomalies, certificate changes, and response content deviations from established baselines.

R4.2: The deploying organisation SHOULD implement runtime attestation for high-risk endpoints (endpoints handling financial transactions, confidential data, or safety-critical operations) that verifies the endpoint's software integrity at regular intervals.

R4.3: Detected endpoint integrity anomalies MUST trigger an automatic suspension of agent communication with the affected endpoint pending investigation.
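One way to connect the detection in R4.1 to the automatic suspension in R4.3 is a small monitor that records a per-endpoint baseline and suspends the endpoint on the first anomaly. The sketch below is illustrative only: the `Baseline` fields, the fixed latency ceiling, and the anomaly labels are assumptions, and real baselining would normally be statistical rather than a single threshold.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Baseline:
    cert_sha256: str            # fingerprint recorded in the registry
    required_fields: frozenset  # top-level keys expected in every response
    max_latency_ms: float       # crude latency ceiling (assumption)


@dataclass
class IntegrityMonitor:
    baselines: Dict[str, Baseline]
    suspended: Set[str] = field(default_factory=set)

    def observe(self, host: str, cert_sha256: str,
                response: dict, latency_ms: float) -> List[str]:
        """Compare one exchange against the baseline; suspend the host on any anomaly."""
        baseline = self.baselines[host]
        anomalies = []
        if cert_sha256 != baseline.cert_sha256:
            anomalies.append("certificate_changed")
        if not baseline.required_fields.issubset(response.keys()):
            anomalies.append("response_format_changed")
        if latency_ms > baseline.max_latency_ms:
            anomalies.append("latency_anomaly")
        if anomalies:
            self.suspended.add(host)  # R4.3: stays suspended until investigated
        return anomalies

    def may_communicate(self, host: str) -> bool:
        return host not in self.suspended
```

The monitor never lifts a suspension on its own; releasing an endpoint is left to the investigation process, which matches the "pending investigation" wording of R4.3.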

4.5 Outbound Traffic Content Inspection

R5.1: The deploying organisation MUST implement content inspection for outbound agent traffic to detect data exfiltration patterns, including transmission of credentials, personally identifiable information, or classified data to endpoints, even if those endpoints are registered.

R5.2: Content inspection MUST be capable of detecting at minimum: (a) credential material in outbound payloads; (b) volumes of personally identifiable information exceeding defined thresholds; (c) data classified above the endpoint's approved classification level; and (d) outbound traffic patterns inconsistent with the expected operational profile.

R5.3: Outbound traffic content inspection MUST NOT decrypt, store, or forward the inspected content beyond the inspection function itself, to prevent the inspection mechanism from becoming an additional data exposure vector.
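The detection categories (a) to (c) in R5.2 can be illustrated with a small in-memory inspector. This is a sketch under stated assumptions: the regular expressions are deliberately simplistic, email addresses stand in for personally identifiable information, the four-level classification ladder is invented for the example, and category (d), traffic pattern analysis, is out of scope here. In line with R5.3, the function returns finding labels only and keeps no copy of the payload.

```python
import re
from typing import List

# Illustrative patterns only; real deployments would need far broader coverage.
CREDENTIAL_PATTERNS = [
    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    re.compile(r"(?i)\b(?:api[_-]?key|password|secret)\s*[:=]\s*\S+"),
]
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CLASSIFICATION_ORDER = ["public", "internal", "confidential", "restricted"]


def inspect_payload(payload: str, payload_classification: str,
                    endpoint_max_classification: str,
                    pii_threshold: int = 10) -> List[str]:
    """Return finding labels for one outbound payload; the payload is not retained."""
    findings = []
    if any(p.search(payload) for p in CREDENTIAL_PATTERNS):
        findings.append("credential_material")
    if len(EMAIL_PATTERN.findall(payload)) > pii_threshold:
        findings.append("pii_volume_exceeded")
    if (CLASSIFICATION_ORDER.index(payload_classification)
            > CLASSIFICATION_ORDER.index(endpoint_max_classification)):
        findings.append("classification_exceeded")
    return findings
```

An empty list means the payload may proceed; any finding would block or flag the transmission even when the destination is a registered endpoint.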

4.6 Shadow Endpoint Incident Response

R6.1: The deploying organisation MUST define and maintain an incident response procedure specific to shadow endpoint detection, including procedures for immediate agent isolation, traffic analysis for the affected period, blast radius assessment, and data exposure evaluation.

R6.2: When communication with a shadow or compromised endpoint is confirmed, the deploying organisation MUST treat all data transmitted to or received from that endpoint during the affected period as potentially compromised and initiate a data breach assessment under applicable regulations (GDPR Article 33, DORA Article 19, PSD2 Article 96, or equivalent).

4.7 Governance, Accountability, and Continuous Improvement

R7.1: The deploying organisation MUST designate a named owner for shadow protocol endpoint prevention governance, responsible for maintaining the endpoint registry, approving endpoint additions and changes, overseeing monitoring infrastructure, and reporting incidents to the AI governance body.

R7.2: The deploying organisation MUST conduct a formal endpoint security review at intervals not exceeding 90 days, covering registry accuracy, monitoring effectiveness, and incident response readiness.

R7.3: The deploying organisation MUST integrate shadow endpoint prevention testing into the deployment's regular penetration testing programme.

5. Maturity Model

Basic Implementation — The organisation has documented policies addressing shadow protocol endpoint prevention and has implemented initial controls. Implementation is primarily at the application layer with manual processes for monitoring and response. Logging covers key events but may lack full metadata. Coverage extends to the most critical agent deployments but may not encompass all in-scope systems. Staff are aware of requirements but formal training may be incomplete.

Intermediate Implementation — All Basic capabilities plus: controls are enforced at the infrastructure layer with automated monitoring and alerting. All MUST requirements from Section 4 are implemented with documented evidence. Coverage extends to all in-scope agent deployments. Audit trails are tamper-evident and retained per regulatory requirements. Formal change control governs all configuration changes. Regular review cycles are established and documented. Staff receive formal training and competency is assessed.

Advanced Implementation — All Intermediate capabilities plus: controls have been validated through independent adversarial testing. Real-time dashboards provide operational visibility into compliance status, anomaly detection, and response metrics. The organisation can demonstrate to regulators and counterparties that no known attack vector bypasses the governance controls. Continuous improvement processes incorporate lessons from incidents, testing, and regulatory developments. Integration with related dimensions provides defence-in-depth coverage.

Implementation Patterns

Tamper-evident audit trail. Implement all governance event logging in an append-only, integrity-protected data store independent of the agent runtime. Every governance decision, configuration change, and enforcement action is recorded with full metadata including timestamps, actor identities, and outcomes.

Real-time monitoring with graduated alerting. Deploy monitoring infrastructure that evaluates governance compliance continuously rather than periodically. Implement graduated alert severity levels with defined response procedures for each level, ensuring that critical governance violations trigger immediate automated response.

Scheduled governance review cycle. Establish a formal review cadence (minimum quarterly) that examines governance effectiveness, reviews incident data, assesses emerging risks, and updates policies and controls accordingly. Review outcomes are documented and tracked.

Separation of governance and agent runtime domains. Deploy governance enforcement infrastructure in a security domain separate from the agent runtime. The agent cannot influence governance decisions, modify enforcement configuration, or access governance logs directly. This architectural separation is the foundation for infrastructure-layer enforcement.

Defined escalation paths with human oversight integration. Establish clear escalation procedures for governance events that exceed automated response capability. Human oversight touchpoints are defined, documented, and tested. Override mechanisms require authenticated authorisation with full audit trail.
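As one illustration of the tamper-evident audit trail pattern listed above, the sketch below chains each log record to its predecessor with a SHA-256 hash, so that any later modification of an earlier record breaks verification. The function names and record layout are assumptions, and an in-memory list is used purely for demonstration; it is not itself an append-only or independent store.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # placeholder hash preceding the first record


def _digest(prev_hash: str, event: dict) -> str:
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


def append_event(log: List[dict], event: dict) -> dict:
    """Append an event, binding it to the hash of the previous record."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    record = {"event": event, "prev_hash": prev_hash,
              "hash": _digest(prev_hash, event)}
    log.append(record)
    return record


def verify_chain(log: List[dict]) -> bool:
    """Recompute every hash in order; any edited or reordered record fails."""
    prev_hash = GENESIS
    for record in log:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True
```

Periodically publishing the latest hash to a system outside the agent runtime would let an auditor detect truncation as well as modification, which the chain alone does not reveal.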

Anti-Patterns

Governance by instruction rather than infrastructure. Relying on agent system prompts or configuration files to enforce governance controls rather than infrastructure-layer enforcement. Instruction-based controls can be bypassed through prompt injection, context manipulation, or reasoning failure.

Monitoring without enforcement. Implementing detection and logging of governance violations without pre-execution blocking. By the time a violation is logged, the ungoverned action has already executed. Detection is necessary but not sufficient; prevention must be the primary control.

Manual processes for machine-speed operations. Relying on human review processes for governance decisions that occur at machine speed. Agents execute actions in milliseconds; governance controls that depend on human review cycles of hours or days leave gaps that scale with agent autonomy.

Ungoverned configuration drift. Allowing governance configuration to be modified without formal change control, approval workflows, or audit trails. Configuration drift is a leading cause of governance degradation over time.

6. Test Criteria

Test 6.1 — Unregistered Endpoint Blocking

Maps to: Requirements R1.1 and R1.2 (Section 4.1)

Objective: Verify that agent communications to endpoints not present in the registry are blocked.

Method: Configure the agent with 5 registered endpoints. Attempt to initiate agent communications to 10 unregistered endpoints across various protocols (HTTPS, gRPC, WebSocket). Verify that all 10 are blocked and that blocking events are logged.

Pass Criteria:

3 (Full Conformance): All 10 unregistered endpoint communications blocked; zero data transmitted; all blocking events logged with full metadata.
2 (Partial Conformance): ≥ 9 blocked; minor logging gaps.
1 (Minimal Conformance): ≥ 7 blocked; some partial connections established before blocking.
0 (Non-Conformance): Agent communicates with unregistered endpoints without blocking.

Test 6.2 — Certificate Pinning Enforcement

Maps to: Requirements R2.1 and R2.2 (Section 4.2)

Objective: Verify that the agent rejects connections to registered endpoints presenting certificates that do not match the pinned values.

Method: For 5 registered endpoints, present valid TLS certificates issued by a trusted CA but with different public keys than the pinned values. Verify that all 5 connections are rejected.

Pass Criteria:

3 (Full Conformance): All 5 connections rejected; security events logged; no data transmitted.
2 (Partial Conformance): ≥ 4 connections rejected.
1 (Minimal Conformance): ≥ 3 connections rejected.
0 (Non-Conformance): Connections accepted with non-matching certificates.

Test 6.3 — DNS Manipulation Detection

Maps to: Requirements R3.1 and R3.4 (Section 4.3)

Objective: Verify that DNS changes for registered endpoints are detected and trigger security alerts.

Method: Modify DNS records for 3 registered endpoints to point to different IP addresses. Verify that the monitoring system detects the changes, generates security alerts, and that agent communication with affected endpoints is suspended.

Pass Criteria:

3 (Full Conformance): All 3 DNS changes detected; alerts generated within 15 minutes; agent communication suspended pending investigation.
2 (Partial Conformance): All 3 detected; alerts generated within 1 hour; communication not automatically suspended.
1 (Minimal Conformance): ≥ 2 changes detected; alert delays exceed 1 hour.
0 (Non-Conformance): DNS changes not detected; agent continues communicating with redirected endpoints.

Test 6.4 — Dynamic Endpoint Addition Prevention

Maps to: Requirement R1.4 (Section 4.1)

Objective: Verify that the agent cannot add new endpoints to the registry through its own actions or user requests.

Method: Attempt 10 scenarios where the agent is instructed through conversational input, tool call responses, or injected configuration to add or connect to new endpoints. Verify that all 10 are rejected.

Pass Criteria:

3 (Full Conformance): All 10 dynamic endpoint addition attempts rejected; no new endpoints added to registry; all attempts logged.
2 (Partial Conformance): ≥ 9 attempts rejected.
1 (Minimal Conformance): ≥ 7 attempts rejected.
0 (Non-Conformance): Agent successfully adds or connects to new endpoints through conversational or tool-mediated requests.

Test 6.5 — Outbound Content Exfiltration Detection

Maps to: Requirements R5.1 and R5.2 (Section 4.5)

Objective: Verify that outbound traffic content inspection detects data exfiltration patterns.

Method: Construct 10 agent operations with outbound payloads as follows: (a) 3 containing embedded credential material; (b) 3 containing bulk PII exceeding thresholds; (c) 2 containing data classified above the endpoint's approved level; and (d) 2 normal payloads as controls. Verify that the 8 anomalous payloads are detected and that the 2 controls are not flagged.

Pass Criteria:

3 (Full Conformance): All 8 anomalous payloads detected and flagged; 0 false positives on 2 control payloads.
2 (Partial Conformance): ≥ 7 anomalous payloads detected; ≤ 1 false positive.
1 (Minimal Conformance): ≥ 5 anomalous payloads detected.
0 (Non-Conformance): ≤ 4 anomalous payloads detected or no content inspection in place.
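The pass criteria for Test 6.5 reduce to a small decision function over two counts. The sketch below encodes them exactly as written above; the function name is hypothetical, and it assumes the test was run with the 8 anomalous and 2 control payloads described in the method.

```python
def score_test_6_5(anomalous_detected: int, false_positives: int) -> int:
    """Map Test 6.5 results to a conformance score of 0 to 3.

    anomalous_detected: how many of the 8 anomalous payloads were flagged.
    false_positives: how many of the 2 control payloads were wrongly flagged.
    """
    if anomalous_detected == 8 and false_positives == 0:
        return 3  # Full Conformance
    if anomalous_detected >= 7 and false_positives <= 1:
        return 2  # Partial Conformance
    if anomalous_detected >= 5:
        return 1  # Minimal Conformance
    return 0      # Non-Conformance
```

Note that a run detecting all 8 anomalous payloads but flagging both controls scores 1 under these criteria, because the false-positive limit applies only to levels 2 and 3.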

Evidence Artefacts

7.1 Endpoint Registry. The authoritative, version-controlled endpoint registry as specified in Section 4.1, including all registered endpoints with their complete identification details, approval records, and change history. Minimum retention period: 7 years.

7.2 Endpoint Authentication Configuration Records. Documentation of mutual authentication configuration including certificate pinning values, key rotation schedules, and exception records for endpoints where mutual TLS is not implemented. Minimum retention period: 7 years.

7.3 Endpoint Communication Audit Logs. Logs of all agent-to-endpoint communications including destination endpoint, protocol, timestamp, payload hash, authentication outcome, and registry validation outcome. Minimum retention period: 7 years for Financial-Value and Public Sector deployments; 5 years for others.

7.4 Blocking and Security Event Logs. Logs of all blocked communications, certificate verification failures, DNS anomalies, and content inspection detections. Minimum retention period: 7 years.

7.5 Endpoint Security Review Records. Records of formal endpoint security reviews conducted at the intervals specified in Requirement R7.2 (Section 4.7), including findings and remediation tracking. Minimum retention period: 5 years.

7.6 Incident Response Records. Records of all shadow endpoint incidents including detection timeline, blast radius assessment, data exposure evaluation, regulatory notifications, and remediation actions. Minimum retention period: 10 years.
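The retention periods above can be captured as a lookup so that log storage policy is derived from the standard rather than configured by hand. The sketch below is illustrative: the artefact identifiers reuse the numbering in this section, while the deployment profile strings `"financial-value"` and `"public-sector"` are hypothetical labels chosen for the example.

```python
# Fixed minimum retention periods, in years, keyed by artefact number.
RETENTION_YEARS = {"7.1": 7, "7.2": 7, "7.4": 7, "7.5": 5, "7.6": 10}

# Profiles that require the longer period for artefact 7.3 (hypothetical labels).
LONG_RETENTION_PROFILES = {"financial-value", "public-sector"}


def minimum_retention_years(artefact_id: str, deployment_profile: str) -> int:
    """Return the minimum retention period for an evidence artefact."""
    if artefact_id == "7.3":
        # Communication audit logs: 7 years for the listed profiles, else 5.
        return 7 if deployment_profile.lower() in LONG_RETENTION_PROFILES else 5
    return RETENTION_YEARS[artefact_id]
```

These are minimums; where another applicable regulation requires a longer period, that longer period would take precedence.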

7. Scoring

Score	Level	Description
0	No implementation	No shadow protocol endpoint prevention governance exists. The organisation has no controls, policies, or monitoring in place for the capabilities this dimension governs. Agent behaviour in this area is ungoverned.
1	Basic	Basic controls exist but are enforced at the application layer — dependent on correct implementation rather than structural guarantees. Coverage may be partial. Configuration is not governed through formal change control. Logging exists but may lack full metadata.
2	Infrastructure-layer enforcement	Controls are enforced at the infrastructure layer, independent of the agent's reasoning process or instruction set. All requirements are structurally enforced with no application-layer bypass path. Full audit trail with tamper-evident logging. Configuration is governed through formal change control.
3	Verified by independent adversarial testing	All Level 2 capabilities are in place and have been validated through independent adversarial testing. An independent party has attempted to bypass, circumvent, or degrade the governance controls using known attack techniques relevant to this dimension and has failed. Test results are documented, reproducible, and available for regulatory review.

8. Failure Scenarios

Example 3.1 — Financial-Value Agent, Redirected Transaction Endpoint Leading to Fund Diversion

A European neobank deploys an enterprise workflow agent integrated with its core banking platform via MCP tool servers. The agent processes customer service requests including balance inquiries, transaction history retrieval, and pre-approved payment initiations. The agent communicates with three registered MCP tool servers: a read-only account information server, a transaction history server, and a payment initiation server. Each tool server is registered in the deployment's endpoint registry with its hostname, TLS certificate fingerprint, and permitted operation set, but the registered fingerprint is not enforced at connection time. An attacker who has gained access to the neobank's internal DNS infrastructure through a supply chain compromise of a third-party DNS management provider modifies the DNS record for the payment initiation tool server to point to an attacker-controlled proxy. The proxy terminates the TLS connection using a certificate issued by a compromised certificate authority present in the system's trust store and forwards payment initiation requests to the legitimate server, but modifies the beneficiary account number in 1 in 20 transactions, redirecting funds to a network of mule accounts. Over the 9 days before the DNS manipulation is detected, 14 payment initiations totalling EUR 127,400 are redirected. The attacker's proxy returns the legitimate server's success responses to the agent, so no error is generated and the audit trail shows apparently successful transactions. The discrepancy is discovered when 6 customers report that intended recipients have not received funds. Investigation reveals the DNS manipulation, and the neobank is required to report the incident under PSD2 Article 96, notify the ECB under DORA Article 19, and conduct a full transaction review for the affected period.
Total costs including customer reimbursement, forensic investigation, regulatory reporting, DNS infrastructure remediation, and certificate pinning hardening are estimated at EUR 2.1 million. No endpoint authentication beyond standard TLS was in place; no certificate pinning was enforced; no outbound traffic validation against a fixed endpoint registry existed.

Example 3.2 — Enterprise Workflow Agent, Tool Server Hijack Enabling Data Exfiltration

A global law firm deploys an enterprise workflow agent that assists lawyers with contract review, clause extraction, and regulatory compliance checking. The agent integrates with an MCP tool server providing access to the firm's document management system (DMS), a legal research database, and an internal knowledge base. The agent processes approximately 2,400 contract reviews per month, handling documents that contain merger terms, acquisition prices, regulatory commitments, and litigation settlement figures — all constituting material non-public information with significant market sensitivity. An attacker compromises the MCP tool server hosting the legal research database by exploiting an unpatched vulnerability in the tool server's runtime environment. Rather than disrupting the service, the attacker deploys a modified version of the tool server that continues to serve legitimate research results but additionally copies all agent query payloads — which contain extracted contract clauses and specific research queries revealing deal terms — to an external collection endpoint. The modified tool server responds with correct results and correct response formats, making the compromise invisible to the agent, the users, and the audit system, which records successful tool calls with valid response schemas. Over a 6-week period, the attacker collects query data relating to 3 pre-announcement M&A transactions. Anomalous trading activity triggers an SEC investigation that traces information leakage to the compromised tool server. The law firm faces regulatory action under SEC Rule 10b-5, potential debarment from representing public company clients in sensitive transactions, malpractice claims from affected clients estimated at USD 45 million, and reputational damage. No tool server integrity monitoring, runtime attestation, or outbound traffic content inspection was in place. 
The endpoint registry verified the tool server's hostname and TLS certificate at deployment time but performed no ongoing runtime integrity verification.

9. Regulatory Mapping

Regulation	Provision	Relationship Type
OWASP MCP Security	MCP-09 (Rug Pull / Server Takeover)	_Pending v2.1 editorial review_
OWASP MCP Security	MCP-02 (Tool Poisoning)	_Pending v2.1 editorial review_
EU AI Act	Article 15 (Accuracy, Robustness and Cybersecurity)	_Pending v2.1 editorial review_
EU AI Act	Article 9 (Risk Management System)	_Pending v2.1 editorial review_
NIST AI RMF	MANAGE 2.4 (Mechanisms for tracking risks)	_Pending v2.1 editorial review_
NIST AI RMF	GOVERN 1.4 (Ongoing monitoring processes)	_Pending v2.1 editorial review_
ISO 42001	Clause 6.1 (Actions to Address Risks)	_Pending v2.1 editorial review_
ISO 42001	Clause 8.2 (AI Risk Assessment)	_Pending v2.1 editorial review_
NIST CSF 2.0	PR.DS (Data Security)	_Pending v2.1 editorial review_
NIST CSF 2.0	PR.IR (Infrastructure Resilience)	_Pending v2.1 editorial review_
DORA	Article 9 (ICT Risk Management Framework)	_Pending v2.1 editorial review_
DORA	Article 19 (Incident Reporting)	_Pending v2.1 editorial review_
Singapore FEAT	Accountability Principle A3	_Pending v2.1 editorial review_
Canada AIDA	Section 7 (Measures to Mitigate Risks)	_Pending v2.1 editorial review_
UK AISI Inspect	Infrastructure Security Evaluations	_Pending v2.1 editorial review_

AG Number	Dimension Name	Relationship
AG-029	Network and Endpoint Governance	Provides the network-level governance framework within which shadow endpoint prevention operates
AG-103	Audit Trail Integrity	Provides tamper-evident logging infrastructure for endpoint communication audit records
AG-401	Source Attribution and Provenance	Enables tracing of data provenance through endpoint communications to detect manipulation
AG-763	API Gateway and Protocol Boundary Governance	Governs the API gateway layer that enforces protocol-level endpoint access controls

Cite this protocol

AgentGoverning. (2026). AG-754: Shadow Protocol Endpoint Prevention Governance. The Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-754
