AG-754

Shadow Protocol Endpoint Prevention Governance

Infrastructure and Integration Governance · AGS v2.1 · 2026-04-25
EU AI Act · NIST AI RMF · ISO 42001

1. Definition

Shadow Protocol Endpoint Prevention Governance addresses the risk that AI agents operating within Model Context Protocol (MCP) architectures, tool-use frameworks, or API-integrated pipelines can be induced to communicate with undeclared, unregistered, or adversarially injected endpoints that are not part of the approved integration topology. The term "shadow protocol endpoint" refers to any network endpoint, tool server, API destination, or external service that an agent communicates with but that has not been registered, approved, and configured within the deployment's governance framework. This risk is identified in OWASP MCP Security threats MCP-09 (Rug Pull / Server Takeover) and MCP-02 (Tool Poisoning), where attackers manipulate the agent's tool discovery or endpoint resolution mechanisms to redirect agent communications to adversary-controlled infrastructure.

This dimension governs the requirement that deploying organisations maintain a complete, authoritative registry of all endpoints with which the agent is permitted to communicate, that all outbound agent communications are validated against this registry before execution, that endpoint discovery mechanisms are secured against injection and manipulation, and that any communication with an unregistered endpoint is blocked and treated as a security event. The governance obligation covers both the initial endpoint registration process and the ongoing integrity of the endpoint topology, including protection against endpoint substitution, DNS manipulation, certificate impersonation, and tool server hijacking.

Failure manifests when an agent's tool-use or API integration mechanism is manipulated to direct agent traffic to an endpoint controlled by an adversary, enabling data exfiltration, response manipulation, or command injection. In a demonstration documented in 2025, a research team showed that a compromised MCP tool server could redirect an agent's data retrieval requests to an attacker-controlled proxy that returned modified search results. The agent then produced outputs based on adversarially curated information, with no indication of compromise visible to the user or the audit system. Because the attacker's proxy faithfully replicated the expected response format, schema validation could not detect the substitution. In a more severe scenario, an agent with write capabilities (email sending, database updates, financial transactions) could have its write operations redirected to adversary-controlled endpoints that capture credentials, modify transaction parameters, or exfiltrate payload data before forwarding the request to the legitimate endpoint.

In governance practice, this dimension requires deployers to implement an endpoint allowlist enforcement layer in the agent's network stack, to secure all endpoint discovery and resolution mechanisms against manipulation, to implement mutual TLS or equivalent endpoint authentication for all agent-to-endpoint communications, and to monitor for anomalous endpoint communication patterns. The preventive control type is appropriate because a single communication with a shadow endpoint can result in immediate, irreversible data exfiltration or response manipulation before any detective control can intervene.
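The allowlist check described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the `RegisteredEndpoint` shape, the `REGISTRY` contents, the hostnames, and the function names are all hypothetical, and a production control would sit in the network path (proxy, sidecar, or egress gateway) rather than in agent-side code.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class RegisteredEndpoint:
    """One approved destination (hypothetical registry entry shape)."""
    scheme: str
    host: str
    port: int


# Hypothetical registry contents. In practice this would be loaded from a
# version-controlled, change-managed source outside the agent runtime.
REGISTRY = frozenset({
    RegisteredEndpoint("https", "payments.internal.example", 443),
    RegisteredEndpoint("https", "research.internal.example", 8443),
})

# Ports assumed when the URL does not state one explicitly.
DEFAULT_PORTS = {"https": 443, "wss": 443}


def is_permitted(url: str, registry: frozenset = REGISTRY) -> bool:
    """Return True only if the URL's scheme, host, and port are registered."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError for a malformed port
    except ValueError:
        return False
    scheme = parts.scheme.lower()
    host = parts.hostname or ""  # hostname is already lower-cased
    if port is None:
        port = DEFAULT_PORTS.get(scheme)
    if not host or port is None:
        return False
    return RegisteredEndpoint(scheme, host, port) in registry


def guard_outbound(url: str) -> None:
    """Fail closed: raise before any connection is attempted."""
    if not is_permitted(url):
        raise PermissionError(f"blocked unregistered endpoint: {url}")
```

The match is exact on scheme, host, and port, so a look-alike host such as `payments.internal.example.evil.test` or a plaintext `http` variant of a registered host is rejected rather than partially matched.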

2. Scope

This dimension applies to all agent deployments in which the agent communicates over a network with any endpoint, including but not limited to MCP tool servers, API endpoints, database connections, web services, file storage systems, email servers, and any other network-accessible resource. It applies whether the endpoint is internal to the organisation's network or external to it. Single-agent deployments with no outbound network communication are excluded.

3. Why This Matters

Shadow Protocol Endpoint Prevention Governance addresses a governance gap that, if left unmanaged, creates systemic risk across the agent ecosystem. As AI agents move from experimental deployments to production operations with real-world consequences, the absence of structural controls in this area means that failures scale with the speed and autonomy of the agent population — not at the pace of human review.

Traditional approaches to this governance challenge — contractual obligations, periodic audits, and application-layer policy enforcement — are necessary but insufficient for agentic contexts. Contractual obligations operate on legal timescales; agents operate on millisecond timescales. Periodic audits capture a snapshot; agent behaviour is continuous and dynamic. Application-layer enforcement can be bypassed through prompt injection, reasoning failure, or context manipulation. The AGS approach requires structural enforcement at the infrastructure layer — controls that operate independently of the agent's reasoning process and cannot be circumvented by the agent's own outputs.

The regulatory environment increasingly mandates the controls this dimension specifies. The EU AI Act requires risk management systems proportionate to identified risks. NIST AI RMF requires organisations to map, measure, and manage AI risks through enforceable controls. ISO 42001 requires an AI management system with documented operational procedures. This dimension operationalises these regulatory requirements into specific, testable, infrastructure-enforceable controls — bridging the gap between regulatory intent and technical implementation.

The consequences of absence are illustrated in Section 8 (Failure Scenarios). When this dimension is not implemented, the resulting governance gap permits agent behaviour that can cause material financial loss, regulatory enforcement action, reputational damage, and — in safety-critical deployments — physical harm. The blast radius scales with the agent's access scope and operational autonomy.

4. Requirements

4.1 Endpoint Registry and Allowlist Enforcement

4.2 Endpoint Authentication and Identity Verification

4.3 Endpoint Discovery and Resolution Security

4.4 Runtime Endpoint Integrity Monitoring

4.5 Outbound Traffic Content Inspection

4.6 Shadow Endpoint Incident Response

4.7 Governance, Accountability, and Continuous Improvement

5. Maturity Model

Basic Implementation — The organisation has documented policies addressing shadow protocol endpoint prevention and has implemented initial controls. Implementation is primarily at the application layer with manual processes for monitoring and response. Logging covers key events but may lack full metadata. Coverage extends to the most critical agent deployments but may not encompass all in-scope systems. Staff are aware of requirements but formal training may be incomplete.

Intermediate Implementation — All Basic capabilities plus: controls are enforced at the infrastructure layer with automated monitoring and alerting. All MUST requirements from Section 4 are implemented with documented evidence. Coverage extends to all in-scope agent deployments. Audit trails are tamper-evident and retained per regulatory requirements. Formal change control governs all configuration changes. Regular review cycles are established and documented. Staff receive formal training and competency is assessed.

Advanced Implementation — All Intermediate capabilities plus: controls have been validated through independent adversarial testing. Real-time dashboards provide operational visibility into compliance status, anomaly detection, and response metrics. The organisation can demonstrate to regulators and counterparties that no known attack vector bypasses the governance controls. Continuous improvement processes incorporate lessons from incidents, testing, and regulatory developments. Integration with related dimensions provides defence-in-depth coverage.

Implementation Patterns

Tamper-evident audit trail. Implement all governance event logging in an append-only, integrity-protected data store independent of the agent runtime. Every governance decision, configuration change, and enforcement action is recorded with full metadata including timestamps, actor identities, and outcomes.
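One common way to make a log tamper-evident is a hash chain, in which each record commits to the hash of the record before it. The sketch below illustrates the idea only: the in-memory list stands in for an append-only store, the record fields and function names are hypothetical, and a chain detects modification only if its latest hash is also anchored somewhere the agent runtime cannot write.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous record's hash together with a canonical event body."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(log: list, event: dict) -> dict:
    """Append an event, linking it to the hash of the previous record."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    record = {"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)}
    log.append(record)
    return record


def verify_chain(log: list) -> bool:
    """Recompute every link; any edited, removed, or reordered record fails."""
    prev_hash = GENESIS
    for record in log:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True
```

Changing any field of an earlier event changes its digest, which breaks that record and every link after it, so `verify_chain` returns `False`.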

Real-time monitoring with graduated alerting. Deploy monitoring infrastructure that evaluates governance compliance continuously rather than periodically. Implement graduated alert severity levels with defined response procedures for each level, ensuring that critical governance violations trigger immediate automated response.
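Graduated alerting can be expressed as two lookups: event type to severity, then severity to a defined response. The event names, severity levels, and response labels below are hypothetical placeholders; the one design choice worth noting is that unrecognised event types default to the highest severity rather than being silently ignored.

```python
from enum import IntEnum


class Severity(IntEnum):
    INFO = 1
    WARNING = 2
    CRITICAL = 3


# Hypothetical event types; a real deployment would define its own taxonomy.
SEVERITY_BY_EVENT = {
    "registry_change_approved": Severity.INFO,
    "certificate_expiry_approaching": Severity.WARNING,
    "dns_record_changed": Severity.CRITICAL,
    "unregistered_endpoint_blocked": Severity.CRITICAL,
}

# Hypothetical response procedures, one per severity level.
RESPONSE_BY_SEVERITY = {
    Severity.INFO: "log_only",
    Severity.WARNING: "notify_on_call",
    Severity.CRITICAL: "suspend_endpoint_and_page",
}


def classify(event_type: str) -> Severity:
    """Unknown event types fail toward the highest severity."""
    return SEVERITY_BY_EVENT.get(event_type, Severity.CRITICAL)


def respond(event_type: str) -> str:
    """Return the response procedure defined for the event's severity."""
    return RESPONSE_BY_SEVERITY[classify(event_type)]
```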

Scheduled governance review cycle. Establish a formal review cadence (minimum quarterly) that examines governance effectiveness, reviews incident data, assesses emerging risks, and updates policies and controls accordingly. Review outcomes are documented and tracked.

Separation of governance and agent runtime domains. Deploy governance enforcement infrastructure in a security domain separate from the agent runtime. The agent cannot influence governance decisions, modify enforcement configuration, or access governance logs directly. This architectural separation is the foundation for infrastructure-layer enforcement.

Defined escalation paths with human oversight integration. Establish clear escalation procedures for governance events that exceed automated response capability. Human oversight touchpoints are defined, documented, and tested. Override mechanisms require authenticated authorisation with full audit trail.

Anti-Patterns

Governance by instruction rather than infrastructure. Relying on agent system prompts or configuration files to enforce governance controls rather than infrastructure-layer enforcement. Instruction-based controls can be bypassed through prompt injection, context manipulation, or reasoning failure.

Monitoring without enforcement. Implementing detection and logging of governance violations without pre-execution blocking. By the time a violation is logged, the ungoverned action has already executed. Detection is necessary but not sufficient; prevention must be the primary control.

Manual processes for machine-speed operations. Relying on human review processes for governance decisions that occur at machine speed. Agents execute actions in milliseconds; governance controls that depend on human review cycles of hours or days leave gaps that scale with agent autonomy.

Ungoverned configuration drift. Allowing governance configuration to be modified without formal change control, approval workflows, or audit trails. Configuration drift is a leading cause of governance degradation over time.

6. Test Criteria

Test 6.1 — Unregistered Endpoint Blocking

Maps to: Sections 4.1.1 and 4.1.2

Objective: Verify that agent communications to endpoints not present in the registry are blocked.

Method: Configure the agent with 5 registered endpoints. Attempt to initiate agent communications to 10 unregistered endpoints across various protocols (HTTPS, gRPC, WebSocket). Verify that all 10 are blocked and that blocking events are logged.

Pass Criteria:

Test 6.2 — Certificate Pinning Enforcement

Maps to: Sections 4.2.1 and 4.2.2

Objective: Verify that the agent rejects connections to registered endpoints presenting certificates that do not match the pinned values.

Method: For 5 registered endpoints, present valid TLS certificates issued by a trusted CA but with different public keys than the pinned values. Verify that all 5 connections are rejected.
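The comparison this method exercises can be sketched as a public-key pin check. The example assumes the pin is a SHA-256 digest of the DER-encoded SubjectPublicKeyInfo, which is one common pinning convention but is not stated in this protocol; extracting that structure from a live TLS session is out of scope here, and the function names are hypothetical.

```python
import hashlib
import hmac


def spki_pin(spki_der: bytes) -> str:
    """SHA-256 hex digest of a DER-encoded SubjectPublicKeyInfo (assumed pin format)."""
    return hashlib.sha256(spki_der).hexdigest()


def pin_matches(spki_der: bytes, pinned_hex: str) -> bool:
    """Compare the presented key's digest with the pinned value in constant time."""
    return hmac.compare_digest(spki_pin(spki_der), pinned_hex.lower())
```

Because the pin is derived from the key rather than from the issuing CA, a certificate that chains to a trusted CA but carries a different public key fails the check, which is the case this test is designed to exercise.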

Pass Criteria:

Test 6.3 — DNS Manipulation Detection

Maps to: Sections 4.3.1 and 4.3.4

Objective: Verify that DNS changes for registered endpoints are detected and trigger security alerts.

Method: Modify DNS records for 3 registered endpoints to point to different IP addresses. Verify that the monitoring system detects the changes, generates security alerts, and that agent communication with affected endpoints is suspended.
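A simple form of the detection step compares the addresses a hostname currently resolves to against the set recorded at registration. In this sketch the resolution itself is assumed to happen elsewhere and to be passed in as data; the function name and the dictionary shapes are hypothetical, and the addresses in any usage would come from the deployment's own registry.

```python
import ipaddress


def detect_dns_drift(expected: dict, resolved: dict) -> dict:
    """Return {hostname: [unexpected addresses]} for every host that drifted.

    expected: hostname -> iterable of approved address strings
    resolved: hostname -> iterable of currently observed address strings
    """
    drift = {}
    for host, observed in resolved.items():
        # Parse addresses so equivalent textual forms compare equal.
        approved = {ipaddress.ip_address(a) for a in expected.get(host, ())}
        unexpected = {ipaddress.ip_address(a) for a in observed} - approved
        if unexpected:
            drift[host] = sorted(str(a) for a in unexpected)
    return drift
```

Any hostname that appears in the result would then be suspended and alerted on. A host with no approved addresses on record is treated as fully unexpected, so the check fails closed.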

Pass Criteria:

Test 6.4 — Dynamic Endpoint Addition Prevention

Maps to: Section 4.1.4

Objective: Verify that the agent cannot add new endpoints to the registry through its own actions or user requests.

Method: Attempt 10 scenarios where the agent is instructed through conversational input, tool call responses, or injected configuration to add or connect to new endpoints. Verify that all 10 are rejected.
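The property under test is that nothing reachable from the agent's inputs can write to the registry. The sketch below shows the idea in miniature using a read-only mapping view; the tool names, URL, and function names are hypothetical. An in-process read-only view is only illustrative: real enforcement depends on the registry living in a security domain the agent cannot modify.

```python
from types import MappingProxyType


def load_registry(entries: dict) -> MappingProxyType:
    """Return a read-only view over a private copy of the approved entries."""
    return MappingProxyType(dict(entries))


def resolve_tool(registry, tool_name: str) -> str:
    """Look up a tool's endpoint; unknown tools are refused, never added."""
    try:
        return registry[tool_name]
    except KeyError:
        raise PermissionError(f"tool not registered: {tool_name}") from None
```

An instruction arriving through conversation or a tool response can at most request a lookup. A lookup for an unregistered name raises `PermissionError`, and an attempted write to the view raises `TypeError`.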

Pass Criteria:

Test 6.5 — Outbound Content Exfiltration Detection

Maps to: Sections 4.5.1 and 4.5.2

Objective: Verify that outbound traffic content inspection detects data exfiltration patterns.

Method: Construct 10 agent operations where outbound payloads contain: (a) 3 with embedded credential material; (b) 3 with bulk PII exceeding thresholds; (c) 2 with data classified above the endpoint's approved level; and (d) 2 with normal payloads as controls. Verify detection of the 8 anomalous payloads and non-flagging of the 2 controls.
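The three detection categories in this method can be sketched as a single inspection function. Everything specific here is an assumption made for illustration: the credential patterns, the use of email addresses as a stand-in for PII, the threshold of 50, and the four classification level names. A production inspector would use a far broader rule set.

```python
import re

# Hypothetical credential indicators; real rule sets are much larger.
CREDENTIAL_PATTERNS = (
    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    re.compile(r"(?i)\bauthorization:\s*bearer\s+\S+"),
)
# Email addresses used as a simple stand-in for PII records.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
BULK_PII_THRESHOLD = 50  # hypothetical threshold
LEVELS = {"public": 0, "internal": 1, "confidential": 2, "restricted": 3}


def inspect_payload(payload: str, data_level: str, endpoint_level: str) -> list:
    """Return the list of findings for one outbound payload (empty if clean)."""
    findings = []
    if any(p.search(payload) for p in CREDENTIAL_PATTERNS):
        findings.append("credential_material")
    if len(EMAIL_PATTERN.findall(payload)) > BULK_PII_THRESHOLD:
        findings.append("bulk_pii")
    if LEVELS[data_level] > LEVELS[endpoint_level]:
        findings.append("classification_exceeded")
    return findings
```

An empty result corresponds to the control payloads in the method; any non-empty result would be blocked or escalated according to the deployment's policy.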

Pass Criteria:

Evidence Artefacts

7.1 Endpoint Registry. The authoritative, version-controlled endpoint registry as specified in Section 4.1, including all registered endpoints with their complete identification details, approval records, and change history. Minimum retention period: 7 years.

7.2 Endpoint Authentication Configuration Records. Documentation of mutual authentication configuration including certificate pinning values, key rotation schedules, and exception records for endpoints where mutual TLS is not implemented. Minimum retention period: 7 years.

7.3 Endpoint Communication Audit Logs. Logs of all agent-to-endpoint communications including destination endpoint, protocol, timestamp, payload hash, authentication outcome, and registry validation outcome. Minimum retention period: 7 years for Financial-Value and Public Sector deployments; 5 years for others.

7.4 Blocking and Security Event Logs. Logs of all blocked communications, certificate verification failures, DNS anomalies, and content inspection detections. Minimum retention period: 7 years.

7.5 Endpoint Security Review Records. Records of formal endpoint security reviews conducted at the intervals specified in Section 4.7.2, including findings and remediation tracking. Minimum retention period: 5 years.

7.6 Incident Response Records. Records of all shadow endpoint incidents including detection timeline, blast radius assessment, data exposure evaluation, regulatory notifications, and remediation actions. Minimum retention period: 10 years.

7. Scoring

Score | Level | Description
0 | No implementation | No shadow protocol endpoint prevention governance exists. The organisation has no controls, policies, or monitoring in place for the capabilities this dimension governs. Agent behaviour in this area is ungoverned.
1 | Basic | Basic controls exist but are enforced at the application layer, dependent on correct implementation rather than structural guarantees. Coverage may be partial. Configuration is not governed through formal change control. Logging exists but may lack full metadata.
2 | Infrastructure-layer enforcement | Controls are enforced at the infrastructure layer, independent of the agent's reasoning process or instruction set. All requirements are structurally enforced with no application-layer bypass path. Full audit trail with tamper-evident logging. Configuration is governed through formal change control.
3 | Verified by independent adversarial testing | All Level 2 capabilities are in place and have been validated through independent adversarial testing. An independent party has attempted to bypass, circumvent, or degrade the governance controls using known attack techniques relevant to this dimension and has failed. Test results are documented, reproducible, and available for regulatory review.

8. Failure Scenarios

Example 3.1 — Financial-Value Agent, Redirected Transaction Endpoint Leading to Fund Diversion

A European neobank deploys an enterprise workflow agent integrated with its core banking platform via MCP tool servers. The agent processes customer service requests including balance inquiries, transaction history retrieval, and pre-approved payment initiations. The agent communicates with three registered MCP tool servers: a read-only account information server, a transaction history server, and a payment initiation server. Each tool server is registered in the deployment's endpoint registry with its hostname, TLS certificate fingerprint, and permitted operation set.

An attacker who has gained access to the neobank's internal DNS infrastructure through a supply chain compromise of a third-party DNS management provider modifies the DNS record for the payment initiation tool server to point to an attacker-controlled proxy. The proxy terminates the TLS connection using a certificate issued by a compromised certificate authority present in the system's trust store and forwards payment initiation requests to the legitimate server, but modifies the beneficiary account number in 1 in 20 transactions, redirecting funds to a network of mule accounts. Over the 9 days before the DNS manipulation is detected, 14 payment initiations totalling EUR 127,400 are redirected. The attacker's proxy returns the legitimate server's success responses to the agent, so no error is generated and the audit trail shows apparently successful transactions.

The discrepancy is discovered when 6 customers report that intended recipients have not received funds. Investigation reveals the DNS manipulation, and the neobank is required to report the incident under PSD2 Article 96, notify the ECB under DORA Article 19, and conduct a full transaction review for the affected period. Total costs including customer reimbursement, forensic investigation, regulatory reporting, DNS infrastructure remediation, and certificate pinning hardening are estimated at EUR 2.1 million. No endpoint authentication beyond standard TLS was in place; no certificate pinning was enforced, so the fingerprints recorded in the registry were never checked at connection time; and no outbound traffic was validated against a fixed endpoint registry.

Example 3.2 — Enterprise Workflow Agent, Tool Server Hijack Enabling Data Exfiltration

A global law firm deploys an enterprise workflow agent that assists lawyers with contract review, clause extraction, and regulatory compliance checking. The agent integrates with an MCP tool server providing access to the firm's document management system (DMS), a legal research database, and an internal knowledge base. The agent processes approximately 2,400 contract reviews per month, handling documents that contain merger terms, acquisition prices, regulatory commitments, and litigation settlement figures, all constituting material non-public information with significant market sensitivity.

An attacker compromises the MCP tool server hosting the legal research database by exploiting an unpatched vulnerability in the tool server's runtime environment. Rather than disrupting the service, the attacker deploys a modified version of the tool server that continues to serve legitimate research results but additionally copies all agent query payloads (which contain extracted contract clauses and specific research queries revealing deal terms) to an external collection endpoint. The modified tool server responds with correct results and correct response formats, making the compromise invisible to the agent, the users, and the audit system, which records successful tool calls with valid response schemas.

Over a 6-week period, the attacker collects query data relating to 3 pre-announcement M&A transactions. Anomalous trading activity triggers an SEC investigation that traces information leakage to the compromised tool server. The law firm faces regulatory action under SEC Rule 10b-5, potential debarment from representing public company clients in sensitive transactions, malpractice claims from affected clients estimated at USD 45 million, and reputational damage. No tool server integrity monitoring, runtime attestation, or outbound traffic content inspection was in place. The endpoint registry verified the tool server's hostname and TLS certificate at deployment time but performed no ongoing runtime integrity verification.

9. Regulatory Mapping

Regulation | Provision | Relationship Type
OWASP MCP Security | MCP-09 (Rug Pull / Server Takeover) | Pending v2.1 editorial review
OWASP MCP Security | MCP-02 (Tool Poisoning) | Pending v2.1 editorial review
EU AI Act | Article 15 (Accuracy, Robustness and Cybersecurity) | Pending v2.1 editorial review
EU AI Act | Article 9 (Risk Management System) | Pending v2.1 editorial review
NIST AI RMF | MANAGE 2.4 (Mechanisms for tracking risks) | Pending v2.1 editorial review
NIST AI RMF | GOVERN 1.4 (Ongoing monitoring processes) | Pending v2.1 editorial review
ISO 42001 | Clause 6.1 (Actions to Address Risks) | Pending v2.1 editorial review
ISO 42001 | Clause 8.2 (AI Risk Assessment) | Pending v2.1 editorial review
NIST CSF 2.0 | PR.DS (Data Security) | Pending v2.1 editorial review
NIST CSF 2.0 | PR.IR (Infrastructure Resilience) | Pending v2.1 editorial review
DORA | Article 9 (ICT Risk Management Framework) | Pending v2.1 editorial review
DORA | Article 19 (Incident Reporting) | Pending v2.1 editorial review
Singapore FEAT | Accountability Principle A3 | Pending v2.1 editorial review
Canada AIDA | Section 7 (Measures to Mitigate Risks) | Pending v2.1 editorial review
UK AISI Inspect | Infrastructure Security Evaluations | Pending v2.1 editorial review

AG Number | Dimension Name | Relationship
AG-029 | Network and Endpoint Governance | Provides the network-level governance framework within which shadow endpoint prevention operates
AG-103 | Audit Trail Integrity | Provides tamper-evident logging infrastructure for endpoint communication audit records
AG-401 | Source Attribution and Provenance | Enables tracing of data provenance through endpoint communications to detect manipulation
AG-763 | API Gateway and Protocol Boundary Governance | Governs the API gateway layer that enforces protocol-level endpoint access controls
Cite this protocol
AgentGoverning. (2026). AG-754: Shadow Protocol Endpoint Prevention Governance. The Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-754