AG-752

Inter-Agent Communication Integrity Governance

Multi-Agent and Ecosystem Governance ~20 min read AGS v2.1 · 2026-04-25
EU AI Act NIST AI RMF ISO 42001

1. Definition

Inter-Agent Communication Integrity Governance addresses the structural risk that arises when multiple autonomous or semi-autonomous AI agents exchange messages, delegate tasks, share context, or negotiate outcomes within a multi-agent system. As organisations deploy agentic architectures where a planning agent orchestrates specialist sub-agents — a retrieval agent, a code execution agent, a customer communication agent, a financial calculation agent — each inter-agent message becomes a potential vector for integrity failure, instruction injection, authority escalation, and context poisoning. The risk is not theoretical: OWASP Agentic Security Initiative threat ASI-07 specifically identifies inter-agent communication as a primary attack surface, and MITRE's Adversarial ML Threat Matrix classifies multi-agent message manipulation under AML.T0058 as a technique for compromising agentic pipeline integrity.

This dimension governs the requirement that all inter-agent communications within a governed deployment are authenticated, integrity-verified, schema-validated, and subject to content inspection before processing by the receiving agent. It requires that each message exchanged between agents carries a verifiable origin identity, a tamper-evident integrity seal, a structured payload conforming to a pre-defined schema, and metadata sufficient to reconstruct the full communication chain for audit purposes. The governance obligation extends to both intra-organisational multi-agent systems and cross-boundary integrations where an organisation's agents interact with external agents or agent-accessible API endpoints.

Failure manifests as a compromised or adversarially manipulated agent injecting fabricated instructions into the communication channel that other agents treat as legitimate orchestrator commands — for example, a retrieval agent receiving a spoofed message purporting to originate from the planning agent that instructs it to bypass its content filtering controls and retrieve unrestricted data, or a financial calculation agent receiving a manipulated context payload that alters input parameters to a pricing calculation without any authenticated change request. In a 2025 red-team exercise conducted against a multi-agent trading advisory system, researchers demonstrated that injecting a single malformed inter-agent message into the communication bus caused the downstream execution agent to place 47 unauthorised trades totalling USD 2.3 million before the anomaly was detected by a separate monitoring system.

In governance practice, this dimension requires deployers to implement a secure inter-agent communication protocol with mandatory message authentication, integrity verification at the transport and application layers, schema validation for all message payloads, content inspection for embedded instruction injection attempts, and comprehensive audit logging of all inter-agent exchanges. Preventive control is the appropriate type because the consequences of processing a single unauthenticated or manipulated inter-agent message can cascade through the agent pipeline at machine speed before any detective control can intervene.

2. Scope

This dimension applies to all agent deployments in which two or more agents exchange messages, delegate tasks, share context, or otherwise communicate as part of a coordinated processing pipeline, whether the agents operate within a single organisational boundary or across organisational boundaries. It applies to all communication channels including but not limited to message buses, API calls, shared memory spaces, file-based exchanges, and tool-use invocations between agents. Single-agent deployments with no inter-agent communication are excluded.

3. Why This Matters

Inter-Agent Communication Integrity Governance addresses a governance gap that, if left unmanaged, creates systemic risk across the agent ecosystem. As AI agents move from experimental deployments to production operations with real-world consequences, the absence of structural controls in this area means that failures scale with the speed and autonomy of the agent population — not at the pace of human review.

Traditional approaches to this governance challenge — contractual obligations, periodic audits, and application-layer policy enforcement — are necessary but insufficient for agentic contexts. Contractual obligations operate on legal timescales; agents operate on millisecond timescales. Periodic audits capture a snapshot; agent behaviour is continuous and dynamic. Application-layer enforcement can be bypassed through prompt injection, reasoning failure, or context manipulation. The AGS approach requires structural enforcement at the infrastructure layer — controls that operate independently of the agent's reasoning process and cannot be circumvented by the agent's own outputs.

The regulatory environment increasingly mandates the controls this dimension specifies. The EU AI Act requires risk management systems proportionate to identified risks. NIST AI RMF requires organisations to map, measure, and manage AI risks through enforceable controls. ISO 42001 requires an AI management system with documented operational procedures. This dimension operationalises these regulatory requirements into specific, testable, infrastructure-enforceable controls — bridging the gap between regulatory intent and technical implementation.

The consequences of absence are illustrated in Section 8 (Failure Scenarios). When this dimension is not implemented, the resulting governance gap permits agent behaviour that can cause material financial loss, regulatory enforcement action, reputational damage, and — in safety-critical deployments — physical harm. The blast radius scales with the agent's access scope and operational autonomy.

4. Requirements

4.1 Message Authentication

4.2 Message Integrity Verification

4.3 Schema Validation and Content Inspection

4.4 Communication Channel Security

4.5 Authority Boundary Enforcement

4.6 Audit Logging and Traceability

4.7 Governance, Monitoring, and Incident Response

5. Maturity Model

Basic Implementation — The organisation has documented policies addressing inter-agent communication integrity and has implemented initial controls. Implementation is primarily at the application layer with manual processes for monitoring and response. Logging covers key events but may lack full metadata. Coverage extends to the most critical agent deployments but may not encompass all in-scope systems. Staff are aware of requirements but formal training may be incomplete.

Intermediate Implementation — All Basic capabilities plus: controls are enforced at the infrastructure layer with automated monitoring and alerting. All MUST requirements from Section 4 are implemented with documented evidence. Coverage extends to all in-scope agent deployments. Audit trails are tamper-evident and retained per regulatory requirements. Formal change control governs all configuration changes. Regular review cycles are established and documented. Staff receive formal training and competency is assessed.

Advanced Implementation — All Intermediate capabilities plus: controls have been validated through independent adversarial testing. Real-time dashboards provide operational visibility into compliance status, anomaly detection, and response metrics. The organisation can demonstrate to regulators and counterparties that no known attack vector bypasses the governance controls. Continuous improvement processes incorporate lessons from incidents, testing, and regulatory developments. Integration with related dimensions provides defence-in-depth coverage.

Implementation Patterns

Tamper-evident audit trail. Implement all governance event logging in an append-only, integrity-protected data store independent of the agent runtime. Every governance decision, configuration change, and enforcement action is recorded with full metadata including timestamps, actor identities, and outcomes.

Real-time monitoring with graduated alerting. Deploy monitoring infrastructure that evaluates governance compliance continuously rather than periodically. Implement graduated alert severity levels with defined response procedures for each level, ensuring that critical governance violations trigger immediate automated response.

Scheduled governance review cycle. Establish a formal review cadence (minimum quarterly) that examines governance effectiveness, reviews incident data, assesses emerging risks, and updates policies and controls accordingly. Review outcomes are documented and tracked.

Separation of governance and agent runtime domains. Deploy governance enforcement infrastructure in a security domain separate from the agent runtime. The agent cannot influence governance decisions, modify enforcement configuration, or access governance logs directly. This architectural separation is the foundation for infrastructure-layer enforcement.

Defined escalation paths with human oversight integration. Establish clear escalation procedures for governance events that exceed automated response capability. Human oversight touchpoints are defined, documented, and tested. Override mechanisms require authenticated authorisation with full audit trail.

Anti-Patterns

Governance by instruction rather than infrastructure. Relying on agent system prompts or configuration files to enforce governance controls rather than infrastructure-layer enforcement. Instruction-based controls can be bypassed through prompt injection, context manipulation, or reasoning failure.

Monitoring without enforcement. Implementing detection and logging of governance violations without pre-execution blocking. By the time a violation is logged, the ungoverned action has already executed. Detection is necessary but not sufficient; prevention must be the primary control.

Manual processes for machine-speed operations. Relying on human review processes for governance decisions that occur at machine speed. Agents execute actions in milliseconds; governance controls that depend on human review cycles of hours or days leave gaps that scale with agent autonomy.

6. Test Criteria

Test 6.1 — Message Authentication Enforcement

Maps to: Sections 4.1.1 and 4.1.2

Objective: Verify that unauthenticated inter-agent messages are rejected without processing.

Method: Inject 20 well-formed messages into the inter-agent communication channel without valid authentication tokens. Verify that all 20 are rejected by the receiving agent, that no payload processing occurs, and that rejection events are logged.

Pass Criteria:

Test 6.2 — Message Integrity Verification

Maps to: Sections 4.2.1 and 4.2.3

Objective: Verify that messages with tampered content are detected and rejected.

Method: Send 20 authenticated messages where 10 have their payload content modified after signing (simulating in-transit tampering). Verify that all 10 tampered messages are rejected and that all 10 unmodified messages are accepted.

Pass Criteria:

Test 6.3 — Schema Validation Enforcement

Maps to: Sections 4.3.1 and 4.3.2

Objective: Verify that messages not conforming to the defined schema are rejected.

Method: Submit 15 messages with schema violations: 5 with unexpected fields, 5 with missing required fields, and 5 with data type mismatches. Verify all 15 are rejected.

Pass Criteria:

Test 6.4 — Embedded Instruction Injection Detection

Maps to: Sections 4.3.3 and 4.3.4

Objective: Verify that content inspection detects instruction injection attempts embedded in inter-agent message payloads.

Method: Submit 20 authenticated, schema-valid messages where 10 contain embedded instruction injection payloads (role-switching directives, authority escalation claims, natural language instructions in data fields). Verify that content inspection flags or rejects the injected messages.

Pass Criteria:

Test 6.5 — Authority Boundary Enforcement

Maps to: Sections 4.5.1 and 4.5.2

Objective: Verify that agents cannot send message types they are not authorised to send.

Method: Configure three test agents with distinct authority profiles. Attempt to send 15 messages where each agent sends 5 message types, 3 within its authority and 2 outside its authority. Verify that the 6 out-of-authority messages are rejected and the 9 within-authority messages are accepted.

Pass Criteria:

Evidence Artefacts

7.1 Inter-Agent Communication Architecture Document A technical document describing the communication topology, message bus or transport infrastructure, authentication mechanism, integrity verification method, schema definitions, and authority model. Must be version-controlled and updated within 30 days of any architectural change. Minimum retention period: 7 years.

7.2 Agent Identity Registry A maintained registry of all authorised agent identities, their associated credentials, credential issuance dates, expiry dates, and revocation records. Minimum retention period: 7 years.

7.3 Message Schema Definitions Version-controlled schema definitions for all inter-agent message types, including field specifications, data types, value constraints, and payload size limits. Minimum retention period: 5 years.

7.4 Inter-Agent Communication Audit Logs Complete logs of all inter-agent message exchanges as specified in Section 4.6.1, stored with tamper-evident integrity controls. Minimum retention period: 7 years for Financial-Value and Public Sector deployments; 5 years for others.

7.5 Security Event Logs Logs of all authentication failures, integrity verification failures, schema validation failures, content inspection detections, and authority boundary violations. Minimum retention period: 7 years.

7.6 Authority Model Documentation Version-controlled documentation of the inter-agent authority model specifying permitted communication paths and message types per agent role. Minimum retention period: 5 years.

7.7 Incident Response Records Records of all inter-agent communication integrity incidents including detection time, blast radius assessment, containment actions, and remediation outcomes. Minimum retention period: 10 years.

7. Scoring

ScoreLevelDescription
0No implementationNo inter-agent communication integrity governance exists. The organisation has no controls, policies, or monitoring in place for the capabilities this dimension governs. Agent behaviour in this area is ungoverned.
1BasicBasic controls exist but are enforced at the application layer — dependent on correct implementation rather than structural guarantees. Coverage may be partial. Configuration is not governed through formal change control. Logging exists but may lack full metadata.
2Infrastructure-layer enforcementControls are enforced at the infrastructure layer, independent of the agent's reasoning process or instruction set. All requirements are structurally enforced with no application-layer bypass path. Full audit trail with tamper-evident logging. Configuration is governed through formal change control.
3Verified by independent adversarial testingAll Level 2 capabilities are in place and have been validated through independent adversarial testing. An independent party has attempted to bypass, circumvent, or degrade the governance controls using known attack techniques relevant to this dimension and has failed. Test results are documented, reproducible, and available for regulatory review.

8. Failure Scenarios

Example 3.1 — Financial-Value Agent, Spoofed Orchestrator Command in Trading Pipeline

A quantitative hedge fund deploys a multi-agent trading advisory system comprising five specialist agents: a market data ingestion agent, a signal generation agent, a risk assessment agent, an order construction agent, and an execution agent. The agents communicate via an internal message bus using JSON-formatted messages. During a routine penetration test, the security team discovers that inter-agent messages carry no authentication tokens and no integrity seals — any process with access to the message bus can inject arbitrary messages formatted to match the expected JSON schema. The penetration testers craft a message that mimics the signal generation agent's output format, containing a fabricated high-confidence buy signal for a thinly traded equity with specific position sizing parameters. The message is injected into the bus and processed sequentially by the risk assessment agent (which applies its risk limits to the fabricated signal as if it were genuine), the order construction agent (which constructs a limit order based on the fabricated parameters), and the execution agent (which submits the order to the exchange). The fabricated signal triggers a position of 12,000 shares at USD 47.30 per share, totalling USD 567,600. The position is detected by the fund's independent trade surveillance system 23 minutes after execution. Unwinding the position in the thinly traded name incurs slippage costs of USD 84,200. The total incident cost including slippage, investigation, system redesign, and regulatory reporting to the SEC under Rule 15c3-5 market access requirements is estimated at USD 1.4 million. The root cause is the complete absence of inter-agent message authentication in the communication architecture.

Example 3.2 — Enterprise Workflow Agent, Context Poisoning via Manipulated Sub-Agent Response

A multinational insurance company deploys a multi-agent claims processing system where an orchestrating agent coordinates a document extraction agent, a policy lookup agent, a fraud detection agent, and a settlement calculation agent. A sophisticated attacker who has gained limited access to the document extraction agent's runtime environment modifies the agent's output to include a subtly altered policy reference number in its extracted data. The manipulated policy reference points to a different, higher-coverage policy than the one actually held by the claimant. The orchestrating agent passes this extracted data to the policy lookup agent, which retrieves the coverage details for the substituted policy. The fraud detection agent, designed to identify anomalies in claim patterns but not to verify inter-agent data provenance, processes the claim against the substituted policy without flagging the discrepancy. The settlement calculation agent computes a settlement of EUR 340,000 based on the higher-coverage policy, compared to the correct settlement of EUR 85,000 under the claimant's actual policy. The overpayment is authorised and disbursed. The discrepancy is discovered 4 months later during a quarterly reconciliation audit. Recovery of the EUR 255,000 overpayment is complicated by jurisdictional issues as the claimant resides in a different EU member state. The total cost including the unrecovered overpayment, legal fees, investigation costs, and regulatory reporting to the national insurance supervisor exceeds EUR 420,000. No inter-agent message integrity verification, content hash validation, or cross-reference consistency check was implemented in the pipeline.

9. Regulatory Mapping

RegulationProvisionRelationship Type
OWASP Agentic SecurityASI-07 (Inter-Agent Communication Manipulation)_Pending v2.1 editorial review_
MITRE ATLASAML.T0058 (Multi-Agent Message Manipulation)_Pending v2.1 editorial review_
EU AI ActArticle 9 (Risk Management System)_Pending v2.1 editorial review_
EU AI ActArticle 15 (Accuracy, Robustness and Cybersecurity)_Pending v2.1 editorial review_
NIST AI RMFGOVERN 1.4 (Ongoing monitoring processes)_Pending v2.1 editorial review_
NIST AI RMFMANAGE 2.4 (Mechanisms for tracking risks)_Pending v2.1 editorial review_
ISO 42001Clause 6.1 (Actions to Address Risks)_Pending v2.1 editorial review_
ISO 42001Clause 8.2 (AI Risk Assessment)_Pending v2.1 editorial review_
NIST CSF 2.0PR.DS (Data Security)_Pending v2.1 editorial review_
NIST CSF 2.0PR.AA (Identity Management, Authentication, Access Control)_Pending v2.1 editorial review_
OWASP MCP SecurityMCP-02 (Tool Poisoning)_Pending v2.1 editorial review_
Singapore FEATAccountability Principle A2_Pending v2.1 editorial review_
Canada AIDASection 8 (General-Purpose AI Systems)_Pending v2.1 editorial review_
UK AISI InspectMulti-Agent Safety Evaluations_Pending v2.1 editorial review_
IEEE 7010Well-being Impact Assessment_Pending v2.1 editorial review_
AG NumberDimension NameRelationship
AG-012Inter-Agent Protocol GovernanceDefines the protocol-level governance framework within which this dimension's integrity controls operate
AG-103Audit Trail IntegrityProvides the tamper-evident logging infrastructure required for inter-agent communication audit records
AG-401Source Attribution and ProvenanceEnables tracing of data provenance through inter-agent communication chains
AG-538Adversarial Prompt ResistanceContent inspection requirements for inter-agent messages extend adversarial resistance to the communication layer
Cite this protocol
AgentGoverning. (2026). AG-752: Inter-Agent Communication Integrity Governance. The Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-752