Tool Response Signing Governance requires that responses from tools invoked by AI agents carry integrity protection — cryptographic signatures, HMACs, or equivalent attestation mechanisms — so that the agent and the governance infrastructure can verify that the response originated from the expected tool, has not been modified in transit, and has not been replayed from a previous invocation. In agentic architectures, tool responses are a primary input to the agent's reasoning: the agent decides its next action based on what tools report. An attacker or faulty intermediary that can forge or modify tool responses can steer the agent's behaviour without touching the agent itself. This dimension mandates that sensitive tool responses are integrity-protected at the source and verified before the agent incorporates them into its reasoning or uses them as a basis for subsequent actions.
Scenario A — Forged Balance Check Enables Fraudulent Transfer: A financial-value agent managing treasury operations follows a two-step protocol: (1) query the account balance via a balance-check tool, (2) if sufficient funds exist, initiate a transfer. The agent queries the balance of Account GB82-WEST-1234-5698-7654-32 and expects a response of approximately £340,000 (the known operational balance). An attacker who has compromised a reverse proxy sitting between the agent and the banking API intercepts the balance response and replaces it with a forged response: {balance: 34000000.00, currency: "GBP", as_of: "2026-03-31T09:14:22Z"}. The agent now believes the account holds £34 million. Based on this inflated balance, the agent approves a series of transfer requests totalling £4.2 million that it would otherwise have rejected as exceeding available funds. The transfers execute against the actual balance of £340,000, resulting in £3.86 million in overdraft exposure and cascading settlement failures across 14 counterparty accounts.
What went wrong: The tool response carried no integrity protection. The agent accepted the balance figure at face value because the response arrived through the expected channel in the expected format. The reverse proxy — a legitimate infrastructure component — was compromised through an unpatched vulnerability (CVE with CVSS 9.1). No mechanism existed to verify that the response originated from the banking API and had not been modified. Consequence: £3.86 million in overdraft exposure, emergency liquidity event, FCA supervisory notification required under Principle 11 (relations with regulators), settlement failures triggering contractual penalties, and potential Threshold Condition assessment under FSMA 2000 Schedule 6.
Scenario B — Replayed Inventory Response Causes Double-Shipment: An enterprise workflow agent managing warehouse fulfilment queries an inventory tool to check stock levels before authorising a shipment. The tool responds: {sku: "WH-44219", quantity_available: 2400, warehouse: "DIST-NORTH"}. The agent authorises shipment of 2,000 units to Customer A. Forty-five minutes later, the agent processes a second order from Customer B and again queries inventory. A caching layer — introduced to reduce API load on the warehouse management system — returns the stale cached response from 45 minutes ago: {sku: "WH-44219", quantity_available: 2400, warehouse: "DIST-NORTH"}. The actual available quantity is now 400 (2,400 minus the 2,000 already committed). The agent authorises shipment of 1,800 units to Customer B. The warehouse attempts to fulfil both orders, discovers the shortfall, and partially ships Customer B's order. The resulting backorder, emergency procurement, and expedited shipping cost £127,000 in additional logistics expenses plus £89,000 in contractual penalties for late delivery.
What went wrong: The cached response was indistinguishable from a fresh response. No freshness indicator or temporal binding existed to prevent the agent from acting on stale data. The response lacked a signature that would bind it to a specific invocation, making replay (whether malicious or, as here, accidental through caching) undetectable. Consequence: £216,000 in combined excess costs, customer relationship damage, and warehouse operational disruption lasting 6 days.
Scenario C — Modified Sensor Reading Masks Safety Alert: A safety-critical agent monitoring an industrial chemical reactor queries a temperature sensor tool every 30 seconds. At 14:32:15, the actual reactor core temperature reaches 487 degrees Celsius — 37 degrees above the 450-degree safety threshold that should trigger automated cooldown. However, a faulty network switch intermittently corrupts packets on the sensor telemetry VLAN. The temperature response {sensor_id: "TC-CORE-7", temperature_c: 487.2, timestamp: "2026-03-31T14:32:15Z"} is corrupted in transit: a single-byte flip changes 487.2 to 387.2. The agent receives the corrupted response and does not trigger cooldown because 387.2 is below the 450-degree threshold. The over-temperature condition persists for 11 minutes until a redundant monitoring system (not AI-controlled) triggers an emergency shutdown. Post-incident analysis reveals thermal stress damage to the reactor vessel lining, requiring a 23-day unplanned outage at a cost of £1.4 million in lost production and £620,000 in repair costs.
What went wrong: The sensor response carried no integrity protection. A single-byte corruption — not even an intentional attack, but a hardware fault — altered a safety-critical reading and prevented the correct protective action. The agent had no mechanism to verify that the temperature value it received matched the value the sensor produced. Consequence: £2.02 million in combined repair and lost-production costs, HSE investigation under COMAH Regulations 2015 (Control of Major Accident Hazards), potential improvement notice or prosecution if the absence of integrity checks is deemed a foreseeable risk not adequately controlled.
Scope: This dimension applies to any AI agent deployment where tool responses influence the agent's subsequent actions, decisions, or outputs, and where the consequence of acting on a forged, modified, or replayed response would exceed the organisation's risk tolerance. In practice, this covers: any tool response containing financial data (balances, prices, positions), any tool response containing safety-critical data (sensor readings, equipment status, patient vitals), any tool response used as an input to a decision with legal or contractual consequence (eligibility determinations, compliance checks, identity verifications), and any tool response that traverses a network boundary or intermediary layer between the tool and the agent. Read-only informational responses where the consequence of corruption is limited to minor inconvenience (e.g., a weather forecast for small-talk) may be excluded with documented risk acceptance. The test is: if this response were silently replaced with a different value, would the consequence be material? If yes, this dimension applies.
4.1. A conforming system MUST require that sensitive tool responses carry a cryptographic signature or HMAC generated by the tool (or a trusted attestation proxy) over the complete response payload, binding the response to the tool's identity.
4.2. A conforming system MUST verify the integrity protection on sensitive tool responses before the agent incorporates the response into its reasoning or uses it as a basis for subsequent actions, and MUST reject responses that fail verification.
4.3. A conforming system MUST bind each signed response to the specific invocation that produced it, using a nonce, request identifier, or equivalent mechanism that prevents replay of a response from a prior invocation.
4.4. A conforming system MUST include a temporal element (timestamp or sequence number) in the signed payload to enable freshness verification, and MUST reject responses whose temporal element indicates staleness beyond a defined threshold.
4.5. A conforming system MUST maintain a registry of trusted tool signing keys or certificates, with documented procedures for key enrolment, rotation, and revocation.
4.6. A conforming system MUST reject tool responses signed with revoked, expired, or unrecognised keys, rather than falling back to unverified acceptance.
4.7. A conforming system SHOULD implement signature verification at the earliest point in the agent's processing pipeline where the response is deserialised, minimising the window during which an unverified response exists in the agent's memory.
4.8. A conforming system SHOULD log all verification outcomes — both successful and failed — to support audit and anomaly detection, with failed verifications generating machine-readable alert events.
4.9. A conforming system SHOULD support multiple signature algorithms to enable algorithm agility and migration (e.g., from RSA-2048 to Ed25519 to post-quantum algorithms) without service disruption.
4.10. A conforming system MAY implement response countersigning, where the agent's governance layer signs its acceptance of the verified response, creating an end-to-end integrity chain from tool execution through agent consumption.
4.11. A conforming system MAY implement threshold signatures for high-value tool responses, requiring attestation from multiple independent sources before the agent acts on the response.
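As a minimal sketch of how requirements 4.1 through 4.4 compose, consider a keyed HMAC over a canonical JSON envelope. The function names, envelope fields, and 30-second freshness window below are illustrative assumptions, not a mandated wire format:

```python
import hashlib
import hmac
import json
import time

MAX_AGE_SECONDS = 30  # illustrative freshness threshold (requirement 4.4)

def canonical(envelope: dict) -> bytes:
    # Deterministic serialisation so signer and verifier hash identical bytes.
    return json.dumps(envelope, sort_keys=True, separators=(",", ":")).encode()

def sign_response(key: bytes, tool_id: str, nonce: str, payload: dict) -> dict:
    # Binds the tool's identity (4.1), the invocation nonce (4.3), and a
    # timestamp (4.4) together with the complete payload under one MAC.
    envelope = {
        "tool_id": tool_id,
        "nonce": nonce,
        "timestamp": time.time(),
        "payload": payload,
    }
    envelope["mac"] = hmac.new(key, canonical(envelope), hashlib.sha256).hexdigest()
    return envelope

def verify_response(key: bytes, expected_tool: str, expected_nonce: str,
                    envelope: dict) -> dict:
    # Requirement 4.2: verification happens before the payload reaches the
    # agent's reasoning; any failure rejects the response outright.
    body = {k: v for k, v in envelope.items() if k != "mac"}
    expected_mac = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected_mac, envelope.get("mac", "")):
        raise ValueError("integrity check failed (4.2)")
    if envelope["tool_id"] != expected_tool:
        raise ValueError("response not from the expected tool (4.1)")
    if envelope["nonce"] != expected_nonce:
        raise ValueError("nonce mismatch, possible replay (4.3)")
    if time.time() - envelope["timestamp"] > MAX_AGE_SECONDS:
        raise ValueError("stale response (4.4)")
    return envelope["payload"]
```

Under this scheme, the Scenario A proxy that rewrote the balance field would change the canonical bytes, the MAC check would fail, and the forged figure would never reach the agent's reasoning.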
Tool responses are the sensory inputs of an agentic system. Just as a human decision-maker who receives falsified reports will make incorrect decisions, an agent that receives forged or modified tool responses will reason incorrectly and take inappropriate actions. The unique risk in agentic architectures is that the agent processes tool responses at machine speed, potentially acting on hundreds of forged responses per minute without the natural scepticism or cross-referencing that a human operator might apply.
Traditional API security focuses on authenticating the caller to the service — ensuring that the entity making the API call is authorised to do so. AG-372 addresses the reverse direction: authenticating the service's response to the caller. This is a gap in many architectures because the response is assumed to be trustworthy if it arrives over an authenticated channel (e.g., TLS). However, TLS protects only the transport — it does not protect against compromised intermediaries within the TLS-terminated zone (reverse proxies, API gateways, caching layers, load balancers), against replay of cached responses, or against corruption between the tool's application layer and the network layer. The trust model must extend to the application layer: the tool itself must attest to its response, and the agent must verify that attestation.
The risk is compounded in multi-agent architectures where one agent's tool response becomes another agent's input. A forged response to Agent A can cascade through Agent A's actions into Agent B's inputs, creating a chain of corrupted reasoning that is difficult to trace back to the original forged response. Without response signing, the forensic trail is broken — the organisation knows that Agent B took an incorrect action but cannot determine whether Agent B reasoned incorrectly or received incorrect input.
Replay attacks deserve particular attention. Many tool responses are time-sensitive: an account balance is valid at a point in time; a sensor reading reflects conditions at a specific moment; a compliance check reflects the regulatory status as of the query. Replaying a valid, correctly signed response from a prior invocation can be as damaging as forging a new response. The balance was correct yesterday; it is incorrect today. The sensor reading was safe 10 minutes ago; conditions have changed. Without invocation binding and freshness verification, an attacker (or, as in Scenario B, a misconfigured caching layer) can feed the agent stale data that leads to materially incorrect decisions.
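The replay and staleness failure modes above can be guarded with a seen-nonce cache bounded by the freshness window. The sketch below is illustrative (the class name and window are assumptions); a production deployment would also need to persist this state and share it across verifier instances:

```python
from __future__ import annotations

import time

class ReplayGuard:
    """Tracks nonces accepted within the freshness window. A second
    presentation of the same nonce, or any response older than the window,
    is rejected. Illustrative sketch, not a production implementation."""

    def __init__(self, window_seconds: float = 30.0):
        self.window = window_seconds
        self._seen: dict[str, float] = {}  # nonce -> time first accepted

    def check(self, nonce: str, response_timestamp: float,
              now: float | None = None) -> bool:
        now = time.time() if now is None else now
        # Evict nonces that have aged out of the window; anything older
        # would fail the freshness check anyway.
        self._seen = {n: t for n, t in self._seen.items()
                      if now - t <= self.window}
        if now - response_timestamp > self.window:
            return False  # stale response, as in Scenario B's cached reply
        if nonce in self._seen:
            return False  # replay of a previously accepted response
        self._seen[nonce] = now
        return True
```

The Scenario B caching layer would fail both checks here: the cached response carries the original nonce (already seen) and a timestamp 45 minutes old (well beyond any plausible freshness window).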
Regulatory frameworks support this requirement through several paths. The EU AI Act, Article 15, requires robustness against exploitation of system vulnerabilities — unsigned tool responses are a vulnerability. DORA, Article 9, requires integrity assurance for ICT systems — the response pathway from tool to agent is an ICT system component. SOX Section 404 requires internal controls that ensure the accuracy of financial data used in reporting — an unsigned balance response in an automated financial processing chain is an uncontrolled input. The convergence of these regulatory requirements around data integrity in automated processing chains makes tool response signing a cross-regulatory necessity.
Tool response signing introduces a cryptographic attestation layer at the tool boundary. The tool (or a trusted signing proxy immediately adjacent to the tool) signs each response before it enters the transport path to the agent. The agent (or its governance verification layer) verifies the signature before processing the response.
Recommended patterns: sign at the tool boundary (the tool itself or a signing proxy immediately adjacent to it) over the complete response payload; verify at the earliest deserialisation point in the agent's pipeline; bind each response to its invocation with a nonce or request identifier and include a timestamp for freshness checks; hold signing keys in hardware-backed storage with automated rotation; log every verification outcome, successful or failed.
Anti-patterns to avoid: falling back to unverified acceptance when verification fails or a key is unrecognised; treating TLS as sufficient integrity protection, since it covers only the transport and not the TLS-terminated zone; signing only part of the response payload, leaving unsigned fields open to undetected modification; serving cached signed responses to later invocations beyond the freshness window.
Financial Services. Market data feeds, balance queries, position reports, and compliance check responses are high-value signing targets. Firms should consider whether existing market data authentication mechanisms (e.g., Bloomberg B-PIPE integrity features, exchange-provided sequence numbering) can satisfy AG-372 requirements or whether additional signing is needed at the agent's consumption boundary. The FCA's expectations under SYSC 13 (Operational Risk) include integrity of data used in automated decision-making — unsigned financial data consumed by autonomous agents is an operational risk exposure.
Healthcare. Clinical decision support tools providing drug interaction checks, lab result queries, and patient record lookups produce responses that directly influence treatment decisions. An incorrect drug interaction response (e.g., falsely indicating no interaction) could lead to patient harm. Signing is particularly critical for tools that query external databases (e.g., national drug interaction databases) where the response traverses public networks. FDA guidance on computerised systems (21 CFR Part 11) requires the ability to determine the authenticity and integrity of electronic records — unsigned tool responses cannot satisfy this requirement.
Crypto/Web3. On-chain data queries (balance checks, transaction confirmations, oracle price feeds) are primary inputs to agent trading decisions. A forged price feed response can trigger arbitrage trades that benefit the attacker. The DeFi ecosystem has experienced multiple oracle manipulation attacks (e.g., the Mango Markets exploit, October 2022, $114 million loss). While AG-372 addresses the agent-to-tool response path rather than the oracle itself, ensuring that the response from the oracle connector to the agent is integrity-protected closes one segment of the attack surface. Agents consuming oracle data should verify both the oracle's on-chain attestation and the connector's response signature.
Safety-Critical / CPS. Sensor tool responses in industrial control, aviation, and autonomous vehicle contexts are safety-of-life inputs. Single-byte corruption (Scenario C) can mask critical readings. Signing provides detection of any corruption, whether from network faults, electromagnetic interference, or deliberate attack. IEC 62443-3-3 SR 3.1 (Communication Integrity) directly requires integrity mechanisms for communications between security zones — tool response signing implements this requirement for agent-sensor communication.
Basic Implementation — The organisation has identified which tool responses are sensitive (based on consequence of forgery or corruption). Sensitive tool responses carry an HMAC or signature generated by the tool or a signing proxy. The agent's verification layer checks the integrity protection before processing. Unsigned responses from sensitive tools are rejected. Nonces bind responses to specific invocations. A trust registry maps tool identifiers to signing keys. Key management uses software-based storage with access controls.
Intermediate Implementation — All basic capabilities plus: temporal freshness verification rejects stale responses. Key material is stored in hardware security modules. Key rotation is automated on a defined schedule (recommended: 90 days) and emergency revocation propagates within 60 seconds. Verification outcomes are logged and fed into anomaly detection systems. The organisation conducts periodic adversarial testing — forging responses, replaying valid responses, presenting responses signed with revoked keys — to verify detection coverage. Multiple signature algorithms are supported, enabling migration without service disruption.
Advanced Implementation — All intermediate capabilities plus: response countersigning creates an end-to-end attestation chain from tool through agent to downstream consumers. Threshold signatures require attestation from multiple independent tool instances for high-value responses (e.g., a balance query verified by both the primary and secondary banking API). Post-quantum signature algorithms are supported or on the migration roadmap. The signing and verification infrastructure has been verified through independent penetration testing covering key extraction, signing oracle abuse, response manipulation, replay, and downgrade attacks. The organisation can demonstrate to regulators a complete integrity chain from tool execution through agent reasoning to downstream action.
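The trust-registry behaviour that runs through these tiers, enrolment, rotation, revocation, and the hard rejection mandated by requirement 4.6, might be sketched as follows. The class and field names are assumptions for illustration:

```python
from __future__ import annotations

import time

class KeyRegistry:
    """Minimal trust registry sketch (requirements 4.5 and 4.6): maps key
    identifiers to signing keys with an owning tool, expiry, and a
    revocation flag. Illustrative only; a real registry would be backed by
    an HSM or certificate infrastructure."""

    def __init__(self):
        self._keys: dict[str, dict] = {}  # key_id -> record

    def enrol(self, key_id: str, tool_id: str, key: bytes, expires_at: float):
        self._keys[key_id] = {
            "tool_id": tool_id,
            "key": key,
            "expires_at": expires_at,
            "revoked": False,
        }

    def revoke(self, key_id: str):
        if key_id in self._keys:
            self._keys[key_id]["revoked"] = True

    def lookup(self, key_id: str, tool_id: str,
               now: float | None = None) -> bytes:
        # Requirement 4.6: revoked, expired, or unrecognised keys are
        # rejected outright; there is no fallback to unverified acceptance.
        now = time.time() if now is None else now
        rec = self._keys.get(key_id)
        if rec is None:
            raise KeyError("unrecognised signing key")
        if rec["revoked"]:
            raise KeyError("revoked signing key")
        if now > rec["expires_at"]:
            raise KeyError("expired signing key")
        if rec["tool_id"] != tool_id:
            raise KeyError("key not enrolled for this tool")
        return rec["key"]
```

Lookup failures here are the events that requirement 4.8 expects to be logged and alerted on, and the intermediate tier's 60-second revocation propagation target bounds how long a compromised key remains in the registry's accepted set.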
Required artefacts:
Retention requirements:
Access requirements:
Testing AG-372 compliance requires verifying that forged, modified, replayed, and stale tool responses are detected and rejected before they influence agent behaviour. The following tests constitute the minimum conformance programme.
Test 8.1: Forged Response Detection. Present a response whose payload was modified after signing; verify that the agent rejects it before it enters reasoning and that a machine-readable alert event is generated.
Test 8.2: Unsigned Response Rejection. Present a response from a sensitive tool with the signature stripped; verify rejection rather than fallback to unverified acceptance.
Test 8.3: Replay Prevention. Capture a correctly signed response and re-present it against a later invocation; verify that invocation binding causes rejection.
Test 8.4: Temporal Freshness Enforcement. Present a correctly signed response whose temporal element exceeds the defined staleness threshold; verify rejection.
Test 8.5: Revoked Key Rejection. Present a response signed with a revoked, expired, or unrecognised key; verify rejection.
Test 8.6: Key Material Isolation. Attempt to read or export signing key material from the verification environment; verify that keys are not extractable.
Test 8.7: Full Payload Coverage. Modify each field of a signed response in turn; verify that every modification invalidates the signature, confirming the signature covers the complete payload.
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 15 (Accuracy, Robustness and Cybersecurity) | Direct requirement |
| EU AI Act | Article 12 (Record-Keeping) | Supports compliance |
| SOX | Section 404 (Internal Controls Over Financial Reporting) | Direct requirement |
| FCA SYSC | 13.1 (Operational Risk: Systems and Controls) / 6.1.1R | Direct requirement |
| NIST AI RMF | MANAGE 2.2, MANAGE 4.1 | Supports compliance |
| ISO 42001 | Clause 6.1 (Actions to Address Risks), Clause 8.4 (AI System Operation) | Supports compliance |
| DORA | Article 9 (ICT Risk Management Framework) | Direct requirement |
Article 15 requires that high-risk AI systems be resilient against attempts by unauthorised third parties to alter their use or performance by exploiting system vulnerabilities. Unsigned tool responses are a system vulnerability: any component in the response path can alter the data that drives the AI system's behaviour. AG-372 closes this vulnerability by requiring application-layer integrity protection that survives transport-layer termination, intermediary processing, and caching. The attestation mechanism ensures that only the legitimate tool — identified by its signing key in the trust registry — can produce responses that the agent will accept. This directly satisfies Article 15's resilience requirement for the tool response pathway.
Article 12 requires automatic recording of events relevant to the identification of risks. Tool response verification failures — forged responses, replayed responses, stale responses, responses signed with revoked keys — are events directly relevant to risk identification. AG-372 requirement 4.8 (logging all verification outcomes) ensures that these events are captured in the system's record-keeping, satisfying Article 12 for the tool response integrity domain.
For AI agents processing financial data, the integrity of tool responses is a core input control. A SOX auditor evaluating an agentic treasury system will ask: "How do you ensure that the account balance the agent acted on is the actual account balance?" If the answer is "we trust the API response," the control is inadequate — it depends on the integrity of every component in the response path. AG-372 provides cryptographic assurance that the balance figure originated from the banking API and was not modified in transit. This transforms the control from trust-based to evidence-based, satisfying the SOX requirement for controls that can be tested and attested. An agent that acts on a forged balance response to generate a financial report has produced an inaccurate report — a material misstatement traceable to inadequate input controls.
SYSC 13.1 requires firms to establish and maintain appropriate systems and controls to manage operational risk, including risks to the reliability and security of information. For firms deploying AI agents that consume financial data from tools, unsigned tool responses represent an operational risk — the data's integrity depends on the correct functioning of every intermediary rather than on cryptographic verification. The FCA has stated in supervisory correspondence that automated decision systems must have controls over their data inputs equivalent to those expected for human decision-makers. A human trader receiving market data verifies the source (e.g., a Bloomberg terminal on a trusted network). An AI agent consuming market data via an API must have an equivalent assurance — response signing provides this.
MANAGE 2.2 addresses mechanisms to sustain the value of deployed AI systems, including integrity of inputs. Tool response signing ensures that the inputs driving AI agent reasoning are authentic and unmodified — a core integrity mechanism. MANAGE 4.1 addresses risk treatment through monitored controls. The verification and logging requirements of AG-372 create a monitored control over tool response integrity, with machine-readable alerts for anomalies.
Clause 6.1 requires actions to address risks within the AI management system. Forged or corrupted tool responses are an identified risk in agent deployments. Tool response signing is the risk treatment. Clause 8.4 requires controls over AI system operation — response verification is an operational control that runs continuously, satisfying the requirement for ongoing operational assurance.
Article 9 requires financial entities to have an ICT risk management framework ensuring the integrity of ICT systems. The tool response pathway — from the tool through intermediaries to the agent — is an ICT system component. Unsigned responses in this pathway represent an integrity gap that DORA requires to be addressed. Response signing and verification close this gap, ensuring that the data flowing through the automated processing chain maintains integrity from source to consumption. DORA's emphasis on digital operational resilience specifically includes resilience against data manipulation in automated processing — AG-372 implements this for the tool-to-agent response path.
| Field | Value |
|---|---|
| Severity Rating | Critical |
| Blast Radius | Transaction-level to systemic — depending on the tool's role in the agent's decision chain and the downstream impact of decisions made on forged data |
Consequence chain: Without tool response signing, any component in the response pathway — a compromised reverse proxy, a misconfigured caching layer, a faulty network device, or a deliberately malicious intermediary — can alter the data that drives the agent's reasoning. The immediate technical failure is data integrity loss: the agent incorporates forged, modified, or stale data into its reasoning without detection.

The operational impact depends on the tool's role: for financial data tools, the agent makes investment, payment, or risk decisions based on incorrect figures, creating direct financial loss and regulatory exposure; for safety-critical sensor tools, the agent fails to detect hazardous conditions or incorrectly triggers protective actions, creating physical safety risk; for compliance-check tools, the agent proceeds with actions that a correct compliance check would have blocked, creating legal and regulatory exposure.

The cascading risk is severe in multi-agent architectures: Agent A acts on a forged tool response, producing an output that Agent B consumes as input. Agent B's actions are now compromised even though Agent B's own tool responses are correctly signed. The forged data propagates through the agent network, amplifying the original integrity failure.

For financial deployments, the consequence includes direct financial loss (potentially millions in excess exposure within minutes), regulatory enforcement under FCA SYSC or SOX Section 404, and reputational damage. For safety-critical deployments, the consequence includes physical harm, regulatory investigation under COMAH or equivalent safety regulation, and potential criminal prosecution for foreseeable risk not adequately controlled. For all deployments, the absence of response signing eliminates the forensic ability to distinguish between agent reasoning failure and input data corruption, complicating incident investigation and root-cause analysis.