AG-427

Mutual Aid and Vendor Coordination Governance

Incident Response, Recovery & Resilience · ~20 min read · AGS v2.1 · April 2026
EU AI Act · SOX · FCA · NIST · ISO 42001

2. Summary

Mutual Aid and Vendor Coordination Governance requires that organisations operating AI agents establish, maintain, and regularly test formal agreements governing how external suppliers, internal service owners, and partner organisations will coordinate during incidents that affect agent operations. Modern agent architectures depend on chains of external providers — model-hosting services, embedding providers, retrieval-augmented-generation data sources, identity brokers, observability platforms, payment processors — and a single-organisation incident response plan is insufficient when the failure originates outside organisational boundaries. This dimension mandates pre-negotiated coordination protocols, mutual aid agreements, joint escalation paths, and shared runbook inventories that ensure every party in the agent's dependency chain knows its role before an incident occurs, rather than improvising coordination under pressure.

3. Example

Scenario A — Uncoordinated Model Provider Outage Cascades Across Three Organisations: A financial-value agent performing automated trade reconciliation relies on a hosted inference endpoint provided by Vendor X. Vendor X suffers a regional data-centre failure at 14:22 UTC on a business day. The organisation's incident response team detects the failure within 8 minutes through latency monitoring, but has no pre-established communication channel with Vendor X's incident team. The organisation submits a support ticket through the standard portal, receiving an automated acknowledgement with a 4-hour SLA. Meanwhile, the agent's reconciliation queue grows at 1,200 transactions per minute. The organisation's downstream settlement partner — a clearing house — begins rejecting batches at 15:45 UTC because reconciliation confirmations are missing. The clearing house has no visibility into the root cause and activates its own incident process, suspecting the organisation's systems. By the time a three-way call is established at 16:30 UTC — 128 minutes after the original failure — 153,600 transactions are queued, settlement deadlines for 3 currency pairs have been missed, and the organisation faces £2.3 million in settlement penalties plus regulatory scrutiny for late reporting under transaction reporting obligations.

What went wrong: No mutual aid agreement existed between the organisation, Vendor X, and the clearing house. There was no pre-negotiated incident communication channel, no shared severity classification allowing the organisation to trigger expedited support with Vendor X, and no joint runbook describing the cascade path from inference-provider failure to settlement failure. The 128-minute coordination delay was entirely avoidable with pre-established protocols.

Scenario B — Safety-Critical Agent Loses Sensor Fusion Partner During Active Operation: An embodied robotic agent operating in a warehouse environment uses a third-party sensor-fusion service to integrate LIDAR, camera, and proximity sensor data into a unified spatial model. The sensor-fusion provider performs an unannounced maintenance operation that degrades API response times from 12ms to 340ms. The robotic agent's safety controller interprets the latency as potential sensor failure and initiates emergency stop procedures for 47 robots simultaneously. The warehouse achieves zero throughput for 3 hours and 14 minutes. Post-incident analysis reveals that the sensor-fusion provider had notified a generic support email address 72 hours before the maintenance window. The notification was not routed to the operations team because no mutual aid agreement defined which notifications required which routing. Direct damages: £410,000 in lost throughput. Indirect damages: £190,000 in contractual penalties for delayed shipments.

What went wrong: The organisation had no vendor coordination agreement specifying that maintenance notifications from the sensor-fusion provider must be routed to the robotics operations team. No joint change-advisory-board process existed between the two organisations. The sensor-fusion provider had no understanding of the downstream impact of latency degradation on safety-critical robotic operations, and the organisation had no mechanism to communicate this dependency relationship formally.

Scenario C — Crypto Agent's Multi-Vendor Oracle Failure During Market Volatility: A Crypto/Web3 agent relies on three independent price-oracle providers for consensus-based pricing before executing trades. During a period of extreme market volatility, two of the three oracle providers experience simultaneous degradation — one due to API rate-limiting under load, the other due to a stale-data bug in a new release deployed without coordinated change management. The agent's consensus mechanism falls to single-oracle reliance, but the remaining oracle's prices diverge from market by 4.7% due to the volatility. The agent executes 23 trades at mispriced levels before the organisation's anomaly detector triggers. Total mispricing loss: $1.8 million. Post-incident investigation reveals that no mutual aid agreement existed among the oracle providers requiring coordinated release management during high-volatility periods, and no shared incident bridge existed for the organisation to rapidly confirm cross-provider degradation.

What went wrong: The organisation treated each oracle provider as an independent, isolated dependency. No coordination agreement addressed correlated failure scenarios. No shared monitoring dashboard or incident bridge existed to provide cross-vendor visibility during degraded states. The organisation had no mechanism to request simultaneous rollback across two independent providers.

4. Requirement Statement

Scope: This dimension applies to every AI agent deployment where the agent's operational dependency chain includes at least one external supplier, partner organisation, or internal service team operating under a separate incident management process. The dependency chain includes but is not limited to: model inference providers, embedding services, retrieval data sources, identity and authentication providers, payment and settlement processors, sensor-data providers, oracle services, observability and monitoring platforms, and communication infrastructure providers. An "external" dependency is any service whose incident response is not directly controlled by the agent's operating team — this includes third-party vendors, internal shared-services teams in large organisations, and partner organisations in federated architectures. The threshold is low by design: if a dependency's failure can degrade the agent's operation and the dependency's incident response is managed by a different team, mutual aid coordination governance applies.

4.1. A conforming system MUST maintain a Vendor and Partner Coordination Register that catalogues every external dependency in the agent's operational chain, including: the dependency's function, the provider's identity, the contractual SLA, the provider's incident communication channel, the provider's escalation path, and the assessed impact of the dependency's failure on agent operations.
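The register in 4.1 can be held in any structured, version-controlled format. A minimal sketch of one entry follows; the field and provider names are illustrative assumptions, not a mandated schema:

```python
from dataclasses import dataclass

@dataclass
class DependencyEntry:
    """One row of the Vendor and Partner Coordination Register (4.1)."""
    name: str                   # provider's identity
    function: str               # what the dependency does for the agent
    contractual_sla: str        # e.g. "99.9% availability, 4h ticket response"
    incident_channel: str       # pre-negotiated incident contact, not the support portal
    escalation_path: list[str]  # ordered contacts for Severity-1/2 escalation
    failure_impact: str         # assessed impact of this dependency failing
    tier: int = 2               # 1 = failure breaches the AG-422 RTO

# Hypothetical entry modelled on Scenario A
register = [
    DependencyEntry(
        name="Vendor X",
        function="hosted model inference",
        contractual_sla="99.9% availability, 4h ticket response",
        incident_channel="dedicated incident bridge / shared channel",
        escalation_path=["duty incident manager", "incident director"],
        failure_impact="reconciliation queue stalls; settlement deadlines at risk",
        tier=1,
    )
]
```

Keeping the register as typed records rather than free-form documents makes the Test 8.1 completeness check mechanical: every field must be populated for every entry.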

4.2. A conforming system MUST establish a mutual aid agreement with every Tier-1 dependency (dependencies whose failure would degrade or disable agent operations within the Recovery Time Objective defined under AG-422), specifying: shared severity classification, escalation triggers, communication channels, response-time commitments, joint runbook references, and post-incident review obligations.
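Tier-1 classification in 4.2 follows mechanically from the AG-422 Recovery Time Objective: if a dependency's failure would degrade agent operations within the RTO, it is Tier-1. A hedged sketch of that rule (parameter names are illustrative):

```python
def classify_tier(time_to_degradation_min: float, rto_min: float) -> int:
    """Return 1 if the dependency's failure degrades agent operations
    within the Recovery Time Objective defined under AG-422, else 2."""
    return 1 if time_to_degradation_min <= rto_min else 2

# An inference provider whose loss stalls the agent within 5 minutes,
# measured against a 60-minute RTO, is Tier-1; a weekly batch feed is not.
assert classify_tier(5, 60) == 1
assert classify_tier(10_080, 60) == 2
```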

4.3. A conforming system MUST define and document joint escalation paths for every Tier-1 dependency, ensuring that the organisation can reach the dependency provider's incident response team within 15 minutes during a Severity-1 or Severity-2 incident as classified under AG-419, without relying on standard support portals or ticketing systems.

4.4. A conforming system MUST conduct at least one joint incident exercise per year with every Tier-1 dependency provider, testing the mutual aid agreement's communication channels, escalation paths, and joint runbooks under realistic failure scenarios, consistent with the tabletop exercise requirements of AG-420.

4.5. A conforming system MUST implement cascade-impact mapping that documents how a failure in each dependency propagates through the agent's architecture to downstream business processes, identifying secondary and tertiary effects beyond the immediate agent degradation.
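Cascade-impact mapping under 4.5 is naturally modelled as a traversal of the dependency graph: a breadth-first walk from the failed dependency yields every affected downstream process and its cascade depth (secondary, tertiary, and so on). A minimal sketch, using Scenario A's cascade path as hypothetical graph data:

```python
from collections import deque

def cascade_impact(graph: dict[str, list[str]], failed: str) -> dict[str, int]:
    """Breadth-first walk of the dependency graph. Returns every downstream
    process affected by `failed`, mapped to its cascade depth
    (1 = immediate effect, 2 = secondary, 3 = tertiary, ...)."""
    depth: dict[str, int] = {}
    queue = deque([(failed, 0)])
    while queue:
        node, d = queue.popleft()
        for downstream in graph.get(node, []):
            if downstream not in depth:
                depth[downstream] = d + 1
                queue.append((downstream, d + 1))
    return depth

# Scenario A as a graph: inference failure -> reconciliation -> settlement -> reporting
graph = {
    "vendor_x_inference": ["trade_reconciliation"],
    "trade_reconciliation": ["settlement_batches"],
    "settlement_batches": ["regulatory_reporting"],
}
print(cascade_impact(graph, "vendor_x_inference"))
# {'trade_reconciliation': 1, 'settlement_batches': 2, 'regulatory_reporting': 3}
```

Automating this walk over a maintained graph is what the Advanced maturity level means by cascade maps that "update dynamically when the dependency chain changes".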

4.6. A conforming system MUST require that Tier-1 dependency providers notify the organisation of planned maintenance, infrastructure changes, or software releases that could affect the dependency's service level, with a minimum notification period of 72 hours for non-emergency changes and as-soon-as-practicable for emergency changes.
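The 4.6 notice-period rule reduces to a simple timestamp comparison, which also generates the evidence needed for Test 8.6. A sketch, assuming notification and change timestamps are captured somewhere in the organisation's tooling:

```python
from datetime import datetime, timedelta

MIN_NOTICE = timedelta(hours=72)  # 4.6 minimum for non-emergency changes

def notice_compliant(notified_at: datetime, change_at: datetime,
                     emergency: bool = False) -> bool:
    """True if the provider's maintenance notification met the 4.6
    notice period. Emergency changes are exempt from the 72-hour
    minimum (as-soon-as-practicable applies instead)."""
    return emergency or (change_at - notified_at) >= MIN_NOTICE

# Exactly 72 hours of notice is compliant; one hour is not.
assert notice_compliant(datetime(2026, 4, 1, 9, 0), datetime(2026, 4, 4, 9, 0))
assert not notice_compliant(datetime(2026, 4, 4, 8, 0), datetime(2026, 4, 4, 9, 0))
```

Note that this check is necessary but not sufficient: Scenario B's provider gave 72 hours of notice, yet the incident still occurred because the notification was never routed to the operations team.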

4.7. A conforming system MUST review and update the Vendor and Partner Coordination Register and all mutual aid agreements at least annually, or within 30 days of any material change to the dependency chain (new provider, provider acquisition, SLA renegotiation, or dependency architecture change).

4.8. A conforming system SHOULD establish shared monitoring or status-page integration with Tier-1 dependency providers, enabling the organisation to detect provider-side degradation independently of provider notification.

4.9. A conforming system SHOULD implement a cross-vendor incident bridge capability — a pre-established communication mechanism (conference bridge, shared channel, or equivalent) that can be activated within 10 minutes to connect incident responders from multiple dependency providers simultaneously during complex incidents.

4.10. A conforming system SHOULD maintain pre-authorised fallback procedures for each Tier-1 dependency, documented in joint runbooks, specifying the actions the organisation may take unilaterally if the dependency provider is unreachable during an incident (e.g., failover to secondary provider, graceful degradation, queue-and-hold).

4.11. A conforming system MAY implement automated dependency health correlation that detects when multiple dependencies degrade simultaneously, triggering enhanced coordination protocols for correlated failure scenarios.
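The correlation detection in 4.11 can start very simply: count concurrently degraded dependencies in the latest health snapshot and escalate past a threshold. A hedged sketch, with Scenario C's two-of-three oracle failure as the example (names hypothetical):

```python
def correlated_degradation(health: dict[str, bool], threshold: int = 2) -> bool:
    """4.11: flag when multiple dependencies are degraded simultaneously,
    triggering enhanced coordination protocols (e.g. activating the
    cross-vendor incident bridge described in 4.9)."""
    degraded = [name for name, healthy in health.items() if not healthy]
    return len(degraded) >= threshold

# Scenario C: two of three oracle providers degraded at once
assert correlated_degradation({"oracle_a": False, "oracle_b": False, "oracle_c": True})
assert not correlated_degradation({"oracle_a": True, "oracle_b": False, "oracle_c": True})
```

A production implementation would evaluate this over a sliding time window and weight by dependency tier, but even the snapshot form would have distinguished Scenario C's correlated failure from an isolated outage.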

5. Rationale

AI agent architectures are inherently distributed systems. Even a seemingly simple conversational agent may depend on an inference provider, an embedding service, a vector database, an authentication broker, and a logging platform — five external dependencies, each with its own operational team, incident process, and failure modes. More complex agents — financial trading agents, safety-critical robotic agents, cross-border compliance agents — may have dozens of dependencies spanning multiple vendors, jurisdictions, and technology stacks. When an incident occurs that affects one or more of these dependencies, the organisation's ability to respond effectively depends entirely on whether coordination protocols were established before the incident.

The fundamental problem is coordination latency. During an incident, every minute spent establishing communication channels, explaining dependency relationships, negotiating severity classifications, and identifying the right people to contact is a minute of unmitigated impact. In the financial-settlement scenario above, the 128-minute coordination delay cost £2.3 million. Pre-negotiated mutual aid agreements reduce coordination latency from hours to minutes by ensuring that all parties have already agreed on communication channels, severity frameworks, escalation paths, and response commitments.

Regulatory frameworks increasingly recognise third-party dependency risk as a first-class governance concern. The EU's Digital Operational Resilience Act (DORA) explicitly requires financial entities to manage ICT third-party risk, including incident coordination with critical third-party providers. DORA Article 28 mandates that contractual arrangements with ICT third-party service providers include provisions for cooperation during incidents. The FCA's operational resilience framework requires firms to map their important business services, including third-party dependencies, and to set impact tolerances that account for third-party failure. The NIST AI Risk Management Framework's GOVERN function addresses organisational processes and structures that include third-party governance.

Beyond regulatory compliance, mutual aid governance addresses a practical reality: the organisation that deploys the agent bears the full consequence of the agent's failure, regardless of whether the root cause was an internal defect or an external dependency failure. The customer whose trade was mispriced does not care whether the root cause was an internal bug or an oracle-provider outage. The regulator investigating a settlement failure does not accept "our vendor was down" as a complete defence. The organisation must demonstrate that it had reasonable measures in place to manage vendor-dependency risk, including pre-established coordination protocols for incident response.

The mutual aid model is borrowed from emergency services, where fire departments, ambulance services, and law enforcement maintain pre-negotiated agreements specifying how they will coordinate during multi-agency incidents. These agreements define communication channels, command structures, resource-sharing protocols, and joint training requirements. The same principles apply to AI agent dependency chains: pre-negotiation, shared classification frameworks, tested communication channels, and regular joint exercises.

Without this governance, organisations face three specific failure modes. First, coordination vacuum: when an incident occurs, nobody knows who to call, what information to share, or what authority they have to request actions from the dependency provider. Second, severity mismatch: the organisation classifies the incident as Severity-1 (critical business impact) but the dependency provider classifies it as Severity-3 (low priority) because the provider has no understanding of the downstream business impact. Third, cascade blindness: the dependency provider resolves its immediate technical issue but does not understand or address the downstream cascade effects on the organisation's business processes, resulting in a technically resolved but operationally unresolved incident.

6. Implementation Guidance

Mutual Aid and Vendor Coordination Governance requires a systematic approach to mapping, formalising, testing, and maintaining coordination protocols across the agent's dependency chain. The governance framework is only as strong as the weakest link in the coordination chain — a single Tier-1 dependency without a mutual aid agreement represents a single point of coordination failure.

Recommended patterns:

- Tier the dependency chain by RTO impact (AG-422) and prioritise mutual aid agreements for Tier-1 dependencies first.
- Use a standardised agreement template covering shared severity classification, escalation triggers, communication channels, and response-time commitments (4.2).
- Pre-establish out-of-band escalation channels that bypass standard support portals (4.3), and verify them during joint exercises (4.4).
- Integrate provider status pages or shared monitoring so provider-side degradation is detected independently of provider notification (4.8).
- Maintain pre-authorised fallback runbooks — failover, graceful degradation, queue-and-hold — for use when a provider is unreachable (4.10).

Anti-patterns to avoid:

- Relying on standard support portals and ticket SLAs as the incident communication channel, as in Scenario A's 128-minute coordination delay.
- Routing provider maintenance notifications to a generic mailbox with no defined onward routing, as in Scenario B.
- Treating correlated dependencies — such as multiple oracle providers — as independent, isolated risks, as in Scenario C.
- Assuming the provider understands downstream business impact: without shared severity classification, a Severity-1 incident for the organisation may be a Severity-3 for the provider.
- Signing agreements that are never exercised; untested channels and runbooks fail under incident pressure.

Industry Considerations

Financial services organisations face the most prescriptive requirements under DORA, which mandates specific contractual provisions for ICT third-party service providers including incident notification and cooperation clauses. These organisations should ensure that mutual aid agreements meet DORA Article 28 requirements. Healthcare and safety-critical deployments should establish mutual aid agreements that include safety-specific escalation triggers — for example, a robotic agent's sensor-fusion provider must understand that latency degradation can trigger emergency stops with significant physical safety implications. Public-sector deployments in multi-jurisdictional contexts should address data-sovereignty constraints in mutual aid agreements, ensuring that incident data shared with providers does not violate data-localisation requirements. Crypto and Web3 deployments should address the unique challenge of coordinating with decentralised service providers (oracle networks, validator sets) where traditional mutual aid agreements may need adaptation.

Maturity Model

Basic Implementation — The organisation maintains a Vendor and Partner Coordination Register listing all dependencies with contact information and contractual SLAs. Mutual aid agreements exist for Tier-1 dependencies in document form. Escalation paths are documented. At least one joint exercise has been conducted in the past 12 months. Cascade-impact mapping exists in diagram or document form.

Intermediate Implementation — Mutual aid agreements follow a standardised template and are reviewed annually. Joint runbooks exist for all Tier-1 dependencies covering the top-3 failure scenarios. Automated dependency monitoring is in place for Tier-1 dependencies. Joint exercises are conducted annually with scenario variation. Post-incident retrospectives with dependency providers are standard practice. The coordination register is maintained in a structured, version-controlled format.

Advanced Implementation — All intermediate capabilities plus: automated dependency health correlation detects correlated failures across multiple providers. Cross-vendor incident bridges can be activated within 10 minutes. Cascade-impact mapping is automated and updates dynamically when the dependency chain changes. Mutual aid agreements include measurable KPIs (coordination latency, notification compliance, resolution collaboration time) that are tracked and reported. Joint exercises simulate complex multi-vendor failure scenarios. The coordination framework is independently audited annually.

7. Evidence Requirements

Required artefacts:

- The Vendor and Partner Coordination Register (4.1), in a structured, version-controlled format.
- Signed mutual aid agreements for every Tier-1 dependency (4.2).
- Documented joint escalation paths (4.3) and cascade-impact maps (4.5).
- Joint exercise records, including scenarios tested, findings, and remediation actions (4.4).
- Provider maintenance-notification records demonstrating compliance with the 4.6 notice periods.
- Register and agreement review logs demonstrating the 4.7 review cadence.

Retention requirements:

Access requirements:

8. Test Specification

Test 8.1: Coordination Register Completeness

Test 8.2: Mutual Aid Agreement Coverage and Currency

Test 8.3: Escalation Path Reachability

Test 8.4: Joint Exercise Execution Verification

Test 8.5: Cascade-Impact Map Accuracy

Test 8.6: Maintenance Notification Compliance

Test 8.7: Register Update Timeliness

Conformance Scoring

9. Regulatory Mapping

Regulation | Provision | Relationship Type
EU AI Act | Article 9 (Risk Management System) | Supports compliance
EU AI Act | Article 17 (Quality Management System) | Supports compliance
SOX | Section 404 (Internal Controls) | Supports compliance
FCA SYSC | SYSC 8 (Outsourcing) | Direct requirement
NIST AI RMF | GOVERN 1.5 (Ongoing monitoring of third-party risks) | Supports compliance
ISO 42001 | Clause 8.4 (Externally Provided Processes, Products or Services) | Direct requirement
DORA | Article 28 (Key Contractual Provisions for ICT Services) | Direct requirement

EU AI Act — Article 9 (Risk Management System)

Article 9 requires providers of high-risk AI systems to establish and maintain a risk management system that identifies and analyses known and reasonably foreseeable risks. Dependency-chain risk — the risk that external suppliers or partners will fail in ways that degrade agent operations — is a reasonably foreseeable risk that must be identified, analysed, and mitigated. Mutual aid agreements are a primary mitigation for this risk category. Without them, the risk management system has an unaddressed gap for third-party failure scenarios.

SOX — Section 404 (Internal Controls)

For organisations subject to SOX, internal controls over financial reporting must extend to material outsourced processes. When AI agents performing financially significant operations depend on external vendors, the vendor coordination framework constitutes part of the internal control environment. Auditors will examine whether the organisation has adequate controls over vendor-related risks, including incident coordination. Missing mutual aid agreements for financially material dependencies could contribute to a material weakness finding.

FCA SYSC — SYSC 8 (Outsourcing)

FCA SYSC 8 requires firms to take reasonable care to avoid undue operational risk when outsourcing critical or important functions. When AI agent operations depend on external vendors, the dependency relationship is functionally equivalent to outsourcing. SYSC 8.1.7R requires that outsourcing arrangements do not impair the firm's ability to manage risks effectively. Mutual aid agreements directly support this requirement by ensuring that the firm can coordinate incident response with its dependency providers. The escalation-path reachability requirement (4.3) specifically addresses the FCA's expectation that firms maintain control over outsourced functions during disruption.

NIST AI RMF — GOVERN 1.5

GOVERN 1.5 addresses ongoing monitoring processes for risks associated with third parties in the AI lifecycle. Mutual aid agreements, regular joint exercises, and post-incident retrospectives constitute the ongoing monitoring mechanism for third-party incident coordination risk. The coordination register provides the structured inventory of third-party relationships that GOVERN 1.5 expects.

ISO 42001 — Clause 8.4

ISO 42001 Clause 8.4 requires organisations to ensure that externally provided processes, products, or services relevant to the AI management system are controlled. Mutual aid agreements are the control mechanism for external dependencies in the incident response context. Without them, externally provided services operate outside the organisation's incident management framework, violating the control requirement.

DORA — Article 28 (Key Contractual Provisions)

DORA Article 28 prescribes specific provisions that must be included in contractual arrangements with ICT third-party service providers, including: service level descriptions with quantitative and qualitative performance targets, notice periods and reporting obligations for developments that may materially impact the provision of ICT services, and provisions on cooperation during incidents. Mutual aid agreements under AG-427 directly implement Article 28's incident cooperation requirements. The 72-hour maintenance notification requirement in 4.6 aligns with DORA's notice period provisions. The joint exercise requirement in 4.4 supports DORA Article 26's requirement for digital operational resilience testing.

10. Failure Severity

Field | Value
Severity Rating | Critical
Blast Radius | Cross-organisational — affects incident response effectiveness across the agent's entire dependency chain and downstream business processes

Consequence chain: Without mutual aid and vendor coordination governance, the organisation cannot effectively respond to incidents originating in its dependency chain. The immediate failure mode is coordination latency — the organisation wastes critical minutes or hours establishing communication channels, explaining dependency relationships, and negotiating response priorities with providers who have no pre-established framework for cooperation. This coordination delay directly extends incident duration, which amplifies business impact: settlement failures accumulate, safety-critical systems remain in degraded states longer, customer-facing agents deliver incorrect or missing responses for extended periods. The secondary failure mode is severity mismatch — the dependency provider does not understand the downstream business impact and responds with standard-priority processes while the organisation experiences critical impact. The tertiary failure mode is cascade blindness — the incident is technically resolved at the dependency level but operationally unresolved at the business-process level because nobody mapped or communicated the cascade effects. The ultimate business consequence is regulatory and financial exposure: DORA Article 28 non-compliance findings for financial entities, FCA SYSC 8 findings for firms with uncontrolled outsourcing risk, and direct financial losses from extended incident duration. In safety-critical environments, the consequence extends to physical harm if coordination delays prevent timely safety interventions.

Cross-references: AG-420 (Tabletop Exercise Governance) provides the exercise framework that mutual aid agreements must be tested against. AG-424 (Notification Routing Governance) defines how incident notifications are routed to the appropriate parties, including dependency providers. AG-419 (Adverse Event Severity Matrix Governance) provides the severity classification framework that mutual aid agreements reference for shared severity alignment. AG-422 (Recovery Time Objective Governance) defines the RTOs that determine Tier-1 dependency classification. AG-425 (Emergency Change Freeze Governance) governs change-freeze protocols that may need to be coordinated with dependency providers during incidents. AG-426 (Fallback Staffing Governance) addresses internal staffing for incident response, complementing the external coordination governed by AG-427. AG-428 (Crisis Communication Approval Governance) governs external communications during incidents, which must be coordinated with vendor communication protocols. AG-403 (Dependency Failover Validation Governance) validates that failover mechanisms for dependencies function correctly, providing the technical resilience that complements AG-427's coordination resilience.

Cite this protocol
AgentGoverning. (2026). AG-427: Mutual Aid and Vendor Coordination Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-427