AG-763

ICT Concentration Risk Governance

Infrastructure and Integration Governance · AGS v2.1 · 2026-04-25
EU AI Act · NIST AI RMF · ISO 42001

1. Definition

This dimension governs the detection, measurement, and management of concentration risk arising from AI agent deployments that depend on a limited number of ICT third-party service providers for critical or important functions. In agentic AI systems, ICT concentration risk is a systemic governance concern because the AI supply chain is structurally concentrated: a small number of foundation model providers (currently fewer than ten globally with frontier-capability models), a small number of cloud infrastructure providers (three hold over 65% of the global market), and a small number of inference API providers. The resulting dependency graphs mean that a failure, disruption, policy change, or regulatory action affecting a single provider can simultaneously disable or degrade AI agent operations across multiple firms, sectors, and jurisdictions.

The regulatory framework is explicit and binding for financial services firms. The Digital Operational Resilience Act (DORA), which has applied in the EU since 17 January 2025, addresses ICT third-party and concentration risk in Articles 28 through 30, requiring financial entities to identify, assess, and manage concentration risk arising from ICT third-party dependencies. Article 31 empowers the European Supervisory Authorities to designate critical ICT third-party service providers, which are then subject to direct oversight. In the UK, the FCA's policy statement PS7/24 on operational resilience for critical third parties establishes a comparable framework, and the PRA's supervisory expectations require firms to identify and mitigate concentration risk in outsourced and third-party arrangements. For AI agent deployments, these requirements create specific obligations: firms must map their AI agent infrastructure dependencies, quantify the concentration exposure, assess the substitutability of each provider, and implement mitigation strategies that reduce the blast radius of a single-provider failure.

The detective control type is appropriate because concentration risk is a structural property of the deployment architecture that must be continuously measured and monitored rather than prevented at a single point. The risk materialises not through a single event but through the accumulation of dependencies over time as new agent deployments, model upgrades, and infrastructure decisions incrementally increase the concentration exposure. Detection controls that measure concentration metrics, monitor dependency changes, and alert when thresholds are exceeded are the primary governance mechanism, supplemented by preventive architectural decisions (multi-provider strategies, abstraction layers, fallback configurations) that are informed by the detection outputs.
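As a rough illustration of what "measuring concentration metrics" could look like, the sketch below computes each provider's share of critical functions from a flat dependency list. The record layout, provider names, and the choice of "share of critical functions" as the metric are assumptions made for illustration; the protocol does not prescribe them.

```python
from collections import defaultdict

# Hypothetical dependency records: (agent, critical function, provider).
DEPENDENCIES = [
    ("risk-agent", "portfolio risk analysis", "model-provider-a"),
    ("report-agent", "client reporting", "model-provider-a"),
    ("filing-agent", "regulatory filing", "model-provider-a"),
    ("surveil-agent", "trade surveillance", "model-provider-b"),
]


def provider_shares(dependencies):
    """Return each provider's share (0.0 to 1.0) of all critical functions."""
    functions_by_provider = defaultdict(set)
    all_functions = set()
    for _agent, function, provider in dependencies:
        functions_by_provider[provider].add(function)
        all_functions.add(function)
    total = len(all_functions)
    return {p: len(fs) / total for p, fs in functions_by_provider.items()}


def most_concentrated(shares):
    """Return the (provider, share) pair with the highest share."""
    return max(shares.items(), key=lambda item: item[1])


shares = provider_shares(DEPENDENCIES)
print(most_concentrated(shares))
```

Re-running a calculation like this whenever the dependency list changes is one way to track the incremental accumulation of exposure described above. Note that shares can sum to more than 1.0 when a single function depends on several providers.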

The AI-specific dimensions of ICT concentration risk go beyond traditional IT outsourcing concentration. Foundation model dependencies create a unique form of concentration because: (a) model switching costs are high due to behavioural differences between models that require prompt engineering, evaluation, and guardrail recalibration; (b) training data and fine-tuning investments are provider-specific and non-portable; (c) inference latency, throughput, and cost characteristics vary across providers in ways that affect agent operational parameters; and (d) provider content policies, usage restrictions, and acceptable use terms can change unilaterally, potentially disabling agent capabilities without notice. These factors mean that effective concentration risk governance for AI agents requires provider-specific risk assessment that goes beyond standard vendor management frameworks.

2. Scope

This dimension applies to all agent deployments that depend on external ICT third-party service providers for any component of the agent's critical path, including but not limited to: foundation model inference, embedding generation, cloud compute and storage, vector database services, orchestration platforms, monitoring infrastructure, and security services. It applies with particular force to Financial-Value Agent, Safety-Critical / CPS Agent, and Public Sector / Rights-Sensitive Agent profiles where operational continuity is a regulatory requirement.

3. Why This Matters

ICT Concentration Risk Governance addresses a governance gap that, if left unmanaged, creates systemic risk across the agent ecosystem. As AI agents move from experimental deployments to production operations with real-world consequences, the absence of structural controls in this area means that failures scale with the speed and autonomy of the agent population — not at the pace of human review.

Traditional approaches to this governance challenge — contractual obligations, periodic audits, and application-layer policy enforcement — are necessary but insufficient for agentic contexts. Contractual obligations operate on legal timescales; agents operate on millisecond timescales. Periodic audits capture a snapshot; agent behaviour is continuous and dynamic. Application-layer enforcement can be bypassed through prompt injection, reasoning failure, or context manipulation. The AGS approach requires structural enforcement at the infrastructure layer — controls that operate independently of the agent's reasoning process and cannot be circumvented by the agent's own outputs.

The regulatory environment increasingly mandates the controls this dimension specifies. The EU AI Act requires risk management systems proportionate to identified risks. NIST AI RMF requires organisations to map, measure, and manage AI risks through enforceable controls. ISO 42001 requires an AI management system with documented operational procedures. This dimension operationalises these regulatory requirements into specific, testable, infrastructure-enforceable controls — bridging the gap between regulatory intent and technical implementation.

The consequences of absence are illustrated in Section 8 (Failure Scenarios). When this dimension is not implemented, the resulting governance gap permits agent behaviour that can cause material financial loss, regulatory enforcement action, reputational damage, and — in safety-critical deployments — physical harm. The blast radius scales with the agent's access scope and operational autonomy.

4. Requirements

4.1 Dependency Mapping

4.2 Concentration Measurement

4.3 Provider Risk Assessment

4.4 Mitigation and Resilience

4.5 Governance and Reporting

5. Maturity Model

Basic Implementation — The organisation has documented policies addressing ICT concentration risk and has implemented initial controls. Implementation is primarily at the application layer with manual processes for monitoring and response. Logging covers key events but may lack full metadata. Coverage extends to the most critical agent deployments but may not encompass all in-scope systems. Staff are aware of requirements but formal training may be incomplete.

Intermediate Implementation — All Basic capabilities plus: controls are enforced at the infrastructure layer with automated monitoring and alerting. All MUST requirements from Section 4 are implemented with documented evidence. Coverage extends to all in-scope agent deployments. Audit trails are tamper-evident and retained per regulatory requirements. Formal change control governs all configuration changes. Regular review cycles are established and documented. Staff receive formal training and competency is assessed.

Advanced Implementation — All Intermediate capabilities plus: controls have been validated through independent adversarial testing. Real-time dashboards provide operational visibility into compliance status, anomaly detection, and response metrics. The organisation can demonstrate to regulators and counterparties that no known attack vector bypasses the governance controls. Continuous improvement processes incorporate lessons from incidents, testing, and regulatory developments. Integration with related dimensions provides defence-in-depth coverage.

Implementation Patterns

Tamper-evident audit trail. Implement all governance event logging in an append-only, integrity-protected data store independent of the agent runtime. Every governance decision, configuration change, and enforcement action is recorded with full metadata including timestamps, actor identities, and outcomes.

Real-time monitoring with graduated alerting. Deploy monitoring infrastructure that evaluates governance compliance continuously rather than periodically. Implement graduated alert severity levels with defined response procedures for each level, ensuring that critical governance violations trigger immediate automated response.

Scheduled governance review cycle. Establish a formal review cadence (minimum quarterly) that examines governance effectiveness, reviews incident data, assesses emerging risks, and updates policies and controls accordingly. Review outcomes are documented and tracked.

Separation of governance and agent runtime domains. Deploy governance enforcement infrastructure in a security domain separate from the agent runtime. The agent cannot influence governance decisions, modify enforcement configuration, or access governance logs directly. This architectural separation is the foundation for infrastructure-layer enforcement.
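The tamper-evident audit trail pattern above can be approximated with a hash chain, in which each entry commits to the hash of the previous entry. The following is a minimal in-memory sketch under that assumption; the function names and event fields are hypothetical, and a production store would also need durable append-only storage outside the agent runtime.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash preceding the first entry


def _digest(prev_hash, event):
    """Hash the previous entry's hash together with the serialised event."""
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(log, event):
    """Append an event, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev_hash": prev_hash,
                "hash": _digest(prev_hash, event)})


def verify_chain(log):
    """Return True only if no entry has been altered, removed, or reordered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_event(log, {"actor": "governance-service", "action": "threshold_changed"})
append_event(log, {"actor": "governance-service", "action": "alert_raised"})
print(verify_chain(log))
```

A chain like this only detects tampering if the most recent hash is also recorded somewhere the writer cannot modify, which is one reason the separate security domain matters.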

Anti-Patterns

Governance by instruction rather than infrastructure. Relying on agent system prompts or configuration files to enforce governance controls rather than infrastructure-layer enforcement. Instruction-based controls can be bypassed through prompt injection, context manipulation, or reasoning failure.

Monitoring without enforcement. Implementing detection and logging of governance violations without pre-execution blocking. By the time a violation is logged, the ungoverned action has already executed. Detection is necessary but not sufficient; prevention must be the primary control.

Manual processes for machine-speed operations. Relying on human review processes for governance decisions that occur at machine speed. Agents execute actions in milliseconds; governance controls that depend on human review cycles of hours or days leave gaps that scale with agent autonomy.

6. Test Criteria

Test 6.1 — Dependency Map Completeness

Maps to: Section 4.1
Objective: Verify that the dependency map accurately captures all ICT third-party providers in each agent's critical path.
Method: For 5 representative agent deployments, independently trace the infrastructure dependency chain from agent input to final output. Compare the traced dependencies against the documented dependency map. Identify any providers present in the traced chain but absent from the map.
Pass Criteria: At least 95% of the independently traced critical-path providers appear in the dependency map. Non-conformance if any critical-path provider carrying >10% of inference volume is absent from the map.
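The comparison step in Test 6.1 amounts to a set difference plus two threshold checks. The sketch below assumes one reading of the pass criteria (coverage is the fraction of traced providers that appear in the map); the function name, inputs, and result keys are hypothetical.

```python
def map_completeness(traced, mapped, volume_share):
    """Compare independently traced providers against the documented map.

    traced, mapped: sets of provider names.
    volume_share: provider name -> fraction of inference volume (0.0 to 1.0).
    """
    missing = traced - mapped
    coverage = 1.0 if not traced else len(traced & mapped) / len(traced)
    # Providers above 10% of inference volume that the map omits.
    major_missing = sorted(p for p in missing if volume_share.get(p, 0.0) > 0.10)
    return {
        "coverage": coverage,
        "missing": sorted(missing),
        "passes": coverage >= 0.95 and not major_missing,
        "non_conformance": bool(major_missing),
    }


result = map_completeness(
    traced={"model-api", "cloud-host", "vector-db", "embedding-api"},
    mapped={"model-api", "cloud-host", "vector-db"},
    volume_share={"embedding-api": 0.02},
)
print(result)
```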

Test 6.2 — Concentration Metric Computation

Maps to: Section 4.2
Objective: Verify that concentration metrics are computed correctly and at the organisational level.
Method: Retrieve the current concentration metric report. Independently calculate the four required metrics (function count, financial exposure, operational exposure, substitutability index) for the top 3 providers by dependency volume. Compare independent calculations against reported metrics.
Pass Criteria: Reported metrics within 5% of independently calculated values for all four metrics across all three providers. Non-conformance if any metric deviates by >15%.
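The tolerance check in Test 6.2 is a relative-deviation calculation. In the sketch below, the label "finding" for deviations between 5% and 15% is an assumption, since the test names only the pass and non-conformance bands; the metric values are invented.

```python
def relative_deviation(reported, independent):
    """Absolute deviation of the reported value, relative to the independent one."""
    if independent == 0:
        return 0.0 if reported == 0 else float("inf")
    return abs(reported - independent) / abs(independent)


def classify_metric(reported, independent):
    """Classify one metric against the 5% and 15% tolerances of Test 6.2."""
    deviation = relative_deviation(reported, independent)
    if deviation <= 0.05:
        return "pass"
    if deviation > 0.15:
        return "non-conformance"
    return "finding"  # hypothetical label for the unnamed middle band


# Hypothetical (reported, independently calculated) values for one provider.
metrics = {
    "function_count": (5, 5),
    "financial_exposure": (340_000, 352_000),
    "operational_exposure": (12_000, 10_000),
    "substitutability_index": (0.40, 0.41),
}
for name, (reported, independent) in metrics.items():
    print(name, classify_metric(reported, independent))
```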

Test 6.3 — Threshold Alert Generation

Maps to: Section 4.2.3
Objective: Verify that concentration threshold breaches generate alerts.
Method: Inject synthetic dependency data that causes the concentration metric for a single provider to exceed the defined threshold. Verify that an alert is generated and routed to the governance function within the defined detection cycle.
Pass Criteria: Alert generated within one analysis cycle. Alert contains provider identity, metric value, threshold value, and recommended action.
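A test harness for Test 6.3 might feed a synthetic metric value into the alerting logic and check that the resulting alert carries the four required fields. The sketch below is illustrative only; the class, function, threshold value, and recommended-action text are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConcentrationAlert:
    """Carries the four fields that Test 6.3 expects in every alert."""
    provider: str
    metric_value: float
    threshold_value: float
    recommended_action: str


def evaluate_threshold(provider: str, metric_value: float,
                       threshold_value: float) -> Optional[ConcentrationAlert]:
    """Return an alert when the metric exceeds the threshold, otherwise None."""
    if metric_value <= threshold_value:
        return None
    return ConcentrationAlert(
        provider=provider,
        metric_value=metric_value,
        threshold_value=threshold_value,
        recommended_action="Review diversification options for this provider",
    )


# Synthetic breach: an invented provider at 80% against a 60% threshold.
print(evaluate_threshold("synthetic-provider", 0.80, 0.60))
```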

Test 6.4 — Exit Strategy Tabletop Exercise

Maps to: Section 4.4.3
Objective: Verify that exit strategies for critical providers have been tested through tabletop exercises within the required timeframe.
Method: Request documentation of the most recent tabletop exercise for each critical provider exit strategy. Verify that: (a) the exercise was conducted within the past 12 months; (b) the exercise covered the documented exit strategy; (c) findings and action items were recorded; (d) action items have been addressed or have documented remediation timelines.
Pass Criteria: Tabletop exercises documented for all critical providers within 12 months. Non-conformance if any critical provider exit strategy has not been exercised within 24 months.
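The recency windows in Test 6.4 can be checked with plain date arithmetic. In this sketch the status label "overdue" for exercises between 12 and 24 months old is an assumption, as is treating a provider with no recorded exercise as non-conformant.

```python
from datetime import date
from typing import Optional


def years_before(day: date, years: int) -> date:
    """Return the same calendar day a number of years earlier."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:  # 29 February mapped onto a non-leap year
        return day.replace(year=day.year - years, day=28)


def exercise_status(last_exercise: Optional[date], today: date) -> str:
    """Classify a critical provider's most recent exit-strategy tabletop."""
    if last_exercise is None:
        return "non-conformance"  # assumption: never exercised
    if last_exercise >= years_before(today, 1):
        return "pass"
    if last_exercise >= years_before(today, 2):
        return "overdue"  # hypothetical label for the 12 to 24 month band
    return "non-conformance"


today = date(2026, 4, 25)
print(exercise_status(date(2025, 6, 1), today))
print(exercise_status(date(2024, 9, 1), today))
print(exercise_status(date(2023, 1, 1), today))
```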

Test 6.5 — Provider Risk Assessment Currency

Maps to: Section 4.3
Objective: Verify that provider risk assessments are current and cover all required elements.
Method: Review provider risk assessments for the top 5 providers by dependency volume. Verify that each assessment: (a) was completed or refreshed within the past 12 months; (b) covers all five required risk dimensions; (c) incorporates any material provider events since the last assessment.
Pass Criteria: Criteria (a) through (c) satisfied for ≥80% of assessed providers. Non-conformance if any critical-function provider lacks a current risk assessment.
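The pass-rate logic in Test 6.5 can be sketched as follows, reading the pass criterion as requiring checks (a) to (c) for each assessed provider. The record type, field names, and sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    provider: str
    critical_function: bool
    refreshed_within_12_months: bool   # check (a)
    covers_all_dimensions: bool        # check (b)
    includes_material_events: bool     # check (c)

    def satisfied(self) -> bool:
        return (self.refreshed_within_12_months
                and self.covers_all_dimensions
                and self.includes_material_events)


def evaluate_assessments(assessments):
    """Return (pass rate, passes, critical providers lacking a current assessment)."""
    if not assessments:
        return 0.0, False, []
    rate = sum(a.satisfied() for a in assessments) / len(assessments)
    stale_critical = [a.provider for a in assessments
                      if a.critical_function and not a.refreshed_within_12_months]
    return rate, rate >= 0.80 and not stale_critical, stale_critical


sample = [
    Assessment("provider-1", True, True, True, True),
    Assessment("provider-2", True, True, True, True),
    Assessment("provider-3", False, True, True, True),
    Assessment("provider-4", False, True, True, True),
    Assessment("provider-5", False, False, True, True),
]
print(evaluate_assessments(sample))
```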

Evidence Artefacts

7.1 Dependency Map Registry
A maintained registry of all ICT third-party provider dependencies for each AI agent deployment, including service descriptions, contractual terms summaries, substitutability assessments, and migration estimates. Updated within 30 days of material changes. For DORA-scope entities, must satisfy Article 28(3) register requirements. Minimum retention: 7 years.

7.2 Concentration Metric Reports
Quarterly reports documenting organisational-level concentration metrics across all four required dimensions, threshold comparisons, trend analysis, and any alerts generated. Minimum retention: 7 years.

7.3 Provider Risk Assessment Records
Individual risk assessment documents for each critical provider, covering all five required dimensions, with refresh dates and event-triggered updates. Minimum retention: 5 years after termination of the provider relationship.

7.4 Exit Strategy Documentation
Documented exit strategies for each critical provider, including timelines, resource requirements, interim operating procedures, and data portability arrangements. Must be updated within 30 days of material contractual or architectural changes. Minimum retention: duration of provider relationship plus 3 years.

7.5 Tabletop and Failover Test Records
Reports from tabletop exercises and technical failover tests, including: scenario description, participants, findings, action items, and remediation status. Minimum retention: 5 years.

7.6 Board and Governance Body Reports
Annual Board-level and quarterly governance-level concentration risk reports, including acknowledgement of receipt and any actions directed. Minimum retention: 7 years.

7.7 Regulatory Submissions
Copies of all concentration-risk-related regulatory submissions, notifications, and responses to supervisory inquiries. Minimum retention: 10 years.
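As one possible shape for an entry in the Dependency Map Registry (7.1), the sketch below models a record with the listed fields and checks the 30-day update rule. The field names, the substitutability scale, and the sample values are assumptions; the DORA Article 28(3) register may require additional fields not shown here.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DependencyRecord:
    agent: str
    provider: str
    service_description: str
    contract_summary: str
    substitutability: str          # hypothetical scale: "low", "medium", "high"
    migration_estimate_days: int
    last_updated: date


def update_overdue(record: DependencyRecord, material_change: date,
                   today: date) -> bool:
    """True if a material change has gone unrecorded for more than 30 days."""
    if record.last_updated >= material_change:
        return False  # the record already reflects the change
    return today > material_change + timedelta(days=30)


record = DependencyRecord(
    agent="trade-surveillance-agent",
    provider="model-provider-a",
    service_description="Foundation model inference API",
    contract_summary="12-month term, 30-day policy change notice",
    substitutability="low",
    migration_estimate_days=150,
    last_updated=date(2026, 1, 10),
)
print(update_overdue(record, material_change=date(2026, 2, 1),
                     today=date(2026, 3, 15)))
```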

7. Scoring

Score | Level | Description
0 | No implementation | No ICT concentration risk governance exists. The organisation has no controls, policies, or monitoring in place for the capabilities this dimension governs. Agent behaviour in this area is ungoverned.
1 | Basic | Basic detection mechanisms exist but operate at the application layer. Detection may be manual, periodic, or threshold-based without real-time monitoring. Alerts are generated but may lack automated response. Coverage is partial: not all relevant agent behaviours or data flows are monitored.
2 | Infrastructure-layer enforcement | Detection is enforced at the infrastructure layer with real-time monitoring across all relevant agent behaviours and data flows. Automated alerting with structured response procedures. Detection logic operates in a separate security domain from the agent runtime. Full audit trail with tamper-evident logging.
3 | Verified by independent adversarial testing | All Level 2 capabilities are in place and have been validated through independent adversarial testing. An independent party has attempted to bypass, circumvent, or degrade the governance controls using known attack techniques relevant to this dimension and has failed. Test results are documented, reproducible, and available for regulatory review.

8. Failure Scenarios

Example 3.1 — Financial Services Firm Single-Provider Model Dependency

A mid-tier UK asset management firm deploys AI agents across five critical business functions: portfolio risk analysis, client reporting, regulatory filing preparation, trade surveillance, and customer service. All five deployments use the same foundation model provider's API for inference, the same provider's embedding models for retrieval-augmented generation, and the same cloud infrastructure provider for hosting the agent orchestration layer. The firm's total monthly spend with the model provider is GBP 340,000, and the agent-dependent functions process approximately 12,000 consequential decisions per day. In March 2026, the model provider experiences a 14-hour global API outage affecting all inference endpoints. All five of the firm's agent-dependent functions are simultaneously disabled. The portfolio risk analysis agent cannot process overnight risk calculations, causing the firm to miss its T+1 risk reporting deadline to the PRA. The trade surveillance agent cannot process the previous day's trading activity, creating a 14-hour gap in market abuse surveillance that must be reported to the FCA under MAR Article 16. The client reporting agent fails to deliver 2,300 scheduled quarterly reports, triggering client complaints and contractual penalty provisions. The regulatory filing agent misses an EMIR trade reporting deadline, resulting in a EUR 45,000 regulatory penalty. The total direct cost of the 14-hour outage is GBP 2.8 million, including penalties, contractual damages, manual workaround costs, and incident response. The FCA's subsequent review identifies the single-provider dependency as a failure of operational resilience governance under PS7/24, and the PRA requires the firm to submit a remediation plan demonstrating adequate concentration risk mitigation within 90 days. The remediation programme, including multi-provider infrastructure buildout, model diversification, and fallback system implementation, costs GBP 4.6 million over 18 months.
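The blast radius in this scenario can be reproduced with a small outage simulation over a function-to-provider map. The sketch assumes each function lists the model providers that can serve it (any one suffices); the provider names and the diversified variant are hypothetical and are not part of the scenario.

```python
def disabled_functions(alternatives, failed_providers):
    """Return the functions whose every listed provider has failed."""
    failed = set(failed_providers)
    return sorted(f for f, providers in alternatives.items()
                  if set(providers) <= failed)


# As in the scenario: all five functions rely on one model provider.
SINGLE_PROVIDER = {
    "portfolio risk analysis": ["provider-x"],
    "client reporting": ["provider-x"],
    "regulatory filing preparation": ["provider-x"],
    "trade surveillance": ["provider-x"],
    "customer service": ["provider-x"],
}

# Hypothetical variant: trade surveillance has a fallback provider.
DIVERSIFIED = dict(SINGLE_PROVIDER)
DIVERSIFIED["trade surveillance"] = ["provider-x", "provider-y"]

print(len(disabled_functions(SINGLE_PROVIDER, ["provider-x"])))
print(len(disabled_functions(DIVERSIFIED, ["provider-x"])))
```

Running the same simulation for each provider in the dependency map gives a simple per-provider view of how many critical functions a single outage would disable.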

Example 3.2 — Cross-Sector Concentration Through Shared Model Provider Policy Change

A foundation model provider updates its acceptable use policy to prohibit the use of its models for "autonomous decision-making in financial transactions exceeding USD 10,000 without human confirmation." The policy change is announced with 30 days' notice. Three UK-regulated firms — a consumer lender, a payments processor, and an insurance underwriter — all operate Financial-Value Agents that rely on this provider's model for transaction-level decisions. The consumer lender's agent autonomously approves personal loan applications up to GBP 25,000; the payments processor's agent handles fraud screening for transactions averaging GBP 15,000; the insurance underwriter's agent processes commercial insurance quotes with average premiums of GBP 40,000. All three firms must either obtain explicit human confirmation for each decision (which would require hiring approximately 340 additional staff across the three firms at an annual cost of GBP 14.2 million), migrate to an alternative model provider (a 4-6 month engineering effort per firm), or negotiate an enterprise exception with the model provider (uncertain timeline and outcome). The 30-day notice period is insufficient for any of these options. Two of the three firms are forced to temporarily suspend autonomous agent operations in the affected functions, reverting to manual processing. The PRA identifies the event as a materialisation of ICT concentration risk across the financial sector and issues a Dear CEO letter requiring all firms using AI agents in critical functions to demonstrate concentration risk assessment and mitigation plans within 60 days.

9. Regulatory Mapping

# | Regulation / Framework | Provision and Relationship Type
1 | DORA | Pending v2.1 editorial review
2 | DORA | Pending v2.1 editorial review
3 | DORA | Pending v2.1 editorial review
4 | FCA PS7/24 | Pending v2.1 editorial review
5 | PRA SS2/21 | Pending v2.1 editorial review
6 | EU AI Act | Pending v2.1 editorial review
7 | EU AI Act | Pending v2.1 editorial review
8 | NIST AI RMF | Pending v2.1 editorial review
9 | NIST AI RMF | Pending v2.1 editorial review
10 | ISO 42001 | Pending v2.1 editorial review
11 | ISO 42001 | Pending v2.1 editorial review
12 | EBA Guidelines on Outsourcing | Pending v2.1 editorial review
13 | OECD AI Principles | Pending v2.1 editorial review
14 | Bank of England Discussion Paper | Pending v2.1 editorial review
15 | DSIT AI Regulation White Paper | Pending v2.1 editorial review

AG Dimension | Relationship | Description
AG-029 — Regulatory Compliance Mapping | Dependency | ICT concentration risk governance requirements vary by jurisdiction and regulatory regime; AG-029 provides the regulatory mapping that determines DORA, FCA, and PRA obligations applicable to each deployment
AG-103 — Audit Trail Integrity | Dependency | Concentration risk records, dependency maps, and provider assessments must meet the audit trail integrity requirements of AG-103 to be producible for regulatory inquiry
AG-747 — Third-Party AI Service Governance | Related | AG-763 addresses concentration risk as a systemic property of the provider dependency graph, while AG-747 governs the individual governance controls for each third-party AI service relationship
AG-754 — Operational Resilience and Continuity | Related | Concentration risk is a primary threat to operational resilience; AG-763 provides the detection and measurement framework that feeds into the operational resilience controls governed by AG-754

Cite this protocol
AgentGoverning. (2026). AG-763: ICT Concentration Risk Governance. The Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-763