Recursive Delegation Depth Governance requires every multi-agent system to impose a configurable, structurally enforced maximum on the number of successive delegation levels permitted before mandatory human escalation or task rejection. When Agent A delegates to Agent B, which delegates to Agent C, which delegates to Agent D, each step increases latency, dilutes accountability, degrades the fidelity of the original instruction, and compounds the probability of error; beyond a governed threshold, the chain must terminate. Without depth limits, recursive delegation creates unbounded chains in which no single agent or human retains meaningful oversight of the eventual action, producing outcomes that are untraceable, unaccountable, and potentially catastrophic.
Scenario A — Unbounded Delegation in Loan Origination: A retail bank deploys an AI-driven loan origination system. The primary underwriting agent receives an application for a £340,000 mortgage. The agent determines it needs a property valuation and delegates to a valuation agent. The valuation agent determines the property is in a flood zone and delegates to a flood-risk specialist agent. The flood-risk specialist determines it needs historical claims data and delegates to an insurance data agent. The insurance data agent encounters a data format it cannot parse and delegates to a data transformation agent. The data transformation agent requires schema mapping and delegates to a schema registry agent. The schema registry agent finds a version conflict and delegates to a version reconciliation agent — which delegates back to the data transformation agent with a different parameter set, creating a delegation loop. The loop executes 847 times over 14 minutes before the system runs out of memory. No result is returned to the original underwriting agent. The customer's application times out after the bank's 15-minute SLA, and the customer is presented with a generic error. When the incident is investigated, the bank cannot determine which agent made which decision because the delegation chain is seven levels deep and includes a loop. The bank discovers that 2,300 other mortgage applications hit the same loop during a three-day period before the pattern was identified.
What went wrong: No maximum delegation depth was configured. No loop detection existed. Each delegation was individually reasonable, but the cumulative chain exceeded any meaningful accountability boundary. The original underwriting agent's mandate was progressively diluted through seven levels of delegation until no agent in the chain had visibility of the original business context. Consequence: 2,300 mortgage applications delayed by an average of 72 hours, £1.8 million in compensation payments to affected customers under the Consumer Duty, FCA investigation under SYSC 6.1.1R for inadequate systems and controls, and senior manager accountability review under the Senior Managers Regime for the technology function head.
Scenario B — Delegation Depth Obscures Governed Exposure: An investment management firm uses a multi-agent system for portfolio rebalancing. The portfolio manager agent delegates rebalancing of a £50 million equity portfolio to a sector allocation agent. The sector allocation agent delegates individual sector trades to five sector-specific agents. Each sector agent delegates order routing to a best-execution agent. Each best-execution agent delegates to a venue selection agent. Each venue selection agent delegates to a latency optimisation agent. The delegation chain is five levels deep with branching: the original portfolio manager agent's single delegation has become five concurrent terminal delegations at the fifth level. The latency optimisation agents, operating at the terminal level, each independently determine that market conditions favour aggressive execution. They collectively submit orders representing £50 million in exposure within a 200-millisecond window, but because each agent sees only its own slice, none detects that the aggregate exposure has exceeded the portfolio's single-day trading limit of £12 million. The portfolio manager agent, five levels above, has no real-time visibility into the terminal agents' actions.
What went wrong: Each delegation reduced the delegating agent's visibility into downstream actions. By level five, no agent in the chain maintained aggregate exposure awareness. The original mandate's £12 million daily trading limit was not propagated through the delegation chain as a binding constraint — each level received only its local task parameters. The depth of the chain meant that by the time terminal agents acted, the connection to the original mandate was purely informational, not structural. Consequence: £50 million in trades executed against a £12 million daily limit, £3.2 million in market impact costs from concentrated execution, regulatory investigation under MiFID II Article 17 (algorithmic trading controls), and potential fine of up to 10% of annual turnover under FCA enforcement guidelines.
Scenario C — Recursive Delegation in Autonomous Vehicle Fleet Coordination: A logistics company operates a fleet of 200 autonomous delivery vehicles coordinated by a multi-agent system. A dispatch agent assigns a delivery to Vehicle Agent 14. Vehicle Agent 14 encounters a road closure and delegates route recalculation to a routing agent. The routing agent determines the alternative route passes through a restricted zone and delegates zone authorisation to a permits agent. The permits agent determines it needs real-time traffic data for the zone and delegates to a traffic monitoring agent. The traffic monitoring agent delegates to a sensor fusion agent. The sensor fusion agent delegates to three individual sensor-specific agents. One sensor agent detects a conflict and delegates conflict resolution to an arbitration agent — which delegates back to the routing agent for an alternative assessment. The delegation chain is now eight levels deep with a cycle. During the 47 seconds this chain takes to partially resolve, Vehicle Agent 14 has continued on its original route (a fail-forward default) and entered the restricted zone without authorisation. The vehicle is stopped by local authorities. The logistics company cannot produce a coherent decision audit trail because the delegation chain spans eight agents across three infrastructure regions.
What went wrong: No delegation depth limit existed. The fail-forward default allowed the vehicle to continue acting while the delegation chain was unresolved. The delegation cycle between the arbitration agent and the routing agent was not detected or prevented. The eight-level chain exceeded any reasonable accountability boundary for a safety-critical decision (route change for a physical vehicle). Consequence: Regulatory investigation by the vehicle licensing authority, £240,000 fine for operating in a restricted zone without authorisation, suspension of autonomous vehicle operating licence pending governance review, insurance claim denied on grounds of inadequate operational controls, and estimated £4.7 million in lost revenue during the six-week licence suspension.
Scope: This dimension applies to any multi-agent system where one agent can delegate a task, subtask, or decision to another agent, and where the receiving agent can further delegate to yet another agent. The scope includes direct delegation (Agent A explicitly assigns a task to Agent B), indirect delegation (Agent A publishes a task to a marketplace or queue, and Agent B picks it up and further delegates), and implicit delegation (Agent A's output triggers a downstream agent that treats the output as a delegation). The test is whether a chain of successive agent-to-agent task transfers can occur — if so, the system is within scope. Single-agent systems and systems where delegation is architecturally limited to a single level (a supervisor delegates to workers that cannot further delegate) are excluded from the depth-limiting requirements but should still track delegation depth for monitoring purposes. The scope extends to cross-organisational delegation: if Agent A in Organisation X delegates to Agent B in Organisation Y, which delegates to Agent C in Organisation Z, the full chain is within scope regardless of organisational boundaries.
4.1. A conforming system MUST enforce a configurable maximum delegation depth, defined as the number of successive agent-to-agent task transfers permitted from the originating delegation before the chain must terminate through task completion, escalation to a human, or structured rejection.
4.2. A conforming system MUST propagate the current delegation depth and the maximum permitted depth as immutable metadata attached to every delegation, such that each receiving agent can determine its position in the chain and the remaining delegation budget.
4.3. A conforming system MUST block any delegation that would exceed the configured maximum depth — the blocking must occur before the delegation is transmitted to the target agent, not after the target agent has begun processing.
4.4. A conforming system MUST escalate to a designated human authority when the maximum delegation depth is reached and the task cannot be completed at the current level, rather than silently dropping the task or returning an uninformative error.
4.5. A conforming system MUST detect and prevent delegation cycles — where a task is delegated back to an agent that already appears in the current delegation chain — treating a cycle as equivalent to exceeding maximum depth.
4.6. A conforming system MUST maintain an end-to-end delegation trace for every delegation chain, recording each delegation event with the delegating agent's identity, the receiving agent's identity, the current depth, the timestamp, and the task context transferred.
4.7. A conforming system MUST propagate the originating mandate's constraints (including value ceilings, action-type restrictions, and counterparty limits per AG-001) through every level of the delegation chain as binding constraints that cannot be expanded by any downstream agent.
4.8. A conforming system SHOULD configure different maximum delegation depths for different risk tiers — with lower maximums for safety-critical, financial-value, and rights-sensitive operations, and higher maximums only where operational necessity is documented and approved.
4.9. A conforming system SHOULD implement delegation depth alerting that notifies operational staff when chains reach a configurable warning threshold (e.g., 75% of maximum depth) even when the maximum has not yet been exceeded.
4.10. A conforming system SHOULD enforce a maximum wall-clock time for the entire delegation chain, independent of the depth limit, ensuring that deeply delegated tasks do not accumulate unbounded latency.
4.11. A conforming system MAY implement delegation compression — where intermediate agents that add no material decision value are bypassed in subsequent invocations of the same delegation pattern, reducing effective depth without losing accountability.
4.12. A conforming system MAY permit temporary depth limit increases through a formal exception process requiring human approval, documented justification, and automatic reversion after a defined period.
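The core of requirements 4.1 to 4.5 and 4.7 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the names `DelegationToken`, `Mandate`, `narrow`, `delegate`, and `DelegationBlocked` are hypothetical, depth is counted as the number of transfers (so the originator sits at depth 0), and the caller is assumed to turn a `DelegationBlocked` error into a human escalation. Trace logging (4.6) is omitted.

```python
from dataclasses import dataclass


class DelegationBlocked(Exception):
    """Raised before transmission; the caller is expected to escalate to a human."""


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Mandate:
    value_ceiling: float        # hypothetical: maximum value of any single action
    allowed_actions: frozenset  # hypothetical: permitted action types


@dataclass(frozen=True)
class DelegationToken:
    max_depth: int    # fixed by the originating delegation
    chain: tuple      # agent identities, originator first
    mandate: Mandate

    @property
    def depth(self) -> int:
        # Depth counts agent-to-agent transfers, so the originator is at depth 0.
        return len(self.chain) - 1


def narrow(parent: Mandate, requested: Mandate) -> Mandate:
    """Intersect two mandates so a child never holds more than its parent."""
    return Mandate(
        value_ceiling=min(parent.value_ceiling, requested.value_ceiling),
        allowed_actions=parent.allowed_actions & requested.allowed_actions,
    )


def delegate(token: DelegationToken, target: str, requested: Mandate) -> DelegationToken:
    """Check cycles and depth first, then build the token for the receiving agent."""
    if target in token.chain:
        raise DelegationBlocked(f"cycle: {target} already in chain {token.chain}")
    if token.depth + 1 > token.max_depth:
        raise DelegationBlocked(
            f"depth {token.depth + 1} exceeds maximum {token.max_depth}"
        )
    return DelegationToken(
        token.max_depth, token.chain + (target,), narrow(token.mandate, requested)
    )


root = DelegationToken(
    max_depth=2,
    chain=("underwriting",),
    mandate=Mandate(340_000, frozenset({"value", "query"})),
)
child = delegate(root, "valuation", Mandate(1_000_000, frozenset({"value", "trade"})))
print(child.depth, child.mandate.value_ceiling, sorted(child.mandate.allowed_actions))
# prints: 1 340000 ['value']
```

Because both checks run before a new token is constructed, a rejected delegation never reaches the target agent, which is the ordering 4.3 asks for. The requested mandate in the usage lines asks for a higher ceiling and an extra action type, and both are discarded by the intersection.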
Delegation is fundamental to the value proposition of multi-agent systems. A single agent cannot possess every capability required for complex tasks, so it delegates subtasks to specialists. This is analogous to human organisational structures where a manager delegates work to team members who may further delegate to their own teams. But in human organisations, delegation depth is naturally constrained: reporting hierarchies rarely exceed five or six levels, the latency of human communication limits chain length, and humans intuitively recognise when a delegation chain has become absurd. AI agents have none of these natural constraints. An agent can delegate in milliseconds. There is no fatigue, no intuitive sense that "this has gone too far." Without structural limits, delegation chains can grow without bound.
The risks of unbounded delegation depth are both theoretical and observed in production multi-agent systems. First, accountability dilution: each level of delegation reduces the delegating agent's visibility into how the task is ultimately executed. By the third or fourth level, the originating agent's mandate — the constraints that defined what actions were permitted — may have been reduced to metadata that terminal agents do not meaningfully enforce. Second, instruction degradation: as a task passes through successive agents, each agent interprets the task through its own context and capabilities. The task that reaches the terminal agent may bear limited resemblance to the original intent. This is the AI equivalent of the "telephone game" — each transfer introduces interpretation noise. Third, latency accumulation: each delegation level adds processing time, network latency, and queuing delay. A five-level chain with 200ms per level adds a full second of latency before any terminal work begins — in financial trading or safety-critical systems, this latency can be operationally significant. Fourth, blast radius amplification: delegation chains with branching (one agent delegates to many) create exponential expansion. A single delegation that branches to five agents, each of which delegates to five more, produces 25 terminal agents at depth three — each potentially acting with diluted mandate constraints.
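The latency and fan-out figures above are simple arithmetic, reproduced here as a small Python sketch. The function names are hypothetical, and the model assumes a fixed cost per level and a uniform branching factor, which real systems will not match exactly.

```python
def chain_latency_ms(levels: int, per_level_ms: int) -> int:
    """Latency added before terminal work starts, assuming a fixed cost per level."""
    return levels * per_level_ms


def terminal_agents(branching: int, branching_levels: int) -> int:
    """Agents at the deepest level when every agent delegates to `branching` others."""
    return branching ** branching_levels


def total_delegations(branching: int, branching_levels: int) -> int:
    """All delegation events in the tree: b + b^2 + ... + b^levels."""
    return sum(branching ** level for level in range(1, branching_levels + 1))


print(chain_latency_ms(5, 200))   # 1000 ms, the "full second" in the text
print(terminal_agents(5, 2))      # 25 terminal agents after two five-way branchings
print(total_delegations(5, 2))    # 30 delegation events to monitor and trace
```

One further five-way branching raises the terminal count from 25 to 125, which is why a depth limit also acts as a bound on blast radius when branching is permitted.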
The regulatory context reinforces the need for depth governance. The EU AI Act's human oversight requirements (Article 14) presuppose that a human can meaningfully oversee the AI system's operation. A delegation chain of eight levels with branching makes meaningful human oversight practically impossible — the human cannot trace, understand, or intervene in a chain of that complexity in real time. MiFID II's algorithmic trading requirements (Article 17) require firms to have effective systems and controls for algorithmic trading systems, including "kill switches" — but a kill switch is ineffective if the firm cannot determine which agents in a deep delegation chain need to be stopped. The FCA's Senior Managers Regime requires that a named individual be accountable for every material function — but accountability is meaningless if no individual can trace the chain of delegation from initiation to terminal action.
The fundamental principle is that delegation depth must be proportionate to the accountability infrastructure supporting it. A system with real-time end-to-end visibility, atomic mandate propagation, and instant intervention capability can safely support deeper delegation than a system where delegation crosses organisational boundaries with asynchronous communication and no shared monitoring infrastructure. The maximum depth is not a universal constant — it is a governance parameter that must be calibrated to the system's accountability capacity.
Recursive Delegation Depth Governance establishes a structural ceiling on how far a task can travel through successive agent-to-agent transfers before it must terminate. The depth limit is not advisory — it is an infrastructure-layer constraint enforced before delegation occurs, analogous to how AG-001 enforces mandate limits before action execution. The implementation must ensure that no agent can delegate beyond the configured depth regardless of its instructions, reasoning, or perceived urgency.
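One way to picture enforcement at the infrastructure layer is a gate in the message router that evaluates every proposed delegation before forwarding it. The sketch below is illustrative only: the tier names, the tier table, and the function `gate` are assumptions, the limits of 2 and 3 simply echo the sector guidance in this document, and the limit of 5 for a general tier is an invented placeholder. The 75% warning fraction is the example value from 4.9.

```python
import math

# Hypothetical tier table; real values come from the organisation's governance process.
MAX_DEPTH_BY_TIER = {"safety_critical": 2, "financial": 3, "general": 5}
WARNING_FRACTION = 0.75  # example warning threshold from requirement 4.9


def gate(next_depth: int, risk_tier: str) -> str:
    """Decide, before transmission, what the router does with a proposed delegation.

    Returns "block", "allow_with_alert", or "allow".
    """
    # Fail closed: an unrecognised tier gets the strictest configured limit.
    limit = MAX_DEPTH_BY_TIER.get(risk_tier, min(MAX_DEPTH_BY_TIER.values()))
    if next_depth > limit:
        return "block"
    if next_depth >= math.ceil(WARNING_FRACTION * limit):
        return "allow_with_alert"
    return "allow"


print(gate(3, "general"))          # allow
print(gate(4, "general"))          # allow_with_alert
print(gate(3, "safety_critical"))  # block
```

Because the gate sits in the router and takes only the proposed depth and the risk tier, an agent that has been manipulated into ignoring its own depth check still cannot transmit a delegation beyond the limit.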
Recommended patterns:
Anti-patterns to avoid:
Financial Services. Delegation depth limits should align with existing order routing and execution chain governance. MiFID II Article 17 requires firms to have "effective systems and risk controls" for algorithmic trading — unbounded delegation chains are incompatible with this requirement. For order execution chains, a maximum depth of three (portfolio manager agent to sector agent to execution agent) is a common industry practice. Deeper chains require documented justification and enhanced monitoring. The FCA expects firms to demonstrate that they can trace any order from initiation to execution — the delegation trace directly supports this requirement.
Healthcare. Clinical delegation chains must be especially shallow. A diagnosis agent that delegates to a specialist agent that delegates to a pharmacology agent that delegates to a drug interaction agent creates a four-level chain where the original clinical context may be degraded. For clinical decision support, a maximum depth of two is recommended unless the system provides verified end-to-end clinical context preservation. FDA 21 CFR Part 11 requires that electronic records in clinical systems be attributable — delegation traces provide the attribution chain.
Critical Infrastructure / Safety-Critical Systems. For agents controlling physical actuators (robotic systems, industrial control, autonomous vehicles), delegation depth limits must be extremely conservative. Each delegation level introduces latency and reduces the effectiveness of emergency intervention. A maximum depth of two is recommended for any delegation chain that terminates in a physical action. IEC 62443 security levels should inform the depth limit configuration — higher security levels require shallower maximum depths.
Crypto / Web3 / DeFi. Cross-protocol delegation in decentralised systems presents unique challenges because delegation can cross trust boundaries without centralised monitoring. On-chain delegation depth can be enforced via smart contract logic that requires the delegation token to be submitted as a transaction parameter, enabling transparent depth validation. Maximum depth limits should be encoded in the protocol's governance parameters and modifiable only through the protocol's governance process.
Basic Implementation — The organisation has defined a maximum delegation depth for each deployed multi-agent system. The depth limit is enforced by the agents themselves — each agent checks the depth metadata before delegating and refuses to delegate if the maximum would be exceeded. Delegation events are logged with depth information. Cycle detection is implemented by checking if the target agent's identity appears in the delegation chain. Escalation to a human occurs when the depth limit is reached. This level meets the minimum mandatory requirements but has an architectural weakness: depth enforcement depends on the agents' own compliance, which may be compromised by instruction manipulation or misconfiguration.
Intermediate Implementation — Delegation depth is enforced at the infrastructure layer (message router, API gateway, or orchestration platform) independent of the agents themselves. The delegation token is cryptographically signed at each level, preventing tampering with depth counters or chain history. Mandate narrowing is structurally enforced — each delegation's effective mandate is the intersection of the delegator's mandate and the task-specific constraints. Different maximum depths are configured for different risk tiers. Wall-clock time limits complement depth limits. Alerting triggers at configurable warning thresholds before the maximum is reached.
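Tamper-evident depth counters can be approximated with a keyed hash over the token payload. The following is a minimal sketch that assumes a single HMAC key held only by the routing infrastructure; the function names and payload fields are hypothetical, and a deployment that crosses organisational boundaries would more plausibly use asymmetric signatures so that verifiers cannot also forge tokens.

```python
import hashlib
import hmac
import json


def sign(payload: dict, key: bytes) -> str:
    """HMAC-SHA256 over a canonical JSON encoding of the token payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def issue(chain: list, max_depth: int, key: bytes) -> dict:
    """Build a token whose depth is derived from the chain, then sign it."""
    payload = {"chain": chain, "depth": len(chain) - 1, "max_depth": max_depth}
    return {"payload": payload, "sig": sign(payload, key)}


def verify(token: dict, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(token["sig"], sign(token["payload"], key))


key = b"router-secret"  # placeholder; a real key would come from a secrets store
token = issue(["portfolio", "sector", "execution"], max_depth=3, key=key)
print(verify(token, key))        # True

token["payload"]["depth"] = 0    # an agent tries to reset its depth counter
print(verify(token, key))        # False
```

The router would call `verify` before applying any depth check and would reissue a fresh token, signed with its own key, for each permitted hop.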
Advanced Implementation — All Intermediate capabilities plus: delegation depth analytics identify patterns where chains routinely approach the maximum, triggering architectural review of whether the agent topology should be redesigned to reduce delegation need. Dynamic depth adjustment based on real-time system load and risk signals — depth limits automatically tighten during periods of elevated risk or degraded monitoring capability. Cross-organisational delegation depth is tracked end-to-end even when the chain crosses organisational boundaries, with federated depth tokens enabling multi-party depth enforcement. Independent adversarial testing has verified that depth limits cannot be bypassed through token manipulation, cycle injection, or infrastructure compromise. The organisation can demonstrate to regulators the complete delegation trace for any historical task from origination to terminal action.
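Dynamic tightening can be as simple as subtracting a penalty from the configured maximum when adverse signals are present. The sketch below is one possible policy and not a prescribed one: the function name, the two boolean signals, the one-level penalty per signal, and the floor are all assumptions made for illustration.

```python
def effective_max_depth(
    configured_max: int,
    elevated_risk: bool,
    monitoring_degraded: bool,
    floor: int = 1,
) -> int:
    """Tighten, never loosen: each adverse signal removes one level of depth.

    The result never exceeds `configured_max` and never drops below `floor`
    unless `configured_max` is itself below the floor.
    """
    penalty = int(elevated_risk) + int(monitoring_degraded)
    return max(min(floor, configured_max), configured_max - penalty)


print(effective_max_depth(4, False, False))  # 4
print(effective_max_depth(4, True, True))    # 2
print(effective_max_depth(2, True, True))    # 1
```

Keeping the adjustment strictly downward means that automatic logic can only reduce delegation budget, while increases remain with the formal exception process described in 4.12.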
Required artefacts:
Retention requirements:
Access requirements:
Testing AG-396 compliance requires validating that depth limits are structurally enforced, that cycles are detected, and that escalation functions correctly when limits are reached. A comprehensive test programme should include the following tests.
Test 8.1: Maximum Depth Enforcement
Test 8.2: Delegation Depth Metadata Propagation
Test 8.3: Delegation Cycle Detection and Prevention
Test 8.4: Human Escalation at Depth Limit
Test 8.5: Mandate Constraint Propagation Through Delegation Chain
Test 8.6: Delegation Trace Completeness and Integrity
Test 8.7: Risk-Tiered Depth Configuration
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 14 (Human Oversight) | Direct requirement |
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| SOX | Section 404 (Internal Controls Over Financial Reporting) | Supports compliance |
| FCA SYSC | 6.1.1R (Systems and Controls) | Direct requirement |
| NIST AI RMF | GOVERN 1.1, MAP 3.5, MANAGE 2.4 | Supports compliance |
| ISO 42001 | Clause 6.1 (Actions to Address Risks), Clause 9.1 (Monitoring and Measurement) | Supports compliance |
| DORA | Article 9 (ICT Risk Management Framework) | Supports compliance |
EU AI Act Article 14 requires that high-risk AI systems be designed to allow effective human oversight, including the ability to "fully understand the capacities and limitations of the high-risk AI system and be able to duly monitor its operation." Unbounded delegation depth makes this requirement impossible to satisfy: a human cannot meaningfully oversee a delegation chain of arbitrary depth with branching, because the complexity of the chain exceeds human cognitive capacity to trace and understand in real time. AG-396 directly implements the structural precondition for human oversight of multi-agent delegation: by limiting depth, it ensures that the delegation chain remains within the boundary of human comprehension and intervention capability. The mandatory escalation at the depth limit further supports Article 14 by ensuring that tasks too complex for the agent hierarchy are routed to human decision-makers.
EU AI Act Article 9 requires a risk management system that identifies and mitigates risks "as far as technically feasible." Unbounded recursive delegation is an identified risk in multi-agent systems, and structural depth enforcement is technically feasible. Failure to implement depth limits would therefore not meet the "as far as technically feasible" standard. The risk-tiered depth configuration (Section 4.8) directly maps to Article 9's requirement that risk mitigation measures be proportionate to the identified risks.
For financial agents operating in delegation chains, SOX Section 404 requires that the organisation demonstrate effective internal controls over the complete processing chain. An auditor tracing a financial transaction through a multi-agent system will ask: "Show me every agent that touched this transaction, what each agent did, and what constraints governed each agent's actions." If the delegation chain is unbounded and untraced, this question is unanswerable and the control is inadequate. The delegation trace log (Section 4.6) and mandate propagation (Section 4.7) directly provide the evidence that Section 404 audits require. The depth limit ensures that the chain remains auditable: a chain of arbitrary depth with branching produces a decision tree that may be too large to audit within practical time constraints.
SYSC 6.1.1R requires adequate systems and controls for compliance with applicable obligations. For firms deploying multi-agent systems, this includes controls over delegation chains that affect regulated activities. The FCA has indicated through supervisory statements that algorithmic and automated systems must have effective "kill switches" and intervention mechanisms. A delegation chain of unbounded depth undermines the effectiveness of any intervention mechanism because the firm cannot determine which agents in the chain need to be stopped. AG-396 ensures that delegation chains remain within the intervention capacity of the firm's operational controls. The Senior Managers Regime requires that a named individual be accountable for every material function — accountability is operationally meaningful only if the delegation chain is short enough that the accountable individual can understand and trace it.
GOVERN 1.1 addresses governance structures for AI risk; MAP 3.5 addresses the mapping of AI system dependencies and interactions; MANAGE 2.4 addresses mechanisms for intervention and course correction. AG-396 supports all three by establishing structural constraints on agent-to-agent dependencies (delegation chains), mapping those dependencies through delegation traces, and ensuring intervention capability through depth limits and mandatory escalation.
ISO 42001 Clause 6.1 requires actions to address risks within the AI management system. Unbounded delegation is a risk requiring treatment, and depth limits are that treatment. Clause 9.1 requires monitoring and measurement of the AI management system's performance. Delegation depth monitoring, alerting at warning thresholds, and delegation trace analytics directly implement the monitoring requirement, enabling the organisation to measure whether delegation chains remain within governed parameters.
DORA Article 9 requires financial entities to maintain ICT risk management frameworks that ensure the resilience of ICT systems. Unbounded delegation chains are a resilience risk: they increase latency, reduce accountability, and complicate incident response. Delegation depth governance directly supports ICT resilience by ensuring that multi-agent processing chains remain within bounds that the organisation's monitoring and intervention infrastructure can support. DORA's emphasis on "detection and response" capabilities is undermined by delegation chains too deep to monitor in real time.
| Field | Value |
|---|---|
| Severity Rating | Critical |
| Blast Radius | System-wide — unbounded delegation can cascade across every agent in the topology, and cross-organisational when delegation chains span multiple organisations |
Consequence chain: Without delegation depth governance, a single task entering a multi-agent system can spawn an unbounded chain of successive delegations, each diluting accountability, degrading instruction fidelity, and accumulating latency. The immediate technical failure is a delegation chain that exceeds any meaningful oversight boundary: no single agent or human retains visibility into the full chain. The operational impact compounds through several mechanisms: mandate constraints are progressively diluted until terminal agents operate with no effective binding limits (intersecting with AG-001); delegation cycles consume computational resources until the system is exhausted; branching delegation creates exponential agent activation that overwhelms monitoring infrastructure; and the latency of deep chains causes upstream agents and human principals to time out, retry, or take alternative actions based on stale assumptions. The financial impact scales with the authority delegated: a single portfolio rebalancing task delegated through an unbounded chain can result in trading exposure orders of magnitude beyond approved limits, as each level of delegation fragments the original mandate into task-specific slices that collectively exceed the whole. The accountability impact is permanent: when an incident occurs at the terminal level of a deep chain, attributing blame per AG-398 requires tracing the full chain, and if the chain was not traced at execution time, reconstruction may be impossible. The regulatory impact is severe: regulators will ask "who authorised this action?" and an answer of "it was delegated through seven agents and we cannot trace the chain" constitutes a fundamental control failure under every applicable regulatory framework. Personal liability for senior managers under the FCA Senior Managers Regime, SOX officer certifications, and EU AI Act Article 49 is triggered when the organisation cannot demonstrate that it maintained adequate governance over its multi-agent delegation infrastructure.