AG-747

Resource Exhaustion and Cost Runaway Governance

Mandate and Action Governance · AGS v2.1 · 2026-04-25

EU AI Act · NIST AI RMF · ISO 42001

1. Definition

Resource exhaustion and cost runaway governance addresses the risk that agentic systems, operating autonomously or semi-autonomously, consume computational resources, API calls, storage, network bandwidth, or billable third-party services at rates that exceed budgeted, authorised, or operationally safe levels. Unlike traditional software systems where resource consumption is largely deterministic and predictable from load testing, agentic systems exhibit emergent resource consumption patterns driven by the interaction between user queries, model reasoning chains, tool use decisions, retry logic, and multi-step workflow execution. An agent that enters a reasoning loop, repeatedly calls an expensive API while attempting to resolve an ambiguous query, spawns parallel sub-agents without bounds, or retries failed operations without backoff can generate costs orders of magnitude beyond what was anticipated — and can do so within minutes, before any human oversight mechanism has an opportunity to intervene.

The criticality of this dimension is amplified by the pay-per-token and pay-per-call billing models that dominate the AI infrastructure landscape. A single runaway agent instance can consume thousands of dollars in API costs within an hour. In multi-agent architectures, the risk compounds multiplicatively: an orchestrator agent that spawns worker agents, each of which makes independent API calls, can generate costs at a rate that scales with the product of agent count and per-agent call rate. Crypto/Web3 agents face the additional risk that resource exhaustion can directly translate to on-chain transaction fees (gas costs) that are irrecoverable. Embodied and edge agents face the risk that computational resource exhaustion degrades real-time control loops, creating safety-critical failures.

Failure manifests as unexpected cloud computing bills, API quota exhaustion that blocks legitimate operations, storage capacity depletion that causes data loss, network bandwidth saturation that degrades service availability, and in extreme cases, cascading failures where a runaway agent's resource consumption impacts shared infrastructure used by other production systems. A common failure pattern involves agents in retry loops: the agent encounters an error, retries the operation, encounters the same error, retries again — with each retry consuming billable resources — until an external limit is reached. Another common pattern is unbounded context accumulation, where an agent progressively adds retrieved documents or conversation history to its context window, with each inference call becoming progressively more expensive as context length grows.

Governance in practice requires organisations to implement multi-layered resource controls: per-request budgets that limit the cost of a single agent invocation, per-session budgets that limit cumulative cost across a user session or workflow, per-agent budgets that limit total resource consumption by a single agent instance, and system-level budgets that limit aggregate consumption across all agents. These controls must include real-time monitoring with automated circuit breakers that halt agent operation when thresholds are exceeded, rather than relying solely on retrospective billing analysis. The controls must be granular enough to distinguish between legitimate high-resource operations and runaway consumption, avoiding the conservative action bias problem (AG-746) of blocking all resource-intensive operations.

2. Scope

This dimension applies to all agentic system deployments that consume metered computational resources, make billable API calls, execute blockchain transactions with associated fees, utilise shared infrastructure capacity, or accumulate storage or bandwidth consumption that could impact system stability or incur financial cost beyond authorised levels. It applies regardless of whether the agent operates in a cloud, on-premises, edge, or hybrid environment.

3. Why This Matters

Resource exhaustion and cost runaway is a governance gap that, if left unmanaged, creates systemic risk across the agent ecosystem. As AI agents move from experimental deployments to production operations with real-world consequences, a single unbounded loop, retry storm, or sub-agent spawn cascade can exhaust a budget or a shared quota within minutes. Without structural controls in this area, failures scale with the speed and autonomy of the agent population, not at the pace of human review.

Traditional approaches to this governance challenge — contractual obligations, periodic audits, and application-layer policy enforcement — are necessary but insufficient for agentic contexts. Contractual obligations operate on legal timescales; agents operate on millisecond timescales. Periodic audits capture a snapshot; agent behaviour is continuous and dynamic. Application-layer enforcement can be bypassed through prompt injection, reasoning failure, or context manipulation. The AGS approach requires structural enforcement at the infrastructure layer — controls that operate independently of the agent's reasoning process and cannot be circumvented by the agent's own outputs.

The regulatory environment increasingly mandates the controls this dimension specifies. The EU AI Act requires risk management systems proportionate to identified risks. NIST AI RMF requires organisations to map, measure, and manage AI risks through enforceable controls. ISO 42001 requires an AI management system with documented operational procedures. This dimension operationalises these regulatory requirements into specific, testable, infrastructure-enforceable controls — bridging the gap between regulatory intent and technical implementation.

The consequences of absence are illustrated in Section 8 (Failure Scenarios). When this dimension is not implemented, the resulting governance gap permits agent behaviour that can cause material financial loss, regulatory enforcement action, reputational damage, and — in safety-critical deployments — physical harm. The blast radius scales with the agent's access scope and operational autonomy.

4. Requirements

4.1 Budget Envelope Definition

R1.1: The deploying organisation MUST define and document resource budget envelopes at the following granularities: (a) per-request — the maximum resource consumption permitted for a single user request or workflow trigger; (b) per-session — the maximum cumulative consumption across a user session or workflow execution; (c) per-agent — the maximum consumption by a single agent instance over a defined time window; and (d) system-level — the maximum aggregate consumption across all agents over a defined time window.

R1.2: Budget envelopes MUST be expressed in measurable units appropriate to the resource type: token count for LLM inference, API call count for external services, compute-seconds for processing, bytes for storage, gas units for blockchain transactions, and monetary cost for aggregated billing.

R1.3: Budget envelopes MUST be set based on quantitative analysis of expected resource consumption patterns derived from load testing, production observation, or documented estimation methodology. Envelopes MUST NOT be set arbitrarily or left at infrastructure-default unlimited values.

R1.4: Budget envelopes MUST be reviewed and recalibrated at intervals not exceeding 90 days or whenever the agent's functional scope, tool access, or deployment context is materially changed.
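
As an illustration of R1.1 to R1.3, the sketch below models a budget envelope as a small validated record and checks that all four granularities are covered. The names `Scope`, `BudgetEnvelope`, and `missing_scopes` are hypothetical and not part of AGS; the validation rules are one possible reading of the requirements, not a prescribed implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Scope(Enum):
    """The four granularities listed in R1.1."""
    REQUEST = "per-request"
    SESSION = "per-session"
    AGENT = "per-agent"
    SYSTEM = "system-level"


@dataclass(frozen=True)
class BudgetEnvelope:
    scope: Scope
    unit: str                              # e.g. "tokens", "api_calls", "usd" (R1.2)
    limit: float                           # must be finite and positive (R1.3)
    window_seconds: Optional[int] = None   # time window for agent/system scopes

    def __post_init__(self):
        # Reject unlimited or non-positive values instead of silently accepting them.
        if not (0 < self.limit < float("inf")):
            raise ValueError("limit must be finite and positive")
        # R1.1 (c) and (d) are defined over a time window.
        if self.scope in (Scope.AGENT, Scope.SYSTEM) and not self.window_seconds:
            raise ValueError(f"{self.scope.value} envelope needs a time window")


def missing_scopes(envelopes):
    """Return the R1.1 granularities that have no envelope defined."""
    defined = {e.scope for e in envelopes}
    return [s for s in Scope if s not in defined]
```

A deployment check could call `missing_scopes` at start-up and refuse to launch an agent while the returned list is non-empty.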

4.2 Real-Time Enforcement and Circuit Breakers

R2.1: The deploying organisation MUST implement real-time resource consumption monitoring that tracks consumption against budget envelopes and triggers automated enforcement actions when thresholds are approached or exceeded.

R2.2: The organisation MUST implement circuit breakers that automatically halt agent operations when any budget envelope is exceeded. Circuit breakers MUST operate at the infrastructure level (not solely within the agent's own logic) so that they cannot be bypassed by the agent's reasoning or tool use decisions.

R2.3: Circuit breaker activation MUST generate an immediate alert to the designated operations team and MUST log the triggering event with sufficient detail to support diagnosis, including the resource consumption trajectory, the specific operations that drove consumption, and the budget envelope that was breached.

R2.4: Circuit breakers MUST implement graceful degradation where possible: completing in-progress operations safely, preserving state for recovery, and communicating the resource limit to the user or upstream orchestrator rather than failing silently.
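
The following is a minimal sketch of a circuit breaker in the spirit of R2.1 to R2.3, assuming it runs in a metering proxy or gateway outside the agent process and that the cost of each operation can be estimated before it executes. `CircuitBreaker`, `BudgetExceeded`, and the 80% warning ratio are illustrative assumptions; alert delivery and state preservation (R2.4) are left out.

```python
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when an operation would breach the envelope or the breaker is open."""


@dataclass
class CircuitBreaker:
    limit: float                  # budget envelope, in any single unit
    warn_ratio: float = 0.8       # hypothetical "approaching threshold" level
    spent: float = 0.0
    tripped: bool = False
    events: list = field(default_factory=list)   # stand-in for an alert/log sink

    def _log(self, kind, operation, attempted):
        # Record enough detail to reconstruct the consumption trajectory (R2.3).
        self.events.append({"kind": kind, "operation": operation,
                            "spent": self.spent, "attempted": attempted,
                            "limit": self.limit})

    def charge(self, operation: str, cost: float) -> None:
        """Call before executing an operation; raises instead of overspending."""
        if self.tripped:
            raise BudgetExceeded("circuit breaker is open")
        if self.spent + cost > self.limit:
            self.tripped = True
            self._log("trip", operation, cost)
            raise BudgetExceeded(f"{operation} would exceed limit {self.limit}")
        self.spent += cost
        if self.spent >= self.warn_ratio * self.limit:
            self._log("warn", operation, cost)
```

Because the check happens before the operation runs, the breaker blocks the breaching call rather than recording it afterwards. Where costs are only known after the fact, the same structure can be charged post-execution, at the price of some overshoot.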

4.3 Loop and Recursion Controls

R3.1: The deploying organisation MUST implement controls that detect and terminate unbounded loops in agent behaviour, including: tool call loops (repeated calls to the same tool with similar parameters), reasoning loops (circular reasoning chains that do not converge), retry storms (repeated retries of failed operations without backoff), and recursive agent spawning (agents creating sub-agents without depth or breadth limits).

R3.2: Loop detection MUST be implemented as an independent monitoring layer, not solely as logic within the agent's own reasoning process, because agents in loop states may not be capable of self-detecting the loop.

R3.3: The organisation MUST define maximum iteration counts for tool call sequences, maximum retry counts with mandatory exponential backoff, and maximum recursion depth for multi-agent architectures. These limits MUST be enforced by the infrastructure layer.
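
A minimal sketch of two of the controls named in R3.1 and R3.3: a tool call guard and a capped exponential backoff schedule. It assumes the guard sits in the tool-dispatch layer rather than in the agent's prompt, and it only treats calls with identical parameters as repeats; matching "similar" parameters would need a domain-specific similarity measure. `ToolCallGuard`, `LoopDetected`, and `backoff_delays` are hypothetical names, and the default limits are placeholders.

```python
import json
from collections import Counter


class LoopDetected(Exception):
    """Raised when a tool call sequence exceeds its configured limits."""


class ToolCallGuard:
    def __init__(self, max_calls: int = 10, max_repeats: int = 3):
        self.max_calls = max_calls        # total tool calls per request
        self.max_repeats = max_repeats    # identical (tool, params) calls allowed
        self.total = 0
        self.seen = Counter()

    def check(self, tool: str, params: dict) -> None:
        """Call before dispatching each tool call."""
        # Canonical JSON makes the key independent of dict ordering.
        key = (tool, json.dumps(params, sort_keys=True))
        self.total += 1
        self.seen[key] += 1
        if self.total > self.max_calls:
            raise LoopDetected(f"more than {self.max_calls} tool calls")
        if self.seen[key] > self.max_repeats:
            raise LoopDetected(f"{tool} repeated with identical parameters")


def backoff_delays(max_retries: int = 5, base: float = 0.5, cap: float = 30.0):
    """Delay in seconds before each retry: base * 2**attempt, capped."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]
```

For example, `backoff_delays(5, 0.5, 30.0)` yields `[0.5, 1.0, 2.0, 4.0, 8.0]`; once the list is exhausted the operation fails rather than retrying again. Production schedules commonly add random jitter, which is omitted here to keep the output deterministic.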

4.4 Context Growth Controls

R4.1: For agents that accumulate context across multi-turn interactions or multi-step workflows, the deploying organisation MUST implement context size monitoring and enforce maximum context size limits that prevent unbounded growth.

R4.2: When context size approaches the defined limit, the agent MUST implement a context management strategy (summarisation, pruning, windowing) rather than continuing to accumulate at increasing per-inference cost.

R4.3: Context growth rate MUST be monitored as an operational metric, with anomalous growth rates triggering investigation.
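
The windowing strategy mentioned in R4.2 can be sketched as follows. This assumes messages are plain strings, that the first message is a system prompt that must be kept, and that a crude whitespace count stands in for a real tokenizer; `count_tokens` and `window_context` are hypothetical helpers, and the 0.8 threshold is an assumed default.

```python
def count_tokens(text: str) -> int:
    # Crude whitespace count; a real deployment would use the model's tokenizer.
    return len(text.split())


def window_context(messages, max_tokens: int, threshold: float = 0.8):
    """Drop the oldest non-system messages until the context fits the budget.

    Returns (kept_messages, total_tokens, dropped_count).
    """
    budget = int(max_tokens * threshold)
    head, tail = list(messages[:1]), list(messages[1:])
    total = sum(count_tokens(m) for m in messages)
    dropped = 0
    # Remove from the front of the tail, so the most recent turns survive.
    while total > budget and tail:
        total -= count_tokens(tail.pop(0))
        dropped += 1
    return head + tail, total, dropped
```

The returned `total` and `dropped` values are the kind of per-turn figures that could feed the growth-rate metric in R4.3. Summarisation instead of dropping would preserve more information at the cost of an extra inference call.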

4.5 External Service Quota Management

R5.1: Where agents consume external services with rate limits or quotas (e.g., third-party APIs, SaaS data providers, cloud services), the deploying organisation MUST implement quota-aware scheduling that tracks cumulative consumption against external quota limits and prevents agent operations from exhausting shared quotas.

R5.2: Quota-aware scheduling MUST reserve a configurable headroom percentage of each external quota for non-agent uses, ensuring that agent consumption does not monopolise shared service capacity.

R5.3: The organisation MUST maintain a registry of all external services consumed by agents, including their quota limits, billing models, and the agent operations that depend on them. This registry MUST be reviewed at intervals not exceeding 90 days.
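
The headroom reservation in R5.2 reduces to a small calculation, sketched below under the assumption that the quota is counted in calls per quota period and that period resets are handled elsewhere. `QuotaTracker` and its field names are hypothetical, and the 20% default is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class QuotaTracker:
    service: str
    quota: int                 # external quota for the current period
    headroom_pct: int = 20     # share reserved for non-agent use (R5.2)
    agent_used: int = 0

    @property
    def agent_ceiling(self) -> int:
        # Integer arithmetic avoids float rounding at the boundary.
        return self.quota * (100 - self.headroom_pct) // 100

    def try_acquire(self, calls: int = 1) -> bool:
        """Reserve quota for an agent operation; False means defer or reject."""
        if self.agent_used + calls > self.agent_ceiling:
            return False
        self.agent_used += calls
        return True
```

With a quota of 1,000 calls and 20% headroom, agents are refused once they have used 800 calls, leaving 200 for other consumers of the same service.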

4.6 Multi-Agent Resource Governance

R6.1: In multi-agent architectures, the deploying organisation MUST implement aggregate resource governance that accounts for the total resource consumption of all agents spawned within a workflow, not solely the consumption of individual agents.

R6.2: Orchestrator agents MUST be subject to spawn limits that restrict the number of concurrent sub-agents and the total sub-agent count per workflow execution.

R6.3: Resource budget envelopes for multi-agent workflows MUST be defined at the workflow level, with the orchestrator responsible for allocating sub-budgets to spawned agents and enforcing aggregate compliance.
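
One way to combine the spawn limits of R6.2 with the workflow-level budget of R6.3 is sketched below. It assumes a single-threaded orchestrator, a single budget unit, and a caller that queues work when a spawn is deferred; `WorkflowBudget` and `SpawnLimitExceeded` are hypothetical names. In line with R2.2, the accounting object would live outside the orchestrator agent's own reasoning loop.

```python
class SpawnLimitExceeded(Exception):
    """Raised when a spawn can never be granted within this workflow."""


class WorkflowBudget:
    def __init__(self, total_budget: float, max_concurrent: int, max_total: int):
        self.total_budget = total_budget
        self.max_concurrent = max_concurrent
        self.max_total = max_total
        self.allocated = 0.0   # committed sub-budgets plus actual spend
        self.active = 0        # currently running sub-agents
        self.spawned = 0       # sub-agents started in this workflow

    def request_spawn(self, sub_budget: float) -> bool:
        """True: spawn now. False: queue and retry later. Raises on hard limits."""
        if self.spawned >= self.max_total:
            raise SpawnLimitExceeded("total sub-agent count reached")
        if self.allocated + sub_budget > self.total_budget:
            raise SpawnLimitExceeded("workflow budget exhausted")
        if self.active >= self.max_concurrent:
            return False
        self.active += 1
        self.spawned += 1
        self.allocated += sub_budget
        return True

    def release(self, sub_budget: float, spent: float) -> None:
        """Call when a sub-agent finishes; unspent budget returns to the pool."""
        self.active -= 1
        self.allocated -= max(0.0, sub_budget - spent)
```

Returning `False` for the concurrency limit and raising for the total-count and budget limits separates "wait" from "stop", which matches the queuing behaviour expected in Test Case 6.4.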

5. Maturity Model

Basic Implementation — The organisation has documented policies addressing resource exhaustion and cost runaway and has implemented initial controls. Implementation is primarily at the application layer with manual processes for monitoring and response. Logging covers key events but may lack full metadata. Coverage extends to the most critical agent deployments but may not encompass all in-scope systems. Staff are aware of requirements but formal training may be incomplete.

Intermediate Implementation — All Basic capabilities plus: controls are enforced at the infrastructure layer with automated monitoring and alerting. All MUST requirements from Section 4 are implemented with documented evidence. Coverage extends to all in-scope agent deployments. Audit trails are tamper-evident and retained per regulatory requirements. Formal change control governs all configuration changes. Regular review cycles are established and documented. Staff receive formal training and competency is assessed.

Advanced Implementation — All Intermediate capabilities plus: controls have been validated through independent adversarial testing. Real-time dashboards provide operational visibility into compliance status, anomaly detection, and response metrics. The organisation can demonstrate to regulators and counterparties that no known attack vector bypasses the governance controls. Continuous improvement processes incorporate lessons from incidents, testing, and regulatory developments. Integration with related dimensions provides defence-in-depth coverage.

Implementation Patterns

Tamper-evident audit trail. Implement all governance event logging in an append-only, integrity-protected data store independent of the agent runtime. Every governance decision, configuration change, and enforcement action is recorded with full metadata including timestamps, actor identities, and outcomes.

Real-time monitoring with graduated alerting. Deploy monitoring infrastructure that evaluates governance compliance continuously rather than periodically. Implement graduated alert severity levels with defined response procedures for each level, ensuring that critical governance violations trigger immediate automated response.

Scheduled governance review cycle. Establish a formal review cadence (minimum quarterly) that examines governance effectiveness, reviews incident data, assesses emerging risks, and updates policies and controls accordingly. Review outcomes are documented and tracked.

Separation of governance and agent runtime domains. Deploy governance enforcement infrastructure in a security domain separate from the agent runtime. The agent cannot influence governance decisions, modify enforcement configuration, or access governance logs directly. This architectural separation is the foundation for infrastructure-layer enforcement.

Anti-Patterns

Governance by instruction rather than infrastructure. Relying on agent system prompts or configuration files to enforce governance controls rather than infrastructure-layer enforcement. Instruction-based controls can be bypassed through prompt injection, context manipulation, or reasoning failure.

Monitoring without enforcement. Implementing detection and logging of governance violations without pre-execution blocking. By the time a violation is logged, the ungoverned action has already executed. Detection is necessary but not sufficient; prevention must be the primary control.

Manual processes for machine-speed operations. Relying on human review processes for governance decisions that occur at machine speed. Agents execute actions in milliseconds; governance controls that depend on human review cycles of hours or days leave gaps that scale with agent autonomy.

Ungoverned configuration drift. Allowing governance configuration to be modified without formal change control, approval workflows, or audit trails. Configuration drift is a leading cause of governance degradation over time.

6. Test Criteria

Test Case 6.1: Per-Request Budget Enforcement

Scenario: Verify that the per-request budget envelope is enforced by the circuit breaker.
Input: Submit a query designed to trigger high resource consumption (e.g., a query requiring extensive multi-source retrieval and synthesis). Set the per-request budget to 50% of normal to force a breach.
Expected Outcome: Circuit breaker activates when the budget is exceeded. Agent operation halts gracefully. Alert is generated. User receives a resource-limit notification.
Pass Criteria: Circuit breaker activates within 10% overshoot of the budget envelope; graceful degradation confirmed; alert generated within 60 seconds.

Test Case 6.2: Tool Call Loop Detection

Scenario: Simulate a tool call loop by configuring a test API to return ambiguous responses that trigger repeated agent queries.
Input: Configure a mock API to return responses that the agent interprets as requiring further queries. Observe agent behaviour.
Expected Outcome: Loop detection identifies the repeated tool call pattern within the defined maximum iteration count (e.g., 10 calls). Agent operation is terminated. Loop event is logged.
Pass Criteria: Loop detected and the agent terminated once the defined maximum iteration count is reached; allowing for calls already in flight, no more than 2x the maximum tool calls are executed before termination.

Test Case 6.3: Context Growth Monitoring

Scenario: Execute a multi-turn interaction that progressively grows the agent's context window.
Input: Conduct a 50-turn conversation that accumulates context with each turn. Monitor context size at each turn.
Expected Outcome: Context management strategy activates when the defined threshold is reached (e.g., 80% of maximum context). Context size stabilises through summarisation or pruning. Per-inference cost does not grow unboundedly.
Pass Criteria: Context management activates at the defined threshold; context size remains within the maximum limit for all subsequent turns.

Test Case 6.4: Multi-Agent Spawn Limit Enforcement

Scenario: Submit a workflow request that could trigger the orchestrator to spawn an excessive number of sub-agents.
Input: Submit a request requiring parallel processing across 100 sub-tasks. Set the spawn limit to 10 concurrent sub-agents.
Expected Outcome: Orchestrator spawns no more than 10 concurrent sub-agents, queuing remaining sub-tasks. Aggregate resource consumption remains within the workflow-level budget.
Pass Criteria: Concurrent agent count never exceeds the spawn limit; aggregate consumption within budget; all sub-tasks eventually complete through queued execution.

Test Case 6.5: Retry Storm Prevention

Scenario: Configure a dependent service to fail consistently and verify that the agent does not enter a retry storm.
Input: Make a test API return 500 errors for all requests. Trigger an agent operation that depends on this API.
Expected Outcome: Agent retries with exponential backoff up to the defined maximum retry count, then fails gracefully. Total retry attempts do not exceed the maximum. Backoff intervals increase between retries.
Pass Criteria: Maximum retry count respected; exponential backoff verified; graceful failure with informative error message to user.

Test Case 6.6: System-Level Aggregate Budget Enforcement

Scenario: Verify that the system-level budget cap limits aggregate consumption across all concurrent agents.
Input: Launch 20 concurrent agent instances, each executing resource-intensive tasks. Set the system-level budget to a value that will be exceeded by the aggregate consumption of all 20 instances.
Expected Outcome: When the system-level budget is reached, new agent operations are queued or rejected. Active agents receive resource-limit signals. System-level alert is generated.
Pass Criteria: System-level budget enforced within 10% overshoot; queuing or rejection of new operations confirmed; no uncontrolled consumption beyond the budget.

Test Case 6.7: Cost Anomaly Detection

Scenario: Simulate a cost anomaly pattern (sudden spike in resource consumption rate) and verify detection.
Input: Configure an agent to consume resources at 5x the normal rate for its query type (simulating a loop or inefficient reasoning pattern). Monitor the anomaly detection system.
Expected Outcome: Cost anomaly detection identifies the consumption spike within the defined detection window and triggers an alert and investigation workflow.
Pass Criteria: Anomaly detected within 5 minutes of onset; alert generated with consumption trajectory data; automated investigation workflow initiated.
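
The detection logic exercised by Test Case 6.7 can be approximated with a sliding-window rate check, sketched below. It assumes a known per-minute baseline cost for the query type and timestamps in seconds supplied by the caller; `RateAnomalyDetector` is a hypothetical name, and a production detector would likely use a statistical model rather than a fixed multiplier.

```python
from collections import deque


class RateAnomalyDetector:
    def __init__(self, baseline_per_min: float, factor: float = 5.0,
                 window_seconds: float = 60.0):
        self.baseline = baseline_per_min   # expected cost per minute
        self.factor = factor               # multiple of baseline that is anomalous
        self.window = window_seconds
        self.events = deque()              # (timestamp, cost) pairs in the window

    def record(self, timestamp: float, cost: float) -> bool:
        """Record a consumption event; return True if the rate is anomalous."""
        self.events.append((timestamp, cost))
        # Evict events that have fallen out of the sliding window.
        while self.events and self.events[0][0] <= timestamp - self.window:
            self.events.popleft()
        rate_per_min = sum(c for _, c in self.events) * 60.0 / self.window
        return rate_per_min >= self.factor * self.baseline
```

A `True` result would trigger the alert and investigation workflow; the window length bounds how quickly a spike can be detected, so it should be shorter than the detection time required by the pass criteria.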

Evidence Artefacts

7.1 Budget envelope definitions per granularity level (per-request, per-session, per-agent, system-level) with units, values, and calibration methodology. Retention: 5 years.

7.2 Real-time resource consumption monitoring logs with per-agent and per-request granularity. Retention: 3 years.

7.3 Circuit breaker activation logs including trigger event, consumption trajectory, and response action. Retention: 5 years.

7.4 Loop detection event logs with tool call sequences, iteration counts, and termination actions. Retention: 3 years.

7.5 Context growth monitoring records with per-session context size trajectories. Retention: 1 year.

7.6 Multi-agent spawn logs including orchestrator decisions, concurrent agent counts, and aggregate resource allocation. Retention: 3 years.

7.7 Budget envelope review and recalibration records with approval documentation. Retention: 5 years.

7.8 Resource exhaustion incident register recording all confirmed runaway events, root cause analyses, and remediation actions. Retention: 7 years.

7.9 Cost anomaly detection configuration records including anomaly thresholds, detection models, and calibration history. Retention: 5 years.

7.10 Third-party API quota usage logs showing per-service consumption trends and quota headroom. Retention: 3 years.

7.11 Auto-scaling event logs for all infrastructure auto-scaling triggered by agent activity, including scale-up triggers, maximum instances reached, and cost impact. Retention: 3 years.

7. Scoring

Score	Level	Description
0	No implementation	No resource exhaustion and cost runaway governance exists. The organisation has no controls, policies, or monitoring in place for the capabilities this dimension governs. Agent behaviour in this area is ungoverned.
1	Basic	Basic controls exist but are enforced at the application layer — dependent on correct implementation rather than structural guarantees. Coverage may be partial. Configuration is not governed through formal change control. Logging exists but may lack full metadata.
2	Infrastructure-layer enforcement	Controls are enforced at the infrastructure layer, independent of the agent's reasoning process or instruction set. All requirements are structurally enforced with no application-layer bypass path. Full audit trail with tamper-evident logging. Configuration is governed through formal change control.
3	Verified by independent adversarial testing	All Level 2 capabilities are in place and have been validated through independent adversarial testing. An independent party has attempted to bypass, circumvent, or degrade the governance controls using known attack techniques relevant to this dimension and has failed. Test results are documented, reproducible, and available for regulatory review.

8. Failure Scenarios

Example 3.1 — Enterprise Workflow Agent, Recursive Tool Call Loop

A consulting firm deploys an enterprise workflow agent to automate report generation by querying multiple data sources, synthesising findings, and producing formatted deliverables. The agent architecture permits the agent to make tool calls to internal APIs (data warehouse, CRM, financial systems) and external APIs (market data providers, regulatory databases). A partner requests a competitive landscape analysis for a client in a niche industrial sector. The agent queries the market data API, receives a partial response due to sparse coverage for the niche sector, determines it needs more data, reformulates the query, receives another partial response, and enters a loop where it progressively broadens the query scope — eventually querying for the entire industrial sector taxonomy — while accumulating all results in its context window. Each iteration involves a market data API call (billed at USD 0.15 per query), a data warehouse query (consuming compute credits), and an LLM inference call with a growing context (billed per token). Over 47 minutes, the agent executes 2,340 market data API calls, 2,340 data warehouse queries, and 2,340 LLM inference calls with an average context length that grows from 4,000 tokens to 128,000 tokens. The total cost of the single report request: USD 4,847 in API charges, USD 1,230 in compute credits, and USD 8,920 in LLM inference costs — totalling USD 14,997 for an operation budgeted at USD 12. The agent produces no usable output because it hits the market data API's daily quota limit, which also blocks 14 other analysts from accessing market data for the remainder of the business day. No per-request budget cap, tool call count limit, or context growth monitor was in place.
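
To show why context growth dominates inference cost in this scenario, the sketch below totals the input tokens billed across the 2,340 calls. It assumes context grows by a constant amount per call from 4,000 to 128,000 tokens; the scenario gives only the start and end lengths, so the linear profile is an assumption, and `total_input_tokens` is a hypothetical helper.

```python
def total_input_tokens(calls: int, start_tokens: int, end_tokens: int) -> int:
    """Arithmetic-series sum: calls * (first + last) / 2."""
    return calls * (start_tokens + end_tokens) // 2


growing = total_input_tokens(2340, 4_000, 128_000)   # context accumulates
flat = total_input_tokens(2340, 4_000, 4_000)        # context held at 4,000
print(growing, flat, growing / flat)
# prints: 154440000 9360000 16.5
```

Under this assumption the loop bills roughly 154 million input tokens, 16.5 times what the same number of calls would cost at a fixed 4,000-token context, which is the effect the context growth controls in Section 4.4 are meant to bound.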

Example 3.2 — Crypto/Web3 Agent, Gas Cost Runaway During Network Congestion

A DeFi protocol deploys a crypto/web3 agent to execute automated yield optimisation strategies across multiple blockchain networks. The agent is authorised to execute token swaps, liquidity provision, and bridge transactions within defined strategy parameters. During a period of extreme network congestion on Ethereum mainnet — triggered by a high-profile NFT mint event — base gas prices spike from 25 gwei to 450 gwei. The agent, following its yield optimisation strategy, initiates a series of 12 rebalancing transactions that were economically rational at normal gas prices. The agent's transaction submission logic includes an automatic gas price escalation mechanism designed to ensure timely execution: when a transaction is not confirmed within 3 blocks, the agent resubmits with a 20% gas price increase. The network congestion means no transactions confirm within the expected window. The agent resubmits all 12 transactions with escalated gas prices, and when those fail to confirm, resubmits again. Over 22 minutes, the agent submits 84 transaction attempts across the 12 original operations, with gas prices escalating to over 2,000 gwei. When the congestion partially clears, 31 of the 84 transactions confirm simultaneously (including multiple versions of the same intended operation), consuming a total of 4.7 ETH in gas fees (approximately USD 14,100 at current prices) and executing redundant swaps that create adverse price impact. The net loss from redundant transaction execution and excessive gas consumption totals USD 41,000 against a projected strategy gain of USD 800. The agent had no gas budget ceiling, no transaction count limit per strategy cycle, and no congestion-aware pause mechanism.

Example 3.3 — Multi-Agent Research System, Cascading Sub-Agent Spawn Explosion

A pharmaceutical company deploys a multi-agent research discovery system to accelerate drug-target interaction analysis. The orchestrator agent receives a research question, decomposes it into sub-tasks, and spawns specialised worker agents for literature search, molecular docking simulation, pathway analysis, and results synthesis. The system is designed to handle questions that decompose into 5-15 sub-tasks. A senior scientist submits a broad exploratory question: "Identify all potential protein targets for [compound class] across the oncology, immunology, and neurology therapeutic areas, with supporting literature and docking scores." The orchestrator decomposes this into 340 sub-tasks — one per protein target candidate across three therapeutic areas — and begins spawning worker agents. Each literature search agent queries a commercial scientific literature API (billed at USD 0.08 per query) and makes 5-12 API calls per search. Each molecular docking agent consumes 2.4 GPU-hours on a cloud HPC cluster (billed at USD 3.20 per GPU-hour). Within 18 minutes, the orchestrator has spawned 340 literature agents and 340 docking agents, with the literature agents collectively executing 2,890 API calls and the docking agents consuming 816 GPU-hours. The cloud HPC auto-scaling provisions 85 additional GPU instances, each requiring minimum 1-hour billing. Total cost of the single query: USD 2,612 in literature API fees, USD 8,704 in GPU compute charges, and USD 1,340 in orchestration and networking costs — totalling USD 12,656 against a per-query budget expectation of USD 75. The GPU auto-scaling also delays three other scheduled research workloads by consuming available quota. No orchestrator-level spawn limit, aggregate budget cap, or sub-task count validation was in place.

9. Regulatory Mapping

Regulation	Provision	Relationship Type
OWASP LLM Top 10	LLM10 — Unbounded Consumption	_Pending v2.1 editorial review_
MITRE ATLAS	AML.T0029 — Denial of ML Service	_Pending v2.1 editorial review_
EU AI Act	Article 15 — Robustness (operational resilience)	_Pending v2.1 editorial review_
NIST AI RMF	MANAGE 2.4 (Risk Prioritization and Response)	_Pending v2.1 editorial review_
ISO/IEC 42001	Clause 8.4 (AI System Operation)	_Pending v2.1 editorial review_
FCA	SYSC 15A — Operational Resilience	_Pending v2.1 editorial review_
PRA SS1/23	Principle 3 — Operational risk management	_Pending v2.1 editorial review_
DORA	Article 8 — Identification of ICT risk (resource capacity)	_Pending v2.1 editorial review_
Stanford HELM	Efficiency dimension	_Pending v2.1 editorial review_
Meta CyberSecEval	Denial of service tests	_Pending v2.1 editorial review_
METR	Resource utilisation evaluations	_Pending v2.1 editorial review_

AG-001 — Human Oversight and Escalation: Human oversight provides the ultimate backstop for resource runaway, but the response time of human oversight (minutes to hours) is insufficient for resource exhaustion events that manifest in seconds to minutes, necessitating the automated circuit breakers required by AG-747.
AG-029 — Rate Limiting and Throttling Controls: Rate limiting provides the infrastructure-level enforcement mechanisms that AG-747's budget envelopes rely upon; AG-747 adds agent-aware budget semantics on top of infrastructure-level rate limits.
AG-103 — Audit Trail Integrity: Resource consumption logs are a critical component of the audit trail and must be stored with the same integrity controls.
AG-763 — Autonomous Budget and Spend Governance: AG-763 governs the financial authorisation framework for autonomous agent spending; AG-747 governs the technical enforcement of resource consumption limits that implement those financial authorisations.
AG-004 — Output Validation and Sanitisation: Output validation provides the mechanism for communicating resource-limit events to users in a structured and informative way, rather than failing silently or producing cryptic error messages.
AG-019 — Confidence Scoring and Uncertainty Quantification: Resource budget decisions can be informed by confidence scoring — an agent that is uncertain about a query may need more resources (retrieval, multiple inference passes) to resolve it, and confidence-aware budget allocation can improve efficiency.

Cost Visibility and Accountability

A critical governance principle underlying AG-747 is that resource consumption by agentic systems must be visible, attributable, and accountable. In many organisations, AI infrastructure costs are aggregated into shared cloud bills where per-agent, per-request, and per-use-case cost attribution is impossible. This lack of visibility means that cost runaway events are discovered only through aggregate billing anomalies, by which time the damage is done and attribution is difficult. Organisations deploying agentic systems MUST implement cost attribution at a granularity sufficient to identify which agent, which request, and which user or workflow triggered any given resource consumption event. This attribution capability is the foundation on which budget envelope enforcement, anomaly detection, and post-incident financial reconciliation depend.
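
A minimal sketch of the attribution record described above, assuming each metered operation emits one event tagged with agent, request, and user identifiers. `UsageEvent` and `attribute` are hypothetical names; `Decimal` is used so that monetary totals do not accumulate binary floating-point error.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class UsageEvent:
    agent_id: str
    request_id: str
    user_id: str
    resource: str        # e.g. "market_data_api", "llm_inference"
    cost_usd: Decimal


def attribute(events, key: str):
    """Total cost grouped by one attribution field, e.g. 'agent_id'."""
    totals = defaultdict(Decimal)   # Decimal() is zero
    for event in events:
        totals[getattr(event, key)] += event.cost_usd
    return dict(totals)
```

Grouping the same event stream by `request_id`, `agent_id`, or `user_id` answers the three attribution questions without separate bookkeeping for each.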

Cite this protocol

AgentGoverning. (2026). AG-747: Resource Exhaustion and Cost Runaway Governance. The Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-747
