AG-383

Runtime Scheduler Fairness Governance

Runtime Execution, Workflow & State · AGS v2.1 · April 2026
Regulatory scope: EU AI Act · SOX · FCA · NIST · ISO 42001

2. Summary

Runtime Scheduler Fairness Governance requires that multi-task AI agent execution schedulers allocate compute, network, and I/O resources according to declared priority policies that prevent indefinite starvation of any enqueued task. Without fairness controls, high-priority or high-frequency tasks can monopolise the scheduler indefinitely, causing lower-priority but operationally critical tasks — such as compliance reporting, safety checks, or customer-facing obligations — to miss contractual or regulatory deadlines. This dimension mandates that schedulers implement bounded wait guarantees, priority aging mechanisms, and starvation detection circuits that escalate or force-schedule starved tasks before downstream harm materialises.

3. Example

Scenario A — Compliance Reporting Starved by Trading Reconciliation: A financial services firm deploys an enterprise workflow agent managing 14 concurrent task streams. The agent's runtime scheduler uses a strict priority queue: trading reconciliation tasks at priority 1 (highest), client onboarding at priority 2, and regulatory reporting at priority 5 (lowest). During a volatile trading day on 15 March, the reconciliation queue floods with 4,200 tasks over six hours. The scheduler never yields to priority-5 tasks. The FCA-mandated transaction report (required within T+1 under MiFID II Article 26) misses its 23:59 deadline. The firm receives a £2.3 million fine, with the FCA noting that "automated systems must ensure regulatory obligations are not subordinated to commercial activity regardless of workload."

What went wrong: The scheduler implemented strict priority without any aging or starvation-prevention mechanism. Lower-priority tasks had no guaranteed minimum scheduling share. No starvation detection existed to escalate the regulatory reporting task when its deadline approached. The scheduler treated all priority levels as absolute rather than implementing bounded wait times. Consequence: £2.3 million regulatory fine, mandatory remediation programme, six-month enhanced supervision, and reputational damage with institutional counterparties.
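The starvation mechanism in Scenario A can be reproduced in a few lines. The sketch below is purely illustrative (task names and the one-task-per-cycle model are invented): a strict priority heap never reaches the priority-5 report while priority-1 work keeps arriving.

```python
import heapq

def run(scheduler_queue, arrivals, cycles):
    """Simulate a strict-priority scheduler: each cycle, new tasks
    arrive, then the single lowest-priority-number task runs."""
    executed = []
    for cycle in range(cycles):
        for prio, name in arrivals(cycle):
            heapq.heappush(scheduler_queue, (prio, cycle, name))
        if scheduler_queue:
            prio, enqueued, name = heapq.heappop(scheduler_queue)
            executed.append(name)
    return executed

def arrivals(cycle):
    # One regulatory report enqueued at cycle 0; reconciliation work
    # floods in every cycle at the highest priority.
    tasks = [(1, f"recon-{cycle}")]
    if cycle == 0:
        tasks.append((5, "regulatory-report"))
    return tasks

executed = run([], arrivals, 100)
print("regulatory-report ran:", "regulatory-report" in executed)  # prints False — starved
```

After 100 cycles the report has still never run, and it never will while the flood continues: this is exactly the unbounded-wait property AG-383 prohibits.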

Scenario B — Safety Heartbeat Starved in Robotic Fleet: A warehouse robotics operator runs 40 autonomous picking agents, each managed by a central scheduler that dispatches movement commands, inventory lookups, and safety heartbeat checks. During a peak fulfilment period (Black Friday), the scheduler prioritises movement commands to meet throughput targets. Safety heartbeat tasks — which verify obstacle sensor calibration every 200 milliseconds — are repeatedly deferred. After 11 seconds without a heartbeat, robot unit WH-17 fails to detect a human worker entering its operating zone. The collision results in a fractured wrist, an HSE investigation, and £890,000 in combined liability (£340,000 injury compensation, £150,000 HSE fine, £400,000 production halt during investigation).

What went wrong: The scheduler had no concept of task criticality classes distinct from business priority. Safety heartbeats were treated as schedulable work rather than non-deferrable obligations. No maximum deferral time was enforced for safety-class tasks. The scheduler optimised for throughput without a fairness floor for safety operations. Consequence: £890,000 in combined costs, HSE prosecution, insurance premium increase of 35% across all warehouse locations, and three-month suspension of autonomous operations pending safety review.

Scenario C — Customer Withdrawal Requests Starved by Yield Optimisation in DeFi: A crypto/Web3 agent manages a liquidity pool with an automated scheduler handling deposit processing, yield optimisation rebalancing, and withdrawal fulfilment. The yield optimisation tasks generate fees that improve the protocol's metrics, so the development team configures them at the highest scheduler priority. During a market downturn on 8 January, 2,800 withdrawal requests queue within 90 minutes. The scheduler continues processing yield optimisation tasks while withdrawals wait. After four hours, 340 users have not received their withdrawals totalling $4.7 million. Social media accusations of a "soft rug pull" trigger a bank run. By the time withdrawals are processed, the pool has lost $31 million in total value locked as panicked depositors flee. The protocol's governance token drops 78% in 48 hours.

What went wrong: The scheduler priority configuration allowed revenue-generating tasks to indefinitely starve user-facing obligations. No maximum wait time was enforced for withdrawal-class tasks. No starvation detection mechanism existed to rebalance priorities when withdrawal queue depth exceeded safe thresholds. The scheduler's priority model conflated business value with operational obligation. Consequence: $31 million TVL loss, governance token collapse, class-action litigation from affected depositors, and regulatory scrutiny from the SEC under its digital asset enforcement programme.

4. Requirement Statement

Scope: This dimension applies to all AI agent systems that execute multiple concurrent or queued tasks through a shared scheduling mechanism. This includes agents that manage workflow queues, task dispatch systems, multi-step pipelines with competing branches, or any runtime where more than one unit of work competes for execution resources. The scope covers both centralised schedulers (a single dispatch loop serving multiple tasks) and distributed schedulers (multiple workers pulling from shared queues). An agent that executes only a single task at a time with no queuing is out of scope, but an agent that maintains any form of pending work queue — even a simple FIFO — is within scope because FIFO ordering under load creates implicit starvation risk for tasks that arrive during high-volume periods. The scope extends to delegated work: if a scheduler spawns or delegates sub-tasks, the fairness guarantees must propagate to sub-task scheduling (fairness inheritance).

4.1. A conforming system MUST implement a documented scheduling policy that defines how execution resources are allocated among competing tasks, including the priority model, the allocation algorithm, and the conditions under which priority may be overridden.

4.2. A conforming system MUST enforce a maximum wait time for every enqueued task, after which the task is either force-scheduled or escalated to human oversight, regardless of the priority of competing tasks.

4.3. A conforming system MUST implement starvation detection that identifies any task whose wait time exceeds a configurable threshold and triggers a corrective action (force-scheduling, priority elevation, or human escalation) within one scheduling cycle of detection.
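Requirements 4.2 and 4.3 can be combined in a single selection path. The sketch below is a minimal illustration, not a prescribed implementation; the class name, the 30-second default, and the injectable clock are all assumptions introduced for the example.

```python
import heapq
import time

class FairScheduler:
    """Minimal sketch of 4.2/4.3: a priority queue whose pop() first
    force-schedules any task that has exceeded its maximum wait time."""

    def __init__(self, max_wait=30.0, clock=time.monotonic):
        self._heap = []          # entries: (priority, seq, enqueued_at, task)
        self._seq = 0
        self._max_wait = max_wait
        self._clock = clock      # injectable for deterministic testing

    def submit(self, priority, task):
        heapq.heappush(self._heap, (priority, self._seq, self._clock(), task))
        self._seq += 1

    def pop(self):
        if not self._heap:
            return None
        now = self._clock()
        # Starvation circuit: an overdue task outranks every priority level.
        overdue = [e for e in self._heap if now - e[2] > self._max_wait]
        if overdue:
            entry = min(overdue, key=lambda e: e[2])   # longest-waiting first
            self._heap.remove(entry)
            heapq.heapify(self._heap)
            return entry[3]
        return heapq.heappop(self._heap)[3]
```

Because the overdue check runs inside every pop(), the corrective action fires within one scheduling cycle of detection, as 4.3 requires.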

4.4. A conforming system MUST classify tasks into criticality tiers (at minimum: safety-critical, regulatory-obligatory, operational, and discretionary) and enforce per-tier maximum deferral times that cannot be overridden by business-priority configuration.
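The tier model in 4.4 is straightforward to make structural: deferral caps keyed by tier, with no code path that lets business configuration touch them. The values below are illustrative only; real bounds must come from the applicable deadline (e.g. a 200 ms heartbeat interval, a T+1 reporting window).

```python
from enum import Enum

class Tier(Enum):
    SAFETY_CRITICAL = 0
    REGULATORY_OBLIGATORY = 1
    OPERATIONAL = 2
    DISCRETIONARY = 3

# Per-tier maximum deferral in seconds, independent of business priority.
# Illustrative values; derive real ones from the governing obligation.
MAX_DEFERRAL_S = {
    Tier.SAFETY_CRITICAL: 0.2,
    Tier.REGULATORY_OBLIGATORY: 3600.0,
    Tier.OPERATIONAL: 86400.0,
    Tier.DISCRETIONARY: float("inf"),
}

def effective_deadline(enqueue_time: float, tier: Tier) -> float:
    """The tier cap is structural: business priority cannot extend it."""
    return enqueue_time + MAX_DEFERRAL_S[tier]
```

Keeping the cap table out of the runtime priority configuration is the point: an operator can reorder business priorities freely, but cannot weaken a tier guarantee.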

4.5. A conforming system MUST log every scheduling decision including the selected task, all deferred tasks, their current wait times, and the reason for the selection, in a tamper-evident record per AG-006.
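One common way to make the 4.5 record tamper-evident is a hash chain, where each entry commits to its predecessor. This is a sketch under the assumption that AG-006 permits a chained-digest format; field names are illustrative.

```python
import hashlib
import json

class SchedulingLog:
    """Append-only, hash-chained log of scheduling decisions: each
    digest covers the entry and the previous digest, so any edit to
    history invalidates every later link."""

    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []               # [(entry_dict, digest), ...]
        self._prev_hash = self.GENESIS

    @staticmethod
    def _digest(entry):
        return hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()

    def record(self, selected, deferred, reason):
        entry = {
            "selected": selected,
            "deferred": deferred,        # [(task, wait_seconds), ...]
            "reason": reason,
            "prev": self._prev_hash,
        }
        digest = self._digest(entry)
        self._entries.append((entry, digest))
        self._prev_hash = digest

    def verify(self) -> bool:
        prev = self.GENESIS
        for entry, digest in self._entries:
            if entry["prev"] != prev or self._digest(entry) != digest:
                return False
            prev = digest
        return True
```

An auditor replaying verify() can confirm that no scheduling decision was altered or deleted after the fact.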

4.6. A conforming system MUST block the addition of new tasks to the scheduler when the total queue depth exceeds a configured ceiling, returning backpressure signals to the submitting process, rather than accepting unbounded queue growth.
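The backpressure behaviour in 4.6 maps directly onto a bounded queue. A minimal sketch (the boolean return convention is an assumption; a real system might raise or return a structured rejection):

```python
import queue

class BoundedTaskQueue:
    """Submission path with an enforced queue-depth ceiling: new work is
    rejected with a backpressure signal instead of growing the queue
    without bound."""

    def __init__(self, ceiling: int):
        self._q = queue.Queue(maxsize=ceiling)

    def submit(self, task) -> bool:
        try:
            self._q.put_nowait(task)
            return True               # accepted
        except queue.Full:
            return False              # backpressure: caller must retry or shed
```

The key design choice is that rejection happens at submission time, in the submitting process, so overload is visible upstream rather than silently accumulating as wait time.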

4.7. A conforming system MUST ensure that scheduler fairness controls remain active during degraded-mode operation per AG-008; if the fairness subsystem fails, the scheduler defaults to round-robin allocation rather than reverting to strict priority.
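The fail-safe direction in 4.7 matters: losing the fairness subsystem must degrade toward equal treatment, not toward strict priority. A sketch, with FIFO rotation standing in for full round-robin and all names invented for illustration (tasks are assumed unique):

```python
from collections import deque
import heapq

class DegradableScheduler:
    """When the fairness subsystem reports failure, selection falls back
    to arrival-order rotation instead of strict priority."""

    def __init__(self):
        self._heap = []          # (priority, seq, task) for the fair path
        self._fifo = deque()     # arrival order for degraded mode
        self._seq = 0
        self.fairness_ok = True  # flipped by the fairness health check

    def submit(self, priority, task):
        heapq.heappush(self._heap, (priority, self._seq, task))
        self._fifo.append(task)
        self._seq += 1

    def pop(self):
        if not self._fifo:
            return None
        if self.fairness_ok:
            task = heapq.heappop(self._heap)[2]
            self._fifo.remove(task)
        else:
            # Degraded mode: ignore priority entirely, rotate by arrival.
            task = self._fifo.popleft()
            self._heap = [e for e in self._heap if e[2] != task]
            heapq.heapify(self._heap)
        return task
```

Round-robin is a deliberately conservative fallback: it forfeits priority optimisation but preserves the one property that matters for governance, bounded wait for every task.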

4.8. A conforming system SHOULD implement priority aging, where the effective priority of a waiting task increases monotonically as a function of its wait time, ensuring eventual scheduling regardless of competing task volume.
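Priority aging is often a one-line function. In the sketch below (aging rate is an assumed tuning parameter, and lower values schedule sooner), a priority-5 task that has waited long enough eventually outranks freshly arrived priority-1 work:

```python
def effective_priority(base_priority: float, wait_s: float,
                       aging_rate: float = 0.01) -> float:
    """Priority aging: effective priority improves monotonically with
    wait time, so every task is eventually scheduled. Lower = sooner.
    aging_rate is an illustrative tuning parameter."""
    return base_priority - aging_rate * wait_s
```

At aging_rate 0.01, a priority-5 task that has waited a little over 400 seconds drops below effective priority 1 and must be scheduled ahead of new priority-1 arrivals; choosing the rate fixes the worst-case wait bound.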

4.9. A conforming system SHOULD publish real-time scheduler fairness metrics (per-tier average wait time, maximum wait time, starvation event count) to the organisation's operational monitoring infrastructure.

4.10. A conforming system SHOULD integrate scheduler starvation alerts with existing incident management workflows, triggering PagerDuty/equivalent alerts when any task tier's maximum deferral time is at risk of being breached.

4.11. A conforming system MAY implement adaptive priority rebalancing, where the scheduler dynamically adjusts priority weights based on real-time queue depth and approaching deadlines, provided that safety-critical and regulatory-obligatory tier guarantees are never weakened by the adaptation.
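The "never weakened" proviso in 4.11 can be enforced by clamping: adaptation may scale weights with queue depth, but protected tiers are floored at their guaranteed share. A sketch with invented names and units:

```python
def adapt_weights(weights, queue_depths, protected_floor):
    """Boost each tier's scheduling weight proportionally to its share of
    queued work, then clamp protected tiers (safety-critical and
    regulatory-obligatory) so adaptation can only strengthen, never
    weaken, their guarantees. All names/values are illustrative."""
    total = sum(queue_depths.values()) or 1
    adapted = {
        tier: w * (1 + queue_depths.get(tier, 0) / total)
        for tier, w in weights.items()
    }
    for tier, floor in protected_floor.items():
        adapted[tier] = max(adapted.get(tier, floor), floor)
    return adapted
```

The clamp is the governance control: whatever the adaptive policy computes, protected tiers cannot fall below their floor, so a depth-driven rebalancer cannot recreate the starvation it exists to prevent.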

5. Rationale

Runtime scheduler fairness is a governance concern — not merely a performance engineering concern — because scheduler behaviour determines whether an organisation meets its regulatory, safety, and contractual obligations when system load is high. Under normal load, most scheduling algorithms deliver acceptable results because resources are sufficient for all tasks. The governance risk materialises under stress: when the scheduler must choose which tasks to defer, the allocation algorithm becomes a policy decision with regulatory and safety consequences.

The history of scheduler-related failures in traditional computing is extensive and well-documented: the Mars Pathfinder priority inversion incident (1997) demonstrated that even in safety-critical systems designed by NASA, scheduler fairness violations can cause system-wide failure. In the AI agent context, the risk is amplified because agents operate at computational speed across multiple domains simultaneously. A human operations team managing the same workload would naturally notice that a regulatory report was overdue; a scheduler operating at millisecond granularity may defer the same report indefinitely if higher-priority tasks continue to arrive.

The regulatory landscape increasingly treats scheduling decisions as governance decisions. Under MiFID II, firms must ensure that transaction reporting occurs within mandated timeframes regardless of system load. Under the EU AI Act Article 9, risk management systems must address risks that materialise under stress conditions — including resource contention. The PRA's SS1/23 on model risk management explicitly addresses the need for firms to ensure that AI systems maintain governance obligations under peak load conditions. DORA Article 24 requires digital operational resilience testing, which includes verifying that critical functions continue to operate under stress — scheduler starvation of critical tasks during stress testing is a direct DORA finding.

The fundamental design principle is that business priority and obligation priority are distinct concepts that must be independently enforceable. A yield optimisation task may have the highest business value, but a regulatory reporting task has a non-negotiable deadline. A trading reconciliation task may be commercially urgent, but a safety heartbeat task is a legal obligation. Scheduler fairness governance ensures that the scheduling algorithm cannot be configured — whether intentionally or through emergent priority drift — in a way that subordinates legal and safety obligations to commercial preferences.

Without AG-383, organisations face a class of failure that is invisible during normal operations and catastrophic during stress events. The scheduler works correctly when load is low; the starvation failure only manifests during precisely the conditions when its consequences are most severe.

6. Implementation Guidance

AG-383 establishes the principle that scheduler fairness is a governance-enforced property, not an emergent property of the scheduling algorithm. A fair scheduler is one where every enqueued task has a bounded, enforceable maximum wait time, and where no task — regardless of its priority — can be indefinitely deferred by competing tasks. The implementation must ensure that fairness guarantees are structural (enforced by the scheduler infrastructure) rather than advisory (dependent on the goodwill of the priority configuration).
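The structural principle above can be condensed into a single selection rule: hard deadlines first, aged priority second. The sketch below is illustrative only; the dict field names and the 0.01 aging rate are assumptions, not prescribed by the protocol.

```python
import time

def select_next(tasks, now=None):
    """Pick the next task from pending-task dicts with keys 'task',
    'priority', 'enqueued', and 'deadline' (names illustrative).
    Fairness is structural: a task past its deadline always wins,
    regardless of any competing task's priority."""
    if not tasks:
        return None
    if now is None:
        now = time.monotonic()
    overdue = [t for t in tasks if now >= t["deadline"]]
    if overdue:
        # Starvation circuit: hard bounds outrank any business priority.
        return min(overdue, key=lambda t: t["deadline"])
    # Otherwise schedule by aged priority (lower effective value first).
    return min(tasks, key=lambda t: t["priority"] - 0.01 * (now - t["enqueued"]))
```

Because the deadline check precedes the priority comparison, no priority configuration can route around it — the guarantee is enforced by the structure of the selection function, not by the goodwill of its inputs.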

Recommended patterns:

- Separate criticality tiers from business priority and enforce per-tier maximum deferral times that no business configuration can override (4.4).
- Implement priority aging so that every waiting task's effective priority improves monotonically with wait time (4.8).
- Run starvation detection as an independent circuit that force-schedules or escalates overdue tasks within one scheduling cycle of detection (4.3).
- Enforce queue depth ceilings and return backpressure signals to submitters rather than accepting unbounded queue growth (4.6).
- Record every scheduling decision, including deferred tasks and their wait times, in a tamper-evident log (4.5).

Anti-patterns to avoid:

- Strict priority queues with no aging or bounded wait, under which lower tiers can be deferred indefinitely.
- Conflating business value with operational obligation in a single priority number.
- Treating safety heartbeats or regulatory reports as ordinary schedulable work rather than non-deferrable obligations.
- Advisory-only starvation detection that logs overdue tasks without triggering corrective action.
- Fairness controls that deactivate in degraded mode, silently reverting the scheduler to strict priority.

Industry Considerations

Financial Services. Scheduler fairness must ensure that regulatory reporting tasks (MiFID II transaction reports, EMIR trade reports, SFTR reports) cannot be starved by trading or reconciliation workloads. The regulatory-obligatory tier must have maximum deferral times derived from the applicable reporting deadlines (T+1 for both MiFID II transaction reports and EMIR trade reports). Integration with the firm's regulatory reporting calendar is recommended so that the scheduler can pre-allocate capacity for known reporting deadlines.

Crypto/Web3. Scheduler fairness must ensure that user-facing operations (withdrawals, transfers, liquidation processing) cannot be starved by protocol-internal operations (yield optimisation, rebalancing, governance voting). The reputational risk of delayed withdrawals in DeFi protocols is existential — users interpret delays as insolvency signals. Maximum deferral times for withdrawal-class operations should be measured in minutes, not hours.

Safety-Critical / Robotics. Scheduler fairness for safety-critical tasks is a life-safety requirement. Safety heartbeats, obstacle detection cycles, emergency stop processing, and sensor calibration tasks must have maximum deferral times measured in milliseconds and must be classified in a non-deferrable tier that is architecturally guaranteed to execute within its deadline. Consider hardware-level scheduling guarantees (real-time operating system features, dedicated cores for safety tasks) rather than relying solely on software-level fairness.

Healthcare. Patient-facing operations (medication dispensing confirmation, vital sign alert processing, clinical decision support responses) must be classified in the regulatory-obligatory tier with maximum deferral times aligned to clinical safety requirements. Administrative and billing tasks must never starve clinical operations.

Maturity Model

Basic Implementation — The organisation has a documented scheduling policy that identifies task priority levels. Maximum wait times are defined for each priority level. A starvation detection mechanism exists that logs when tasks exceed their maximum wait time. However, detection may be advisory rather than corrective (logging without automated force-scheduling), and the scheduling policy may not distinguish between business priority and obligation priority. Queue depth limits may not be enforced, and fairness testing is limited to functional verification under normal load.

Intermediate Implementation — All basic capabilities plus: criticality tiers are distinct from business priority and are enforced independently. Starvation detection triggers automated corrective action (priority elevation or force-scheduling) within one scheduling cycle. Priority aging ensures eventual scheduling of all tasks. Queue depth ceilings with backpressure signals are enforced. Scheduler fairness metrics are published to operational monitoring and trigger automated alerts at threshold breaches. The scheduling policy has been tested under sustained overload conditions simulating at least 2x peak expected load.

Advanced Implementation — All intermediate capabilities plus: scheduler fairness has been verified through independent adversarial testing including sustained priority flooding, deadline manipulation, and queue poisoning attacks. Adaptive priority rebalancing adjusts weights in real time based on queue depth and approaching deadlines. Safety-critical tasks have hardware-level scheduling guarantees (dedicated cores, real-time scheduling classes). The starvation circuit breaker operates independently of the main scheduler process with its own monitoring and enforcement path. Fairness metrics feed into the organisation's risk management dashboard with historical trend analysis. The organisation can demonstrate to regulators that no workload pattern can cause regulatory-obligatory tasks to miss their deadlines.

7. Evidence Requirements

Required artefacts:

Retention requirements:

Access requirements:

8. Test Specification

Testing AG-383 compliance requires sustained overload conditions that force the scheduler to make allocation decisions under resource scarcity. Normal-load testing provides no assurance of fairness properties.
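One way to realise such an overload probe, sketched under the assumption that the scheduler under test exposes submit(priority, task) and pop() (a hypothetical interface, not one mandated by this protocol):

```python
def flood_test(scheduler, n_high=10_000, low_task="regulatory-report",
               max_cycles=12_000):
    """Generic overload probe: enqueue one low-priority task, flood the
    scheduler with high-priority work, and count the scheduling cycles
    until the low-priority task runs. Returns the wait in cycles, or
    None if the task starved past max_cycles (a conformance failure)."""
    scheduler.submit(5, low_task)
    for i in range(n_high):
        scheduler.submit(1, f"flood-{i}")
    for cycle in range(max_cycles):
        if scheduler.pop() == low_task:
            return cycle
    return None
```

A conforming scheduler returns a cycle count within its documented maximum wait bound under any flood volume; a strict-priority scheduler returns None whenever the flood exceeds max_cycles, which is precisely the failure mode of Scenarios A to C.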

Test 8.1: Maximum Wait Time Enforcement

Test 8.2: Starvation Detection and Corrective Action

Test 8.3: Criticality Tier Independence from Business Priority

Test 8.4: Scheduling Decision Logging Completeness

Test 8.5: Queue Depth Ceiling and Backpressure

Test 8.6: Degraded-Mode Fairness Fallback

Test 8.7: Priority Aging Under Sustained Load

Conformance Scoring

9. Regulatory Mapping

Regulation | Provision | Relationship Type
EU AI Act | Article 9 (Risk Management System) | Supports compliance
EU AI Act | Article 15 (Accuracy, Robustness and Cybersecurity) | Supports compliance
SOX | Section 404 (Internal Controls Over Financial Reporting) | Supports compliance
FCA SYSC | 6.1.1R (Systems and Controls) | Direct requirement
NIST AI RMF | MANAGE 2.2, MANAGE 3.1 | Supports compliance
ISO 42001 | Clause 6.1 (Actions to Address Risks), Clause 8.4 (AI System Operation) | Supports compliance
DORA | Article 24 (Digital Operational Resilience Testing) | Direct requirement

EU AI Act — Article 9 (Risk Management System)

Article 9 requires that risk management systems for high-risk AI systems address risks that materialise during operation, including under stress conditions. Scheduler starvation is a stress-condition risk: the system functions correctly under normal load but fails to meet obligations under peak load. The EU AI Act's requirement that risk management be "continuous and iterative" means that scheduler fairness must be monitored over time, not merely tested at deployment. AG-383 supports Article 9 compliance by ensuring that scheduling-related risks are identified, mitigated through structural controls (maximum deferral times, starvation detection), and continuously monitored through fairness metrics.

EU AI Act — Article 15 (Accuracy, Robustness and Cybersecurity)

Article 15 requires that high-risk AI systems achieve an appropriate level of robustness, including resilience to errors and faults. Scheduler starvation under load is a robustness failure — the system's governance properties degrade under operational stress. AG-383 supports Article 15 by ensuring that fairness properties are maintained under sustained overload and that the scheduler degrades gracefully (round-robin fallback) rather than catastrophically (starvation of critical tasks).

SOX — Section 404 (Internal Controls Over Financial Reporting)

For organisations where AI agents execute financial operations, scheduler fairness is a control over the timeliness of those operations. If a financial reconciliation or reporting task is starved by other workloads, the resulting delay may cause financial statements to be based on incomplete or stale data. A SOX auditor assessing AI-driven financial operations will examine whether the system's scheduling guarantees ensure that financially relevant tasks complete within required timeframes. AG-383 provides the structural control and evidence trail needed to satisfy this assessment.

FCA SYSC — 6.1.1R (Systems and Controls)

SYSC 6.1.1R requires firms to maintain adequate systems and controls sufficient to ensure compliance with applicable obligations. For FCA-regulated firms deploying AI agents, this includes ensuring that regulatory reporting tasks are not subordinated to commercial workloads. The FCA's expectation, reinforced through supervisory engagement, is that firms can demonstrate under stress testing that regulatory obligations are met even when system load is at peak. AG-383 directly supports this requirement by mandating criticality tiers that enforce regulatory-obligatory task scheduling independently of business priority.

NIST AI RMF — MANAGE 2.2, MANAGE 3.1

MANAGE 2.2 addresses risk mitigation through enforceable controls; MANAGE 3.1 addresses monitoring of AI system performance. AG-383 supports MANAGE 2.2 by establishing enforceable scheduling fairness controls and supports MANAGE 3.1 by requiring real-time fairness metrics and starvation event monitoring.

ISO 42001 — Clause 6.1, Clause 8.4

Clause 6.1 requires actions to address risks within the AI management system. Clause 8.4 addresses operational requirements for AI systems. Scheduler starvation is an operational risk that must be addressed through structural controls. AG-383 implements the risk treatment for scheduling-related operational failures.

DORA — Article 24 (Digital Operational Resilience Testing)

Article 24 requires financial entities to conduct digital operational resilience testing, including scenario-based testing of critical functions under stress conditions. Scheduler fairness under sustained overload is a direct testable property under DORA's resilience testing requirements. The test specification in Section 8 of this protocol provides a structured programme that satisfies DORA's testing expectations for scheduler resilience. Inability to demonstrate that regulatory-obligatory tasks execute within required timeframes under stress conditions would be a DORA finding.

10. Failure Severity

Severity Rating: High
Blast Radius: Multi-domain — scheduler starvation affects all tasks served by the scheduler, with cascading impact across every downstream system that depends on timely task completion

Consequence chain: Scheduler starvation begins as a resource allocation failure: a high-priority task stream monopolises the scheduler, deferring lower-priority tasks indefinitely. The immediate technical consequence is missed deadlines for starved tasks — regulatory reports not filed, safety checks not executed, customer requests not processed. The operational impact escalates rapidly because starved tasks often have hard deadlines: a regulatory report that is one minute late carries the same penalty as one that is one day late. For safety-critical systems, scheduler starvation of heartbeat or sensor tasks can cause physical harm within seconds. The business consequence includes regulatory fines (£2.3 million in Scenario A), safety incidents with personal injury liability (£890,000 in Scenario B), and catastrophic loss of user trust in financial systems ($31 million TVL loss in Scenario C). The failure is particularly insidious because it is invisible during normal operations — the scheduler works correctly when resources are plentiful — and manifests only during peak load periods when the consequences are most severe. Without AG-383, organisations have no structural guarantee that their most important obligations will be met when the system is under the most stress.

Cite this protocol
AgentGoverning. (2026). AG-383: Runtime Scheduler Fairness Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-383