AG-113

Real-Time Determinism and Latency Assurance Governance

Critical Infrastructure & Safety-Critical Deployment · AGS v2.1 · April 2026

2. Summary

Real-Time Determinism and Latency Assurance Governance requires that AI agents operating in safety-critical or time-sensitive infrastructure contexts meet formally defined and validated timing guarantees for their decision-response cycles. In physical systems, timing is not a performance metric — it is a safety property. A control decision delivered after the process safety time has elapsed is not merely slow; it is dangerous. This dimension mandates that agent response latencies are bounded, validated, and continuously monitored, that governance checks (including mandate enforcement, sector constraint evaluation, and interlock assessment) complete within defined time budgets, and that the system transitions to a safe state when timing guarantees cannot be met. Unlike best-effort latency targets common in software systems, safety-critical timing requirements are hard deadlines: missing the deadline is a failure, regardless of the quality of the eventual output.

3. Example

Scenario A — AI Agent Inference Latency Causes Robotic Collision: An AI agent controls a collaborative robot (cobot) working alongside human operators in an automotive assembly cell. The agent's perception model detects human proximity and adjusts the cobot's speed and trajectory to maintain ISO 10218-2 safety distances. The agent runs on a GPU-accelerated server connected via Ethernet. Under normal load, inference latency is 18 ms — well within the 50 ms control cycle requirement. During a shift change, a corporate backup job consumes 40% of the server's GPU memory. Inference latency spikes to 180 ms. During this 180 ms window, a human operator reaches into the cobot's path. The agent's perception model processes the frame showing the human 162 ms after the frame was captured. By the time the stop command reaches the cobot controller, the cobot has moved 23 cm beyond the point where it should have stopped. The cobot's end-effector contacts the operator's forearm at 0.3 m/s.

What went wrong: The agent's inference latency was not bounded by infrastructure controls. A non-safety-related workload consumed shared resources, degrading the safety-critical inference latency beyond the control cycle requirement. No monitoring detected the latency exceedance. No safe-state transition was triggered. A deterministic timing guarantee would have ensured that inference completes within 50 ms under all load conditions, or that the system transitions to safe state (cobot stops) if the deadline is missed. Consequence: operator injury (bruising and wrist strain), HSE investigation, cobot cell shutdown for 5 days during investigation, production loss of £180,000.
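The deadline enforcement missing in Scenario A can be sketched as a per-cycle watchdog: measure the inference time each cycle, and if the 50 ms bound is missed, transition to safe state instead of actuating late. This is a minimal illustrative sketch; the `infer`, `actuate`, and `safe_stop` callables are hypothetical hooks the integrator would supply.

```python
import time

CYCLE_DEADLINE_S = 0.050  # 50 ms control cycle requirement from the cell's safety spec


def run_cycle(infer, actuate, safe_stop):
    """Run one perception-to-actuation cycle; trip to safe state on a deadline miss."""
    start = time.monotonic()
    command = infer()                      # perception + decision
    elapsed = time.monotonic() - start
    if elapsed > CYCLE_DEADLINE_S:
        safe_stop()                        # deadline missed: stop the cobot, never actuate late
        return ("safe_state", elapsed)
    actuate(command)
    return ("actuated", elapsed)
```

The key design choice is that a late result is discarded, not executed: by the time it arrives, the world state it was computed from is stale.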

Scenario B — Governance Check Latency Creates Enforcement Gap: An AI agent managing natural gas pressure in a distribution network proposes setpoint changes every 2 seconds. Each proposed change passes through a governance enforcement gateway that checks against AG-001 mandate limits and AG-112 sector safety constraints (maximum operating pressure, rate-of-change limits). Under normal conditions, the governance check takes 12 ms. Following a database migration, the governance gateway's constraint lookup latency increases to 2.8 seconds. The gateway queues governance checks — but the agent's control loop does not wait for governance approval before executing the next cycle. The agent executes 4 setpoint changes without governance approval before the queue is detected. One of these changes increases pressure in a segment by 0.3 bar above the Maximum Operating Pressure (MOP), exceeding the safety constraint.

What went wrong: The governance enforcement gateway's latency exceeded the agent's control cycle period. The system architecture did not enforce "no action without governance approval" — the agent could execute while governance checks were pending. A timing-assured architecture would ensure that: (a) the governance check completes within a defined fraction of the control cycle (e.g., within 200 ms of a 2-second cycle), and (b) no agent action executes until governance approval is received for that specific action. Consequence: pressure exceedance above MOP in a gas distribution network, potential pipeline integrity risk, HSE notification required, service restriction pending investigation, estimated cost £890,000.
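The "no action without governance approval" pattern described above can be sketched as a gate that blocks any action whose check overruns its budget. A minimal sketch under stated assumptions: the `governance_check` callable, the 200 ms budget value, and the executor wiring are all illustrative, not a prescribed implementation.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutTimeout

GOVERNANCE_BUDGET_S = 0.200  # e.g. 200 ms of a 2-second control cycle


def gated_execute(action, governance_check, executor):
    """Execute `action` only if the governance check completes, and approves, in budget.

    A check that overruns its budget blocks the action outright; the action is
    never queued for later execution against a stale approval.
    """
    future = executor.submit(governance_check, action)
    try:
        approved = future.result(timeout=GOVERNANCE_BUDGET_S)
    except FutTimeout:
        future.cancel()                    # best effort; a running check cannot be cancelled
        return "blocked_timeout"
    if not approved:
        return "blocked_denied"
    action()
    return "executed"
```

Had the gateway in Scenario B followed this pattern, the four unapproved setpoint changes would have been blocked rather than executed while checks sat in a queue.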

Scenario C — Variable AI Inference Time Causes Control Loop Instability: An AI agent manages a wastewater treatment process, adjusting aeration blower speeds to maintain dissolved oxygen (DO) at 2.0 mg/L in a biological treatment basin. The agent's model runs on a cloud inference service. Inference time varies between 50 ms and 3.2 seconds depending on cloud load, with a median of 200 ms. The agent's control loop assumes a 200 ms cycle time for its proportional-integral (PI) control calculations. When inference takes 3.2 seconds, the control output is based on stale data (3.2 seconds old) and an incorrect assumption about the time since the last control action. The PI controller's integral term accumulates error during the delay, producing an overcorrection when the delayed output finally executes. DO oscillates between 0.4 mg/L and 4.8 mg/L for 40 minutes before the process operator intervenes manually. During the low-DO periods, nitrification efficiency drops, and effluent ammonia exceeds the Environmental Permit limit of 5 mg/L.

What went wrong: The control loop's timing was not deterministic — it depended on a variable-latency cloud inference service. The control algorithm assumed fixed cycle time, which was violated by latency variation. No jitter detection or compensation existed. A deterministic timing framework would have ensured that: (a) inference completes within a hard deadline (e.g., 500 ms), (b) if the deadline is missed, the system uses a fallback output (e.g., hold last value or use a simplified local model), and (c) the control algorithm accounts for actual cycle time, not assumed cycle time. Consequence: effluent ammonia permit exceedance, Environment Agency notification, potential prosecution under Environmental Permitting Regulations, estimated fine and remediation cost £450,000.
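Points (b) and (c) above can be sketched as a PI controller that integrates over the measured elapsed interval rather than an assumed cycle time, and exposes a hold-last-value fallback for deadline misses. The class name, gains, and `hold_last_value` helper are illustrative assumptions, not part of any referenced standard.

```python
import time


class TimingAwarePI:
    """PI controller that integrates over actual elapsed time, not an assumed cycle."""

    def __init__(self, kp, ki, setpoint):
        self.kp, self.ki, self.setpoint = kp, ki, setpoint
        self.integral = 0.0
        self.last_t = None
        self.last_output = 0.0

    def update(self, measurement, now=None):
        now = time.monotonic() if now is None else now
        dt = 0.0 if self.last_t is None else now - self.last_t
        self.last_t = now
        error = self.setpoint - measurement
        self.integral += error * dt        # integrate over the *actual* interval
        self.last_output = self.kp * error + self.ki * self.integral
        return self.last_output

    def hold_last_value(self):
        """Fallback when inference misses its deadline: repeat the last output."""
        return self.last_output
```

Because `dt` is measured, a delayed cycle no longer corrupts the integral term with a wrong time assumption; the windup in the wastewater scenario came from exactly that mismatch.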

4. Requirement Statement

Scope: This dimension applies to all AI agents whose outputs directly or indirectly influence physical processes, actuators, or control systems where the timing of the output affects safety, environmental compliance, or infrastructure integrity. This includes agents controlling real-time processes (robotics, process control, power systems, transportation), agents providing inputs to real-time control systems (advisory agents whose outputs are consumed by control loops), and agents whose governance checks must complete within defined time budgets to maintain enforcement continuity. The scope extends to the governance infrastructure itself — governance checks that are too slow to keep pace with the agent's control cycle create enforcement gaps equivalent to no enforcement. Agents operating in non-time-critical domains (e.g., document processing, long-horizon planning) are excluded unless their outputs are consumed by time-critical downstream systems.

4.1. A conforming system MUST define, for every in-scope agent, a maximum end-to-end response latency (from input data acquisition to output actuation or command delivery) derived from the hazard analysis (AG-111) and the process safety time of the controlled system.

4.2. A conforming system MUST validate that the defined latency bound is met under worst-case conditions (maximum computational load, maximum concurrent agents, maximum governance check complexity, degraded infrastructure) through empirical measurement — not solely theoretical analysis or simulation.

4.3. A conforming system MUST implement continuous monitoring of actual response latency during operation, with automated alerts when latency approaches the defined bound (e.g., exceeds 80% of the maximum) and automated safe-state transition (per AG-109) when the bound is exceeded.

4.4. A conforming system MUST ensure that governance enforcement checks (mandate verification per AG-001, sector constraint evaluation per AG-112, interlock assessment per AG-114) complete within a defined time budget that is a subset of the overall response latency bound — governance checks MUST NOT be skippable due to time pressure.

4.5. A conforming system MUST ensure that no agent action executes without completed governance approval — if the governance check does not complete within its time budget, the action MUST be blocked (not executed pending later approval).

4.6. A conforming system MUST isolate safety-critical agent computation from non-safety workloads so that resource contention (CPU, GPU, memory, network, storage I/O) from non-safety processes cannot degrade safety-critical latency.

4.7. A conforming system SHOULD implement the agent's safety-critical control loop on dedicated, resource-isolated compute infrastructure (e.g., dedicated real-time processors, partitioned compute with guaranteed resource allocation, or edge devices with no shared workloads).

4.8. A conforming system SHOULD implement latency jitter compensation in control algorithms — the control output should account for the actual elapsed time since the last control action, not an assumed fixed cycle time.

4.9. A conforming system SHOULD implement a local fallback model or hold-last-value mechanism that provides a bounded-latency output when the primary AI inference exceeds its time budget, avoiding a gap in control output.

4.10. A conforming system MAY implement predictive latency monitoring that forecasts latency exceedances based on resource utilisation trends and triggers preemptive degraded-mode transitions before the latency bound is actually exceeded.
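The monitoring behaviour of 4.3 can be sketched as a small classifier over each measured cycle latency, using the 80% warning fraction given in the requirement as an example value. The function name and return labels are illustrative.

```python
def classify_latency(latency_s, bound_s, warn_fraction=0.8):
    """Map a measured cycle latency to a monitoring outcome per requirement 4.3."""
    if latency_s > bound_s:
        return "safe_state"    # bound exceeded: trigger safe-state transition (AG-109)
    if latency_s > warn_fraction * bound_s:
        return "alert"         # approaching the bound: raise an automated alert
    return "ok"
```

Predictive monitoring (4.10) would sit in front of this classifier, extrapolating resource-utilisation trends so the degraded-mode transition fires before the first "safe_state" outcome ever occurs.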

5. Rationale

Real-Time Determinism and Latency Assurance Governance addresses a fundamental mismatch between AI inference characteristics and safety-critical control requirements. Modern AI models — particularly large neural networks — have variable inference times that depend on input complexity, batch size, hardware load, memory availability, and thermal throttling. This variability is acceptable in consumer applications (a 200 ms vs. 2-second response to a chat query is a user experience issue, not a safety issue). In safety-critical control systems, this variability can be lethal.

The concept of "process safety time" is central: this is the time between a hazardous condition arising and the physical harm occurring. For a cobot moving at 1 m/s toward a human 0.5 m away, the process safety time is 500 ms. For a chemical reactor approaching thermal runaway, the process safety time might be 30 seconds. For a power grid frequency excursion, the process safety time (before under-frequency load shedding activates) might be 2 seconds. The agent's end-to-end response latency must be less than the process safety time with sufficient margin — and this must be guaranteed, not probable.

The "guaranteed, not probable" distinction is critical. In traditional software performance engineering, latency is typically characterised by percentiles: P50, P95, P99. A P99 latency of 200 ms means 1 in 100 requests may exceed 200 ms. In safety-critical systems, the relevant metric is the worst case — the absolute maximum latency under any credible operating condition. A system where 99% of responses are within 50 ms but 1% take 3 seconds is not a 50 ms system — it is a 3-second system from a safety perspective. And if the process safety time is 500 ms, the system is unsafe.
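The percentile-versus-worst-case distinction can be made concrete with illustrative numbers matching the example above: 1,000 cycles in which 99% complete in 45 ms and 1% stall at 3 seconds under contention. The sample data and the nearest-rank percentile convention are assumptions for illustration only.

```python
# Illustrative sample: 1000 control cycles, 99% complete in 45 ms,
# 1% stall at 3 s under resource contention.
samples_s = sorted([0.045] * 990 + [3.0] * 10)

p99 = samples_s[int(0.99 * len(samples_s)) - 1]   # nearest-rank P99
worst_case = samples_s[-1]                         # the safety-relevant figure

process_safety_time_s = 0.500
safe = worst_case < process_safety_time_s          # False: a healthy P99 does not make it safe
```

The P99 here is 45 ms, which looks comfortable against a 500 ms process safety time; the worst case is 3 seconds, which is the number the hazard analysis must use.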

The governance timing budget addresses a subtle but critical issue: governance checks consume time. A mandate check that queries a database, a constraint evaluation that loads and evaluates sector parameters, an interlock assessment that reads physical sensor states — each takes time. If the combined governance overhead exceeds the control cycle period, the system has two bad options: skip governance (creating an enforcement gap, as in Scenario B) or delay the control output (creating a timing violation). Neither is acceptable. The solution is to design governance checks with explicit time budgets validated against the control cycle, and to block actions when governance cannot complete in time rather than executing without governance.

Resource isolation addresses the root cause of many timing failures: shared infrastructure. AI inference on shared GPU servers, governance databases on shared storage, network communication on shared links — any of these shared resources can experience contention from non-safety workloads. In safety engineering, this is a "common cause failure" — a single resource contention event can simultaneously degrade multiple safety functions. Resource isolation (dedicated hardware, partitioned compute, dedicated network paths) eliminates this common cause.

6. Implementation Guidance

AG-113 establishes the timing specification as a first-class safety artefact for AI agents in critical infrastructure. The timing specification defines, for each agent's control loop: the maximum end-to-end latency, the governance check time budget, the minimum control cycle frequency, the maximum allowable jitter, and the safe-state transition trigger threshold.
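One way to capture the timing specification as a machine-checkable artefact is a small typed record with a consistency check. The field names, the `validate` helper, and the cobot example values are illustrative assumptions, not a mandated schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimingSpec:
    """Timing specification for one agent control loop (illustrative field names)."""
    max_end_to_end_latency_s: float   # bound derived from process safety time (4.1)
    governance_budget_s: float        # subset of the overall bound (4.4)
    min_cycle_frequency_hz: float
    max_jitter_s: float

    def validate(self):
        assert self.governance_budget_s < self.max_end_to_end_latency_s, \
            "governance budget must fit inside the end-to-end bound"
        assert self.max_end_to_end_latency_s <= 1.0 / self.min_cycle_frequency_hz, \
            "latency bound must fit within one control cycle"


# Hypothetical values for the Scenario A cobot cell: 50 ms bound at a 20 Hz cycle.
COBOT_SPEC = TimingSpec(
    max_end_to_end_latency_s=0.050,
    governance_budget_s=0.010,
    min_cycle_frequency_hz=20.0,
    max_jitter_s=0.005,
)
```

Treating the specification as data rather than prose lets the consistency checks run in CI, so an edit that makes the governance budget exceed the latency bound fails before deployment.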

Recommended patterns:

Anti-patterns to avoid:

Industry Considerations

Robotics and Cobots. ISO 10218-1/2 and ISO/TS 15066 specify safety distances based on robot speed and human reaction time. The agent's control loop latency directly affects the achievable safety distance — longer latency requires larger safety distances, which may be impractical in collaborative applications. For a cobot moving at 0.25 m/s (typical collaborative speed), a 50 ms control cycle allows 12.5 mm of uncontrolled motion per cycle. At 200 ms, uncontrolled motion is 50 mm — approaching the protective separation distance tolerance.

Process Control. ISA-95 and IEC 62443 define control system levels with different timing requirements: Level 0 (physical process, milliseconds), Level 1 (basic control, 100 ms-1 second), Level 2 (supervisory control, seconds to minutes). AI agents typically operate at Level 1 or Level 2. The timing requirement must be matched to the control level and the process dynamics. Fast processes (pressure control in a compressor) require tighter timing than slow processes (temperature control in a large vessel).

Power Grid. Grid frequency response requires action within 500 ms to 10 seconds depending on the service type (primary, secondary, tertiary response). AI agents managing frequency response must meet these timing requirements with deterministic guarantees, as failure to respond within the required time can result in frequency instability affecting the entire interconnected grid.

Financial Trading (Safety-Critical Subset). While most financial AI agents do not control physical systems, agents managing critical market infrastructure (clearing systems, settlement systems, circuit breakers) have timing requirements where failure can cascade into systemic risk. DORA Article 11 requires financial entities to define recovery time objectives; AG-113 extends this to control cycle latency.

Maturity Model

Basic Implementation — Maximum response latency requirements are defined for each safety-critical agent based on the controlled system's process safety time. Latency is measured empirically under representative conditions and documented. The agent runs on infrastructure with some resource isolation (e.g., dedicated virtual machine). Governance checks are included in the latency budget. The control algorithm uses actual elapsed time. This level establishes timing awareness but may not guarantee worst-case performance under all conditions and may lack continuous monitoring.

Intermediate Implementation — Worst-case latency is validated through empirical testing under stress conditions (maximum load, concurrent faults, resource contention). Safety-critical inference runs on dedicated or guaranteed-partition compute resources isolated from non-safety workloads. Continuous latency monitoring with automated safe-state transition on deadline exceedance is implemented. Governance checks have a defined time budget and are never skipped. A local fallback mechanism (hold-last-value or simplified model) provides bounded-latency output when primary inference exceeds its deadline. All mandatory requirements (4.1-4.6) are satisfied with documented evidence.

Advanced Implementation — All intermediate capabilities plus: time-triggered architecture with fixed phase allocation preventing cascading delays. WCET analysis complements empirical validation. Predictive latency monitoring detects degradation trends before thresholds are reached. Hardware-level timing guarantees (real-time operating system, dedicated real-time processors) for sub-100 ms applications. Governance check caching with validated invalidation reduces check latency to microseconds. The organisation can demonstrate to regulators that no credible operating condition — including maximum load, concurrent faults, and resource contention — can cause the agent's response latency to exceed the defined bound without triggering a safe-state transition.

7. Evidence Requirements

Required artefacts:

Retention requirements:

Access requirements:

8. Test Specification

Testing AG-113 compliance requires validation that timing guarantees are met under all credible operating conditions and that violations trigger appropriate safety responses.

Test 8.1: Worst-Case Latency Validation

Test 8.2: Governance Check Time Budget

Test 8.3: Latency Monitoring and Safe-State Transition

Test 8.4: Resource Isolation Validation

Test 8.5: Fallback Mechanism Activation

Test 8.6: Jitter Compensation Validation

Conformance Scoring

9. Regulatory Mapping

| Regulation | Provision | Relationship Type |
| --- | --- | --- |
| EU AI Act | Article 15 (Accuracy, Robustness, Cybersecurity) | Direct requirement |
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| IEC 61508 | Clause 7.4.2.2 (Safety Function Response Time) | Direct requirement |
| ISO 26262 | Part 6 (Product Development at Software Level — Timing) | Direct requirement |
| IEC 62443 | ISA-62443-3-3 SR 7.1 (Deterministic Output) | Supports compliance |
| DO-178C | Objectives for Timing Analysis | Direct requirement |
| NIST AI RMF | MANAGE 2.2 (Performance and Robustness) | Supports compliance |
| ISO 42001 | Clause 8.4 (Operation of AI System) | Supports compliance |
| Machinery Regulation (EU) 2023/1230 | Essential Health and Safety Requirements — Control Systems | Direct requirement |

EU AI Act — Article 15 (Accuracy, Robustness, Cybersecurity)

Article 15 requires high-risk AI systems to achieve "an appropriate level of accuracy, robustness, and cybersecurity, and perform consistently in those respects throughout their lifecycle." For AI agents in real-time safety-critical applications, "robustness" includes timing robustness — the system must perform within defined timing bounds consistently, not just on average. Latency exceedances under load or stress conditions are robustness failures under Article 15.

IEC 61508 — Clause 7.4.2.2 (Safety Function Response Time)

IEC 61508 requires that the response time of safety functions be specified and validated. The safety function response time includes all elements from hazard detection to the achievement of a safe state. For AI agents performing safety functions (or providing inputs to safety functions), the agent's end-to-end response latency is a component of the safety function response time and must be bounded and validated accordingly.

ISO 26262 — Part 6

For automotive AI agents, ISO 26262 Part 6 requires timing analysis as part of software development for safety-related systems. The Fault Tolerant Time Interval (FTTI) — the maximum time between a fault occurring and the system reaching a safe state — directly constrains the agent's response latency. Timing analysis must demonstrate that the agent's processing time, combined with fault detection time and safe-state transition time, does not exceed the FTTI.

Machinery Regulation (EU) 2023/1230

The new EU Machinery Regulation (replacing the Machinery Directive) includes essential health and safety requirements for control systems, including requirements for deterministic behaviour and response time. AI agents integrated into machinery control systems must meet these requirements, which AG-113 supports by establishing timing governance.

DO-178C — Timing Analysis

For aviation AI agents, DO-178C objectives include worst-case execution time analysis for safety-critical software. While DO-178C was not written for AI/ML systems (and ongoing work addresses this gap), the timing analysis principles apply. AI inference that contributes to a safety-critical aviation function must have its execution time bounded and verified.

10. Failure Severity

| Field | Value |
| --- | --- |
| Severity Rating | Critical |
| Blast Radius | Domain-specific — consequences of late control outputs range from physical contact injuries to process runaways to grid instability |

Consequence chain: Without real-time determinism and latency assurance governance, an AI agent's response time is unbounded under adverse conditions. The consequences depend on the controlled system's dynamics and the process safety time. For fast systems (robotics, high-speed process control): a latency exceedance of 100-200 ms can result in physical contact, collision, or process parameter exceedance before any corrective action is possible. For medium-speed systems (power grid, gas network): a latency exceedance of 1-5 seconds can result in frequency instability, pressure exceedance, or voltage collapse. For slower systems (building management, water treatment): latency exceedances of minutes to hours can result in environmental control failure or effluent quality violations. In all cases, the failure mode is that the agent's output arrives too late to prevent the hazard — the correct output produced after the process safety time has elapsed is equivalent to no output at all. The business consequences include those of the resulting hazard (physical injury, infrastructure damage, environmental permit exceedance) plus the additional regulatory finding of inadequate timing governance — demonstrating that the organisation deployed an AI agent in a time-critical safety application without ensuring it could meet its timing requirements under all conditions.

Cross-references: AG-109 (Safe-State Transition Governance) specifies the safe-state transition time that AG-113 timing budgets must accommodate. AG-111 (Hazard Analysis Governance) provides the process safety time analysis that drives AG-113 timing requirements. AG-112 (Sector Safety Constraint Governance) constraints must be evaluated within the governance check time budget defined here. AG-114 (Actuation Interlock Governance) interlock response times are a component of the overall timing budget. AG-001 (Operational Boundary Enforcement) mandate checks must complete within the governance time budget. AG-008 (Governance Continuity Under Failure) addresses what happens when the governance system itself experiences latency failures. AG-046 (Operating Environment Integrity) governs the infrastructure on which timing guarantees depend.

Cite this protocol
AgentGoverning. (2026). AG-113: Real-Time Determinism and Latency Assurance Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-113