Dead-Letter Queue Governance requires that every AI agent system implement a formally governed dead-letter queue (DLQ) for execution items that have exhausted their retry budgets, been denied by policy controls, or failed in a manner that prevents normal completion. Every item entering the DLQ must be isolated from active processing, classified by failure reason, subject to mandatory human review within defined time windows, and cleared only through an auditable disposition process. Without governed dead-letter handling, irrecoverable failures silently accumulate in unmonitored queues, creating backlogs of unprocessed customer requests, unresolved financial transactions, and uninvestigated governance violations that compound into regulatory exposure and operational risk. This dimension ensures that no failed execution item is forgotten, that poisoned or malicious items cannot re-enter active processing without review, and that the DLQ itself does not become an ungoverned data store that violates retention or erasure obligations.
Scenario A — Unmonitored Dead-Letter Queue Accumulates Regulatory Exposure: A financial-value agent processes cross-border wire transfers. When a transfer fails compliance screening (sanctions check timeout), the transaction is placed in a dead-letter queue. The DLQ has no monitoring, no alerting, and no mandatory review process. Over a period of six weeks, 3,847 failed wire transfers accumulate in the DLQ, representing £14.2 million in pending customer payments. Customers begin complaining about delayed transfers. The operations team discovers the DLQ backlog during a routine monthly review. Investigation reveals that 3,612 of the failures were transient sanctions-service timeouts that would have cleared on re-screening. The remaining 235 require genuine compliance review. The six-week delay in processing these transfers results in 89 customer complaints to the Financial Ombudsman Service, £340,000 in compensation payments for consequential losses (customers who missed property completion deadlines, supplier payment windows, and investment settlement dates), and an FCA supervisory visit focused on operational resilience.
What went wrong: The DLQ existed as a technical construct — a database table where failed items were stored — but had no governance wrapper. No alerting threshold triggered when the queue exceeded a defined size. No mandatory review window required human attention within a defined period. No classification of DLQ items by urgency or financial value existed. The operations team treated the DLQ as a low-priority backlog rather than a container of active financial obligations. Consequence: £14.2 million in delayed customer payments, £340,000 in compensation, FCA supervisory action, reputational damage across 89 formal complaints.
Scenario B — Poisoned Message Re-Injection Causes Cascading Failure: An enterprise workflow agent processes employee onboarding across HR, IT provisioning, and facilities management systems. A malformed employee record — containing a Unicode control character in the surname field — causes the onboarding workflow to fail at the IT provisioning step. The record enters the dead-letter queue. An automated DLQ retry process, configured to re-attempt failed items every 4 hours, re-injects the poisoned record into the workflow. The record fails again at IT provisioning, is returned to the DLQ, and is re-injected 4 hours later. This cycle continues for 11 days. Each re-injection attempt consumes provisioning system resources and generates error logs. On day 8, the provisioning system's error log partition fills to capacity, causing the logging subsystem to fail. With logging unavailable, the provisioning system begins failing open due to a misconfigured error handler — new provisioning requests are processed without access control verification. Three contractor accounts are provisioned with administrator-level access that should have been denied by the access control check.
What went wrong: The DLQ had an automated retry mechanism with no poison-message detection. A message that failed for a structural reason (malformed data) was re-injected repeatedly because the retry mechanism did not distinguish between transient and structural failures. The cascading failure — from repeated re-injection to log exhaustion to provisioning system fail-open — demonstrates how an ungoverned DLQ can amplify a minor data quality issue into a security breach. Consequence: Three contractor accounts with unauthorised administrator access for 3 days before detection, log integrity gap of 72 hours, data protection impact assessment required, remediation cost of £220,000 including forensic investigation and access audit.
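The transient-versus-structural distinction that Scenario B's retry mechanism lacked can be sketched as a gate consulted before any automatic re-injection. The error-code sets below are illustrative assumptions, not a prescribed taxonomy:

```python
from enum import Enum

class FailureKind(Enum):
    TRANSIENT = "transient"    # safe to retry (timeouts, rate limits)
    STRUCTURAL = "structural"  # retrying cannot succeed (malformed data, policy denial)

# Illustrative mappings only; a real system should classify from typed
# error codes emitted by the workflow, not from exception names.
TRANSIENT_ERRORS = {"TimeoutError", "ConnectionError", "RateLimitError"}
STRUCTURAL_ERRORS = {"ValidationError", "PolicyDenied", "SchemaError"}

def classify_failure(error_code: str) -> FailureKind:
    """Decide whether a failed item may be retried or must stay in the DLQ."""
    if error_code in STRUCTURAL_ERRORS:
        return FailureKind.STRUCTURAL
    if error_code in TRANSIENT_ERRORS:
        return FailureKind.TRANSIENT
    # Unknown errors are treated as structural: it is safer to hold the
    # item for human review than to re-inject a potential poison message.
    return FailureKind.STRUCTURAL

def may_reinject(error_code: str, prior_reinjections: int, limit: int = 1) -> bool:
    """Scenario B's loop is broken in two ways: structural failures are
    never auto-re-injected, and transient ones get a bounded number of
    automatic re-injection attempts before human review is required."""
    if classify_failure(error_code) is FailureKind.STRUCTURAL:
        return False
    return prior_reinjections < limit
```

Under this sketch, the malformed onboarding record would have been held for review after its first failure instead of cycling for 11 days.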
Scenario C — Dead-Letter Queue Violates Right to Erasure: A customer-facing agent processes insurance quote requests. Failed requests — typically due to incomplete customer data — are placed in a dead-letter queue for manual review and re-processing. A customer submits a quote request that fails due to a missing address field. The request, containing the customer's name, date of birth, income, medical history disclosures, and partial address, enters the DLQ. Three weeks later, the customer exercises their GDPR Article 17 right to erasure, requesting deletion of all their personal data. The organisation's data subject request process searches the active customer database, the CRM, and the document management system — but not the dead-letter queue. The customer's personal data, including sensitive medical disclosures, remains in the DLQ for 14 months until a storage capacity review discovers 47,000 unprocessed DLQ items dating back over a year. The data protection officer discovers that 312 of these items contain personal data for individuals who subsequently exercised erasure rights.
What went wrong: The DLQ was not registered as a personal data processing location in the organisation's data inventory (Record of Processing Activities under GDPR Article 30). The data subject request process did not include DLQ scanning in its search scope. The DLQ had no retention policy — items accumulated indefinitely. No automated purge mechanism existed. The DLQ was treated as a technical buffer, not as a data store subject to data protection obligations. Consequence: 312 GDPR erasure right violations, ICO investigation, potential fine of up to 4% of annual turnover under GDPR Article 83(5), mandatory notification to 312 affected data subjects, reputational damage, and remediation cost of £890,000 including data protection audit, DLQ governance implementation, and legal fees.
Scope: This dimension applies to all AI agent systems that maintain any form of queue, buffer, holding area, or storage location for execution items that cannot be processed through their normal workflow path. This includes explicitly named dead-letter queues, error tables, failed-transaction logs, retry-exhausted item stores, quarantine zones, and any equivalent mechanism regardless of its technical label. If an execution item can enter a state where it is no longer actively processing but has not been formally resolved — whether through successful completion, deliberate cancellation, or auditable disposition — the storage location for that item is within scope. The scope extends to temporary buffers that are intended to be short-lived but may accumulate items under failure conditions, and to any data store that receives items routed from retry budget exhaustion per AG-381. The scope also includes the personal data and financial data contained within DLQ items, which remain subject to data protection and financial record-keeping obligations regardless of their processing status.
4.1. A conforming system MUST route all execution items that exhaust their retry budget per AG-381, receive a terminal policy denial, or fail with an irrecoverable error to a formally designated dead-letter queue that is isolated from active processing pipelines.
4.2. A conforming system MUST classify every DLQ item upon ingestion by failure reason (retry exhaustion by error class, policy denial, data validation failure, dependency failure, unknown error) and by data sensitivity (contains personal data, contains financial data, contains health data, or no sensitive data).
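Requirement 4.2's dual classification can be enforced at the point of ingestion, for example by rejecting any item that arrives without recognised labels rather than storing it unclassified. The field names and label sets below are illustrative:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Label sets follow requirement 4.2; the exact strings are assumptions.
FAILURE_REASONS = {"retry_exhaustion", "policy_denial", "data_validation",
                   "dependency_failure", "unknown"}
SENSITIVITY = {"personal_data", "financial_data", "health_data", "none"}

@dataclass
class DLQItem:
    item_id: str
    failure_reason: str
    sensitivity: str
    ingested_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self):
        # Classification is mandatory at ingestion: an item carrying an
        # unrecognised label is rejected, never stored unlabelled.
        if self.failure_reason not in FAILURE_REASONS:
            raise ValueError(f"unknown failure reason: {self.failure_reason}")
        if self.sensitivity not in SENSITIVITY:
            raise ValueError(f"unknown sensitivity class: {self.sensitivity}")
```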
4.3. A conforming system MUST enforce a maximum review window for each DLQ item — the period within which a human reviewer must examine the item and record a disposition decision — where the window duration is determined by the item's classification and data sensitivity, and must not exceed 72 hours for items containing financial transaction data or personal data.
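One way to derive the review deadline of requirement 4.3 is a lookup by sensitivity class with the 72-hour ceiling applied on top. Only the ceiling comes from 4.3; the other window durations here are assumptions a deployment would set itself:

```python
from datetime import datetime, timedelta, timezone

# Illustrative windows; requirement 4.3 fixes only the 72-hour ceiling
# for items containing personal or financial data.
REVIEW_WINDOWS = {
    "health_data":    timedelta(hours=4),
    "financial_data": timedelta(hours=24),
    "personal_data":  timedelta(hours=72),
    "none":           timedelta(days=7),
}
SENSITIVE = {"personal_data", "financial_data", "health_data"}
HARD_CEILING = timedelta(hours=72)

def review_deadline(sensitivity: str, ingested_at: datetime) -> datetime:
    """Latest moment by which a human must record a disposition."""
    window = REVIEW_WINDOWS[sensitivity]
    if sensitivity in SENSITIVE:
        window = min(window, HARD_CEILING)  # 4.3: never exceed 72 hours
    return ingested_at + window
```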
4.4. A conforming system MUST prevent any DLQ item from being re-injected into active processing without an explicit, logged disposition decision by an authorised human reviewer, ensuring that poisoned or policy-denied items cannot automatically re-enter the workflow.
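A minimal sketch of the re-injection gate in requirement 4.4: the workflow accepts an item back only when an authorised reviewer has recorded a "re-process" disposition. The class and method names are illustrative:

```python
from datetime import datetime, timezone

DISPOSITIONS = {"re_process", "cancel", "escalate", "purge"}

class DispositionRequired(Exception):
    """Raised when re-injection is attempted without a logged human decision."""

class DLQGate:
    """Items re-enter active processing only through an explicit, logged
    disposition decision (requirement 4.4); there is no automatic path."""

    def __init__(self):
        # item_id -> (reviewer, decision, rationale, timestamp)
        self._dispositions = {}

    def record_disposition(self, item_id, reviewer, decision, rationale):
        if decision not in DISPOSITIONS:
            raise ValueError(f"invalid disposition: {decision}")
        self._dispositions[item_id] = (
            reviewer, decision, rationale, datetime.now(timezone.utc))

    def reinject(self, item_id):
        entry = self._dispositions.get(item_id)
        if entry is None or entry[1] != "re_process":
            raise DispositionRequired(item_id)
        return True
```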
4.5. A conforming system MUST generate alerts when the DLQ item count exceeds configurable thresholds, when the oldest unreviewed item exceeds its review window, or when the aggregate financial value of DLQ items exceeds a defined ceiling.
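The three alert conditions of requirement 4.5 can be evaluated in a single pass over the queue. The threshold values below are placeholders to be configured per deployment:

```python
from datetime import datetime, timedelta, timezone

def dlq_alerts(items, now, max_count=1000, max_value=1_000_000,
               review_window=timedelta(hours=72)):
    """Return which requirement-4.5 alert conditions currently hold.
    `items` is an iterable of (ingested_at, financial_value, reviewed)
    tuples; the record shape and thresholds are illustrative."""
    items = list(items)
    alerts = []
    if len(items) > max_count:
        alerts.append("count_threshold")            # queue size breach
    unreviewed = [t for (t, _, reviewed) in items if not reviewed]
    if unreviewed and now - min(unreviewed) > review_window:
        alerts.append("review_window_breach")       # oldest item overdue
    if sum(v for (_, v, _) in items) > max_value:
        alerts.append("value_ceiling")              # aggregate value breach
    return alerts
```

Evaluated continuously, this check would have surfaced Scenario A's backlog in hours rather than six weeks.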
4.6. A conforming system MUST include all dead-letter queues in the organisation's data subject request search scope, ensuring that erasure requests, access requests, and portability requests under applicable data protection regulations are fulfilled for data held in DLQ items.
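Requirement 4.6 can be made hard to bypass by having the erasure routine refuse to run unless the DLQ is among the registered stores, which is exactly the check Scenario C's process lacked. The store-registry shape and the "dlq" key are assumptions for illustration:

```python
def erase_subject(subject_id, stores):
    """Fulfil an erasure request across every registered data store.
    `stores` maps store name -> mutable list of (subject_id, payload)
    records. Assumption: the dead-letter queue is registered under the
    key "dlq"; a partial erasure that skips it is refused outright."""
    if "dlq" not in stores:
        raise RuntimeError(
            "DLQ not registered in data inventory; refusing partial erasure")
    erased = {}
    for name, records in stores.items():
        before = len(records)
        # In-place filter so callers holding the list see the deletion.
        records[:] = [(sid, p) for (sid, p) in records if sid != subject_id]
        erased[name] = before - len(records)
    return erased
```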
4.7. A conforming system MUST enforce retention limits on DLQ items, automatically escalating items that approach the retention limit to senior review and purging items that exceed the maximum retention period, consistent with AG-016.
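Requirement 4.7's retention enforcement reduces to a periodic sweep that buckets items by age. The 90-day limit and 75-day escalation point below are illustrative, not mandated values:

```python
from datetime import datetime, timedelta, timezone

def retention_sweep(items, now, max_age=timedelta(days=90),
                    escalate_at=timedelta(days=75)):
    """Split DLQ items into keep / escalate / purge buckets per 4.7.
    `items` is a list of (item_id, ingested_at); durations are placeholders."""
    keep, escalate, purge = [], [], []
    for item_id, ingested_at in items:
        age = now - ingested_at
        if age > max_age:
            purge.append(item_id)      # past retention: purge, with a
                                       # disposition log entry per 4.8
        elif age > escalate_at:
            escalate.append(item_id)   # approaching limit: senior review
        else:
            keep.append(item_id)
    return keep, escalate, purge
```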
4.8. A conforming system MUST maintain a tamper-evident disposition log for every DLQ item recording: ingestion timestamp, classification, reviewer identity, disposition decision (re-process, cancel, escalate, purge), disposition rationale, and disposition timestamp, consistent with AG-006.
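One common way to make the requirement-4.8 disposition log tamper-evident is a hash chain, in which each entry commits to its predecessor so that editing or deleting any record invalidates every subsequent hash. A minimal sketch, assuming the record fields from 4.8 arrive as a dictionary:

```python
import hashlib
import json

class DispositionLog:
    """Append-only, hash-chained disposition log (requirement 4.8)."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []
        self._last_hash = self.GENESIS

    def append(self, record: dict) -> str:
        """Record a disposition; returns the entry's chained hash."""
        payload = json.dumps({"prev": self._last_hash, **record},
                             sort_keys=True)
        digest = hashlib.sha256(payload.encode()).hexdigest()
        self.entries.append({"prev": self._last_hash, "hash": digest, **record})
        self._last_hash = digest
        return digest

    def verify(self) -> bool:
        """Recompute the chain; any edited or dropped entry breaks it."""
        prev = self.GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            payload = json.dumps(body, sort_keys=True)
            if (entry["prev"] != prev or
                    hashlib.sha256(payload.encode()).hexdigest() != entry["hash"]):
                return False
            prev = entry["hash"]
        return True
```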
4.9. A conforming system SHOULD implement poison-message detection that identifies items which have previously been re-injected and failed again, preventing repeated re-injection cycles.
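Poison-message detection per requirement 4.9 can key on a fingerprint of the item payload: once a re-injected item fails again, further automatic retries are blocked. Fingerprinting by payload hash is one possible approach, sketched here:

```python
import hashlib

class PoisonDetector:
    """Block repeated re-injection cycles (requirement 4.9). An item whose
    payload has already failed twice, i.e. a re-injection also failed, is
    marked poisoned and held for human disposition."""

    def __init__(self):
        self._failures = {}  # fingerprint -> observed failure count

    @staticmethod
    def fingerprint(payload: bytes) -> str:
        return hashlib.sha256(payload).hexdigest()

    def record_failure(self, payload: bytes) -> None:
        fp = self.fingerprint(payload)
        self._failures[fp] = self._failures.get(fp, 0) + 1

    def is_poisoned(self, payload: bytes) -> bool:
        # Two or more failures means at least one re-injection has
        # already failed again, which is Scenario B's loop signature.
        return self._failures.get(self.fingerprint(payload), 0) >= 2
```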
4.10. A conforming system SHOULD segregate DLQ storage by data sensitivity classification, ensuring that items containing health data, financial data, or other specially protected categories are stored with access controls appropriate to their sensitivity level.
4.11. A conforming system SHOULD expose DLQ metrics — item count, age distribution, classification breakdown, aggregate financial value — to operational dashboards in real time.
4.12. A conforming system MAY implement automated disposition for defined low-risk DLQ item categories (e.g., transient timeout failures for non-financial, non-personal-data operations) where the disposition rule set is versioned, approved, and auditable.
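The versioned, approved rule set contemplated by requirement 4.12 might look like the following, with automation structurally limited to items containing no sensitive data and everything unmatched falling through to human review. All names are illustrative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DispositionRule:
    failure_reason: str
    sensitivity: str
    action: str  # e.g. "re_process" or "cancel"

class ApprovedRuleSet:
    """Versioned, approved automated-disposition rules (requirement 4.12).
    The version and approver are carried with the rule set so every
    automated decision can be traced to an auditable rule release."""

    def __init__(self, version: str, approved_by: str, rules):
        self.version = version
        self.approved_by = approved_by
        self._rules = {(r.failure_reason, r.sensitivity): r.action
                       for r in rules}

    def auto_disposition(self, failure_reason: str, sensitivity: str):
        """Return an action for low-risk items, or None for human review."""
        # 4.12 restricts automation to non-financial, non-personal-data
        # categories; anything sensitive always goes to a human.
        if sensitivity != "none":
            return None
        return self._rules.get((failure_reason, sensitivity))
```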
Dead-letter queues are a well-established pattern in distributed systems engineering, but in autonomous agent systems they acquire governance significance that far exceeds their traditional role as a reliability mechanism. In conventional message-processing systems, a dead-letter queue is a technical safety net — a place for messages that cannot be processed, reviewed periodically by engineers, and either fixed and re-processed or discarded. In agent systems, the items entering the DLQ represent failed attempts to affect external state: payments that were not made, customer requests that were not fulfilled, compliance checks that were not completed, and governance decisions that were not resolved. Each unresolved DLQ item is an open obligation — financial, contractual, regulatory, or ethical — that accumulates risk with every hour it remains unaddressed.
The governance risk of ungoverned dead-letter queues manifests in four distinct failure modes. First, silent accumulation: DLQ items accumulate without alerting, creating backlogs that represent hidden operational and regulatory exposure. An unmonitored DLQ containing thousands of failed payment transactions is a latent financial liability that does not appear on any dashboard or report until someone inspects the queue directly. Second, poison-message amplification: items that failed for structural reasons (malformed data, policy violations, semantic errors) are automatically re-injected into active processing, consuming resources and potentially causing cascading failures each time they fail again. Third, data protection violation: DLQ items containing personal data remain subject to data protection obligations — access rights, erasure rights, portability rights, retention limits — but DLQs are routinely excluded from data subject request processes and data inventories because they are classified as "technical infrastructure" rather than "data processing." Fourth, governance circumvention: without mandatory human review of DLQ items, an agent system can effectively bypass a governance control by routing the denied item to the DLQ and then automatically re-injecting it, treating the DLQ as a temporary holding area rather than a governance checkpoint.
The regulatory landscape increasingly recognises that operational backlogs and unresolved processing failures carry compliance risk. DORA Article 10 requires financial entities to implement incident management processes that address the resolution of ICT-related incidents — a DLQ backlog of failed financial transactions is an unresolved ICT-related incident. GDPR Article 17 right to erasure applies to all personal data held by the controller, regardless of the data's processing status — personal data in a DLQ is still personal data, and failure to include DLQs in erasure request processing is a compliance violation. The EU AI Act's Article 9 risk management requirements extend to the operational behaviour of AI systems, including how they handle irrecoverable failures — an AI system that silently discards failed operations without human review fails the risk management standard.
The relationship between AG-382 and AG-381 (Retry Budget by Error Class Governance) is direct and architectural. AG-381 defines when an execution item has exhausted its retry options. AG-382 defines what happens next. Without AG-382, retry budget exhaustion per AG-381 becomes a dead end — the item is no longer retrying but has no formal disposition path. Without AG-381, the DLQ receives items without classification, making review and disposition decisions harder and less reliable. Together, these two dimensions create a complete lifecycle for failed execution: classification, bounded retry, isolation, review, and auditable disposition.
AG-382 establishes the dead-letter queue as a governed component of the agent execution infrastructure — not merely a technical buffer, but a formal governance checkpoint where irrecoverable execution items are isolated, classified, reviewed, and dispositioned under audit. The DLQ must be implemented as a first-class system component with its own access controls, monitoring, alerting, and retention policies.
Recommended patterns:
Anti-patterns to avoid:
Financial Services. DLQ items representing financial transactions — payments, trades, settlements — carry specific obligations. A failed payment in the DLQ is still a customer obligation that accrues interest, penalty, and complaint risk with every hour of delay. Payment Services Directive 2 (PSD2) Article 89 requires payment service providers to ensure that payment transactions are executed within defined timeframes — a DLQ backlog that delays payments beyond these windows creates regulatory exposure. The FCA's Consumer Duty (PS22/9) requires firms to deliver good outcomes for customers, which includes timely resolution of failed transactions. DLQ review windows for financial items should be measured in hours, not days.
Healthcare. DLQ items from clinical decision support agents may contain patient data subject to heightened protection. A failed prescription verification in the DLQ represents a patient waiting for medication. HIPAA minimum necessary requirements apply to DLQ review access — reviewers should see only the data necessary for disposition, not the full clinical record. Review windows for patient-impacting items must reflect clinical urgency.
Critical Infrastructure and Robotics. DLQ items from safety-critical agents require immediate attention because they may represent unresolved safety conditions. A failed safety interlock check in the DLQ means the interlock status is unknown. IEC 61508 requirements for safety-instrumented system fault management map directly to DLQ governance for safety-critical items. Review windows should be measured in minutes, not hours, and automatic process halt should occur when safety-critical DLQ items are detected.
Crypto and Web3. Failed blockchain transactions in the DLQ carry unique risks: gas fees were consumed on failed transactions, nonce sequences may be disrupted, and time-sensitive DeFi operations (liquidation protection, yield harvesting) may become worthless if delayed. DLQ review must account for the time-sensitivity of on-chain operations and the irrecoverability of consumed gas costs.
Basic Implementation — The organisation has a designated dead-letter queue for each agent system. Failed items are routed to the DLQ after retry budget exhaustion per AG-381. A weekly manual review process examines DLQ items and makes disposition decisions. Items are classified by failure reason but not by data sensitivity. Alerting exists for queue size thresholds. Disposition decisions are logged but may not be tamper-evident. The DLQ is not yet included in the data subject request search scope. This level prevents the worst failure modes (silent accumulation without any review, unlimited poison-message re-injection) but has gaps in data protection compliance and review timeliness.
Intermediate Implementation — DLQ items are classified by both failure reason and data sensitivity upon ingestion. Storage is segregated by sensitivity level with appropriate access controls. Review SLAs are enforced with automated escalation for overdue items. The DLQ is registered in the organisation's Record of Processing Activities and included in data subject request search scope. Disposition decisions are recorded in a tamper-evident log per AG-006 with mandatory rationale. Poison-message detection prevents repeated re-injection. Retention limits are enforced with automated escalation and purge. Real-time DLQ metrics are exposed to operational dashboards.
Advanced Implementation — All intermediate capabilities plus: machine-learning-assisted DLQ classification that detects novel failure patterns and recommends disposition actions. Automated disposition for low-risk, well-understood failure categories with auditable rule sets. Cross-agent DLQ correlation that identifies systemic failures affecting multiple agents simultaneously. Integration with AG-016 data retention governance for automated retention enforcement and erasure compliance. DLQ governance has been verified through independent adversarial testing, including scenarios where an agent attempts to use the DLQ as a governance bypass mechanism. DLQ metrics feed into organisational risk dashboards, and aggregate DLQ regulatory exposure is reported to senior management.
Required artefacts:
Retention requirements:
Access requirements:
Testing AG-382 compliance requires validating that DLQ ingestion, classification, isolation, review, disposition, and data protection integration all function as governed processes. A comprehensive test programme should include the following tests.
Test 8.1: DLQ Routing from Retry Exhaustion
Test 8.2: DLQ Item Classification Completeness
Test 8.3: Re-Injection Prevention Without Human Review
Test 8.4: Review Window SLA Enforcement
Test 8.5: DLQ Alerting on Threshold Breach
Test 8.6: Data Subject Request Coverage
Test 8.7: Disposition Log Tamper Evidence
Test 8.8: Poison-Message Detection and Re-Injection Block
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| EU AI Act | Article 12 (Record-Keeping) | Direct requirement |
| SOX | Section 404 (Internal Controls Over Financial Reporting) | Direct requirement |
| FCA SYSC | 6.1.1R (Systems and Controls) | Direct requirement |
| NIST AI RMF | GOVERN 1.5, MANAGE 2.2 | Supports compliance |
| ISO 42001 | Clause 8.4 (AI System Operation), Clause 9.1 (Monitoring, Measurement, Analysis) | Supports compliance |
| DORA | Article 10 (ICT-related Incident Management), Article 11 (ICT-related Incident Classification) | Direct requirement |
Article 9 requires providers of high-risk AI systems to establish a risk management system that identifies risks and implements mitigation measures throughout the system lifecycle. Irrecoverable execution failures represent operational risks that, if unmanaged, can cascade into financial, safety, and rights-related harms. DLQ governance implements the risk mitigation measure for the specific risk of unresolved failures accumulating without oversight. The requirement that the risk management system operate "throughout the entire lifecycle" means that failure management — not only normal operation — is within the regulatory scope. An AI system that operates correctly 99.5% of the time but has no governance over the 0.5% that fails does not meet Article 9's standard.
Article 12 requires high-risk AI systems to include logging capabilities that enable monitoring of the system's operation and post-market monitoring. DLQ disposition logs are a direct implementation of this requirement for the failure path of the AI system's operation. The logs record what failed, why it failed, who reviewed it, what decision was made, and what rationale supported that decision. Without these records, the organisation cannot demonstrate to supervisory authorities how failures were managed — a gap that Article 12 specifically addresses.
For AI agents executing financial operations, the DLQ contains items that represent incomplete financial transactions. A failed payment in the DLQ is an unresolved financial obligation that affects the accuracy of financial reporting. SOX Section 404 requires management to assess the effectiveness of internal controls over financial reporting — a DLQ backlog of unresolved financial transactions represents a control weakness. The auditor will ask: "How do you ensure that failed financial transactions are resolved within a defined timeframe?" and "Can you demonstrate that no failed financial transaction was lost or unreviewed?" DLQ governance with review SLAs and disposition logging provides the answer to both questions.
SYSC 6.1.1R requires firms to maintain adequate systems and controls sufficient to ensure compliance with applicable obligations. For firms deploying AI agents, this includes controls over failure management. The FCA's Consumer Duty (PS22/9) creates a specific obligation to deliver good outcomes for customers — a DLQ backlog of unprocessed customer transactions directly undermines this obligation. The FCA's operational resilience framework requires firms to manage the resolution of important business services disruptions, including disruptions caused by accumulated processing failures. DLQ review windows with SLA enforcement demonstrate that the firm manages failure resolution with the same rigour as normal operation.
GOVERN 1.5 addresses ongoing monitoring and periodic review of the AI risk management process. MANAGE 2.2 addresses mechanisms for tracking and responding to known AI risks. DLQ governance implements ongoing monitoring of failure states (GOVERN 1.5) and structured response to known failure modes (MANAGE 2.2). The disposition framework — classify, review, decide, log — directly implements the structured response that NIST envisions for known operational risks.
Clause 8.4 addresses the operation of AI systems, including operational controls for non-normal conditions. Clause 9.1 addresses monitoring, measurement, analysis, and evaluation of the AI management system. DLQ governance satisfies both: it provides operational controls for failure conditions (8.4) and monitoring metrics that enable management to evaluate how effectively the system handles failures (9.1). DLQ metrics — review SLA compliance, time-to-disposition, re-injection rates — are directly applicable to the performance evaluation required by Clause 9.1.
Article 10 requires financial entities to establish ICT-related incident management processes including detection, logging, classification, and resolution of incidents. Article 11 requires classification of incidents based on their impact. DLQ items in a financial agent system are ICT-related incidents — failed processing events that require detection, classification, and resolution. AG-382's classification requirement (failure reason and data sensitivity) directly implements Article 11's classification requirement. The review window and disposition process implement Article 10's resolution requirement. DORA's emphasis on timely incident resolution makes the review window SLA particularly significant: a DLQ backlog that grows without timely review is, under DORA's framework, an accumulation of unresolved ICT-related incidents.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Service-wide — extends to customer-facing obligations, regulatory compliance posture, and data protection commitments across the organisation |
Consequence chain: Without governed dead-letter handling, irrecoverable execution failures enter a state of administrative limbo where they are neither actively processing nor formally resolved. The immediate technical failure is silent accumulation — items enter an unmonitored store where they consume storage resources and accumulate governance liability without any mechanism to trigger review or resolution. The operational consequence develops over time as the unresolved backlog grows: customer transactions remain unprocessed, creating complaint volumes and compensation liability that scale linearly with the backlog size and the duration of inattention. Financial transactions in the DLQ represent unreconciled obligations that affect the accuracy of financial reporting and settlement positions. Personal data in the DLQ remains subject to data protection obligations that the organisation is failing to fulfil — every data subject request that does not search the DLQ is a compliance violation. When poison messages are automatically re-injected without detection, the DLQ becomes an amplification mechanism: each re-injection cycle consumes processing resources, generates error logs, and may cascade into infrastructure failures that affect healthy workflows. The business consequence includes regulatory enforcement action for inadequate operational resilience (DORA, FCA operational resilience), data protection fines for failure to include DLQs in data subject request processes (GDPR Article 83), SOX findings for unresolved financial obligations, customer compensation payments that scale with backlog duration, and reputational damage from visible processing failures. The severity compounds non-linearly: a DLQ backlog discovered at 100 items is a minor operational issue; the same backlog discovered at 50,000 items after six months is a regulatory incident requiring board-level disclosure.