AG-380

Checkpoint Garbage-Collection Governance

Runtime Execution, Workflow & State · AGS v2.1 · April 2026
EU AI Act · GDPR · SOX · FCA · NIST · HIPAA · ISO 42001

2. Summary

Checkpoint Garbage-Collection Governance requires that stale checkpoints, incomplete execution state, and orphaned workflow snapshots are retired through a formally defined, auditable process that preserves forensic integrity while reclaiming system resources. Checkpoints accumulate as agents execute multi-step workflows — each representing a recoverable snapshot of execution state — and without governed retirement, they consume storage, create security exposure through retained sensitive data, and risk accidental resumption of obsolete execution contexts. This dimension ensures that checkpoint lifecycle management follows defined retention policies, that deletion is verifiable and irreversible, and that no checkpoint is removed while it remains required for rollback, audit, or regulatory retention.

3. Example

Scenario A — Stale Checkpoint Resumes Obsolete Trading Strategy: An investment management firm deploys AI agents to execute algorithmic trading strategies. Each agent checkpoints its execution state — including portfolio positions, pending orders, and risk parameters — every 60 seconds. When a strategy is retired and replaced with a new version, the agent's 14,000 accumulated checkpoints are not cleaned up. Three months later, a system restart causes the orchestration platform to scan for resumable checkpoints. It discovers and resumes a checkpoint from the retired strategy, which begins executing trades based on a three-month-old market model. Before detection, the agent executes 312 trades totalling £8.7 million in notional value, resulting in a £1.4 million loss against positions the firm no longer intends to hold.

What went wrong: No garbage-collection policy existed for checkpoints associated with retired workflows. The orchestration platform treated all checkpoints as valid resumption points regardless of age or the status of the associated workflow. The checkpoint contained sufficient state to resume autonomous execution without re-validation. Consequence: £1.4 million direct trading loss, FCA investigation into algorithmic trading controls, potential enforcement action under MAR for uncontrolled market activity, £3.2 million in remediation costs including checkpoint lifecycle management system implementation.

Scenario B — Accumulated Checkpoints Expose Personal Data Beyond Retention Period: A customer-facing AI agent processing insurance claims creates checkpoints at each workflow stage containing the claimant's personal data: name, address, medical records, bank details, and claim photographs. The agent processes 2,500 claims per month. After 18 months, the organisation holds 315,000 checkpoint files containing personal data for claims that have been fully settled, many beyond the 12-month retention period defined in the organisation's privacy policy. A data subject access request from a former claimant reveals that the organisation holds 14 checkpoint snapshots containing the individual's medical records despite the claim being settled 16 months prior. The supervisory authority investigates and discovers the systemic retention failure.

What went wrong: Checkpoint creation was automated as part of the workflow engine, but no corresponding garbage-collection process existed. The checkpoint storage was not integrated with the data retention policy engine. Personal data embedded in checkpoints was invisible to the organisation's data lifecycle management. Consequence: GDPR Article 5(1)(e) violation for storage limitation principle breach, supervisory authority investigation, potential fine of up to €10 million or 2% of global annual turnover under Article 83(4), £1.8 million remediation programme to identify and purge all over-retained checkpoint data, reputational damage from public notification.

Scenario C — Checkpoint Accumulation Causes Safety-Critical System Failure: An embodied AI agent controlling an autonomous warehouse logistics system checkpoints its state — including vehicle positions, route plans, and obstacle maps — every 5 seconds to support rapid recovery from failures. The checkpoint storage volume is provisioned with 500 GB. Over six months of continuous operation, checkpoint accumulation fills the volume to 98% capacity. The garbage-collection process, implemented as a low-priority background task, cannot keep pace with checkpoint creation. When the volume reaches 100%, the checkpoint write fails silently, and the agent continues operating without recovery capability. A subsequent sensor failure causes the agent to collide with a warehouse worker, and the system cannot roll back to a safe state because no valid checkpoint exists. The worker sustains injuries requiring hospitalisation.

What went wrong: Garbage collection was implemented as a best-effort background process without resource guarantees. No monitoring alerted on checkpoint storage approaching capacity. The silent write failure meant the system continued operating without the recovery capability it was designed to provide. The collision recovery procedure assumed checkpoint availability that no longer existed. Consequence: worker injury requiring hospitalisation, HSE investigation, potential prosecution under the Health and Safety at Work Act 1974, £2.1 million in compensation and legal costs, facility shutdown pending safety review, insurance premium increase of £450,000 annually.

4. Requirement Statement

Scope: This dimension applies to all AI agent systems that create checkpoints, snapshots, or persistent records of execution state for the purpose of recovery, resumption, rollback, or debugging. A checkpoint is any persistent artefact that captures the agent's execution context at a point in time with the intent of enabling future state restoration. This includes explicit checkpoints created by workflow orchestration engines, implicit snapshots created by container orchestration systems, database transaction savepoints used for agent state recovery, serialised agent memory states, cached intermediate computation results retained for recovery purposes, and any other persistent artefact tied to a specific point in an agent's execution timeline. Systems that create no persistent execution state — stateless agents that re-derive all context on each invocation — are excluded. The scope extends to checkpoint-like artefacts regardless of their storage medium: filesystem, object storage, database, distributed cache, or blockchain. If it captures execution state and persists beyond the immediate execution context, it is within scope.

4.1. A conforming system MUST define a formal checkpoint retention policy for each workflow type, specifying the maximum number of checkpoints retained per workflow instance, the maximum age of retained checkpoints, and the conditions under which checkpoints become eligible for garbage collection.
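Such a policy lends itself to versioned, structured configuration held outside the agent runtime. A minimal sketch in Python; the field names and example values are illustrative assumptions, not a mandated schema:

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional

@dataclass(frozen=True)
class RetentionPolicy:
    """Versioned retention policy for one workflow type (requirement 4.1)."""
    workflow_type: str
    policy_version: str
    max_checkpoints_per_instance: int      # newest N retained per workflow instance
    max_age: timedelta                     # older checkpoints become collection-eligible
    terminal_state_grace: timedelta        # retention window after a terminal state
    regulatory_floor: Optional[timedelta] = None  # never collect before this age, if set

POLICIES = {
    "claims-processing": RetentionPolicy(
        workflow_type="claims-processing",
        policy_version="2026-04-01",
        max_checkpoints_per_instance=10,
        max_age=timedelta(days=30),
        terminal_state_grace=timedelta(days=7),
    ),
}
```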

4.2. A conforming system MUST execute garbage collection of eligible checkpoints automatically according to the defined retention policy, without requiring manual intervention for routine cleanup.
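A collection pass satisfying 4.2 might look like the following sketch, assuming a hypothetical `registry` interface that tracks lifecycle state (see 4.10) and a `store` that performs tier-appropriate deletion (see 4.9). The two-phase structure means a crash between marking and deleting never removes anything the registry still records as active:

```python
from datetime import datetime, timezone

def gc_pass(registry, store, policies):
    """One automated collection pass (requirement 4.2)."""
    now = datetime.now(timezone.utc)
    # Phase 1: mark newly eligible checkpoints against their retention policy.
    for cp in registry.list(state="active"):
        policy = policies[cp.workflow_type]
        too_old = now - cp.created_at > policy.max_age
        superseded = registry.newer_count(cp.id) >= policy.max_checkpoints_per_instance
        if too_old or superseded:
            registry.mark(cp.id, "eligible")
    # Phase 2: collect, unless something still depends on the checkpoint (4.3).
    for cp in registry.list(state="eligible"):
        if registry.has_active_dependency(cp.id):
            continue
        store.secure_delete(cp.id)       # tier-appropriate deletion, see 4.9
        registry.mark(cp.id, "deleted")  # logged tamper-evidently, see 4.4
```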

4.3. A conforming system MUST prevent garbage collection of any checkpoint that is still required for an active workflow's rollback capability, an in-progress regulatory hold, an unresolved audit request, or a pending data-subject access request.
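The dependency check in 4.3 reduces to a predicate over the holds the system knows about. A minimal sketch, with hypothetical inputs standing in for the workflow engine, hold registers, and DSAR queue:

```python
from dataclasses import dataclass, field

@dataclass
class CheckpointRecord:
    id: str
    workflow_id: str
    data_subject_ids: set = field(default_factory=set)

def has_active_dependency(cp: CheckpointRecord,
                          open_workflows: set,
                          legal_holds: set,
                          audit_holds: set,
                          pending_dsar_subjects: set) -> bool:
    """Requirement 4.3: a checkpoint survives collection while anything
    still depends on it."""
    return (
        cp.workflow_id in open_workflows                      # rollback still possible
        or cp.id in legal_holds                               # regulatory or legal hold
        or cp.id in audit_holds                               # unresolved audit request
        or bool(cp.data_subject_ids & pending_dsar_subjects)  # pending DSAR
    )
```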

4.4. A conforming system MUST ensure that checkpoint deletion is irreversible and verifiable — deleted checkpoint data cannot be recovered from the storage medium, and the deletion event is recorded in a tamper-evident log including the checkpoint identifier, deletion timestamp, retention policy that triggered the deletion, and the identity of the process that executed the deletion.
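One way to make the deletion log tamper-evident is a hash chain, in which each record commits to its predecessor. A self-contained sketch; the record fields mirror 4.4, while the chaining scheme itself is an illustrative choice:

```python
import hashlib
import json
from datetime import datetime, timezone

def append_deletion_record(log: list, checkpoint_id: str,
                           policy_version: str, actor: str) -> dict:
    """Append a hash-chained deletion record (requirement 4.4): each entry
    commits to its predecessor, so altering or removing any entry breaks
    the chain on verification."""
    prev_hash = log[-1]["entry_hash"] if log else "0" * 64
    entry = {
        "checkpoint_id": checkpoint_id,
        "deleted_at": datetime.now(timezone.utc).isoformat(),
        "retention_policy": policy_version,
        "deleted_by": actor,
        "prev_hash": prev_hash,
    }
    entry["entry_hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    log.append(entry)
    return entry
```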

4.5. A conforming system MUST prevent stale checkpoints from being used to resume or restore agent execution after the checkpoint has been superseded by a more recent valid checkpoint or after the associated workflow has reached a terminal state.
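A resumption guard for 4.5 can be expressed as a small predicate evaluated before any restore. A sketch, with hypothetical `checkpoint` and `workflow` records; the version check is what would have stopped Scenario A:

```python
from types import SimpleNamespace

TERMINAL_STATES = {"completed", "cancelled", "failed", "retired"}

def may_resume(checkpoint, workflow, latest_checkpoint_id: str) -> bool:
    """Requirement 4.5: refuse resumption from stale checkpoints."""
    if workflow.status in TERMINAL_STATES:
        return False   # the associated workflow has reached a terminal state
    if checkpoint.id != latest_checkpoint_id:
        return False   # superseded by a more recent valid checkpoint
    if checkpoint.workflow_version != workflow.deployed_version:
        return False   # created by a retired workflow version (Scenario A)
    return True

# The retired-strategy checkpoint from Scenario A is rejected twice over:
stale = SimpleNamespace(id="cp-14000", workflow_version="v1")
retired = SimpleNamespace(status="retired", deployed_version="v2")
assert may_resume(stale, retired, latest_checkpoint_id="cp-14000") is False
```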

4.6. A conforming system MUST monitor checkpoint storage utilisation and generate alerts when utilisation exceeds a configurable threshold, ensuring that garbage-collection capacity keeps pace with checkpoint creation rate.
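A minimal utilisation check for 4.6, using only the standard library; the thresholds are illustrative and alert delivery is left as a stub:

```python
import shutil

def check_checkpoint_volume(path: str, warn_at: float = 0.80,
                            page_at: float = 0.90) -> float:
    """Requirement 4.6: alert well before the checkpoint volume fills.
    Scenario C shows why: a silent write failure at 100% removes recovery
    capability without any operational signal."""
    usage = shutil.disk_usage(path)
    utilisation = usage.used / usage.total
    if utilisation >= page_at:
        raise RuntimeError(
            f"checkpoint volume at {utilisation:.0%}: collection not keeping pace")
    if utilisation >= warn_at:
        print(f"WARNING: checkpoint volume at {utilisation:.0%}")  # route to alerting
    return utilisation
```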

4.7. A conforming system MUST ensure that garbage collection of checkpoints containing personal data complies with applicable data retention regulations, including the right to erasure under GDPR Article 17, and that checkpoint retirement is integrated with the organisation's data lifecycle management processes.
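Integration with erasure handling might look like the sketch below, which assumes the registry indexes checkpoints by data-subject identifier (see 4.10) and reuses the `append_deletion_record` helper from the 4.4 sketch; all interfaces are hypothetical:

```python
def erase_data_subject(registry, store, deletion_log, subject_id: str, actor: str):
    """GDPR Article 17 erasure against the checkpoint store (requirement 4.7).
    Legal holds take precedence over erasure until resolved (4.3, 4.12)."""
    for cp in registry.find_by_data_subject(subject_id):
        if registry.on_hold(cp.id):
            # Erasure is deferred, not ignored: flag for compliance review.
            registry.flag_for_review(cp.id, reason="erasure blocked by active hold")
            continue
        store.secure_delete(cp.id)
        append_deletion_record(deletion_log, cp.id,
                               policy_version="article-17-erasure", actor=actor)
        registry.mark(cp.id, "deleted")
```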

4.8. A conforming system SHOULD implement garbage collection as a resource-guaranteed process with dedicated compute and I/O allocation, rather than a best-effort background task that competes with production workloads for resources.

4.9. A conforming system SHOULD classify checkpoint data by sensitivity tier (e.g., contains personal data, contains financial data, contains safety-critical state) and apply tier-appropriate deletion methods, including cryptographic erasure or secure overwrite for sensitive tiers.
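Cryptographic erasure is one tier-appropriate deletion method: encrypt each checkpoint under its own key, and destroy the key to delete. A sketch using the `cryptography` package's Fernet primitive, with in-memory dictionaries standing in for a KMS and object storage:

```python
from cryptography.fernet import Fernet  # pip install cryptography

class CryptoShreddedStore:
    """Cryptographic erasure (requirement 4.9): each checkpoint is encrypted
    under its own key; deleting the key renders the ciphertext unrecoverable,
    making deletion near-instantaneous regardless of size or storage medium."""

    def __init__(self):
        self._keys: dict = {}    # in practice: an HSM or key management service
        self._blobs: dict = {}   # in practice: object storage

    def put(self, checkpoint_id: str, payload: bytes) -> None:
        key = Fernet.generate_key()
        self._keys[checkpoint_id] = key
        self._blobs[checkpoint_id] = Fernet(key).encrypt(payload)

    def get(self, checkpoint_id: str) -> bytes:
        return Fernet(self._keys[checkpoint_id]).decrypt(self._blobs[checkpoint_id])

    def secure_delete(self, checkpoint_id: str) -> None:
        del self._keys[checkpoint_id]          # the erasure step: key destruction
        self._blobs.pop(checkpoint_id, None)   # ciphertext removal is now best-effort
```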

4.10. A conforming system SHOULD maintain a checkpoint registry that tracks the lifecycle state of every checkpoint (active, eligible for collection, held for regulatory or audit purposes, deleted) and is queryable for compliance reporting.
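The registry's lifecycle states map naturally onto a small state machine. A sketch using the four states named in 4.10; the transition table is an illustrative reading of the requirements (deletion is terminal per 4.4, holds per 4.3 and 4.12):

```python
from enum import Enum

class CheckpointState(str, Enum):
    """Lifecycle states tracked by the checkpoint registry (requirement 4.10)."""
    ACTIVE = "active"
    ELIGIBLE = "eligible"   # retention policy satisfied, no active dependencies
    HELD = "held"           # regulatory, legal, or audit hold in force
    DELETED = "deleted"     # tombstone: deletion logged per 4.4

# Allowed transitions; DELETED is terminal, so deletion is never reversed.
TRANSITIONS = {
    CheckpointState.ACTIVE:   {CheckpointState.ELIGIBLE, CheckpointState.HELD},
    CheckpointState.ELIGIBLE: {CheckpointState.DELETED, CheckpointState.HELD},
    CheckpointState.HELD:     {CheckpointState.ACTIVE, CheckpointState.ELIGIBLE},
    CheckpointState.DELETED:  set(),
}

def transition(current: CheckpointState, new: CheckpointState) -> CheckpointState:
    if new not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {new.value}")
    return new
```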

4.11. A conforming system MAY implement checkpoint compression or deduplication to reduce storage consumption between garbage-collection cycles while preserving the ability to restore from any retained checkpoint.

4.12. A conforming system MAY support legal hold overrides that indefinitely suspend garbage collection for checkpoints associated with specific workflow instances, entities, or time periods, triggered by compliance or legal teams without requiring engineering intervention.

5. Rationale

Checkpoint Garbage-Collection Governance addresses a deceptively mundane but operationally critical risk: the uncontrolled accumulation of persistent execution state. Every modern agent framework that supports multi-step workflows creates checkpoints — snapshots of the agent's execution context that enable recovery from failures, rollback from errors, and resumption after interruptions. These checkpoints are essential for operational resilience, but without governed lifecycle management, they become a growing liability across three dimensions: resource consumption, data protection, and execution safety.

The resource dimension is straightforward but consequential. Checkpoints consume storage. In high-throughput deployments — an agent processing thousands of transactions daily, each checkpointed multiple times — storage accumulation is rapid. A single agent checkpointing every 60 seconds produces 525,600 checkpoints per year. If each checkpoint is 50 KB (modest for a workflow with meaningful state), that is 25 GB per agent per year. An organisation deploying 100 agents accumulates 2.5 TB of checkpoint data annually. Without garbage collection, this grows indefinitely, consuming infrastructure budget and eventually causing storage exhaustion failures. The failure mode of storage exhaustion is particularly dangerous because it is often silent — the checkpoint write fails, the agent continues operating, and the recovery capability the checkpoints were designed to provide silently disappears.
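The figures in this paragraph can be reproduced directly:

```python
# Reproducing the storage-growth figures above.
checkpoints_per_year = 60 * 24 * 365                 # one checkpoint per minute
bytes_per_agent = checkpoints_per_year * 50 * 1024   # 50 KB per checkpoint

print(checkpoints_per_year)                          # 525600
print(f"{bytes_per_agent / 1024**3:.1f} GB")         # 25.1 GB per agent per year
print(f"{100 * bytes_per_agent / 1024**4:.2f} TB")   # 2.45 TB for 100 agents (~2.5 TB)
```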

The data protection dimension is more insidious. Checkpoints capture the agent's execution context, which frequently includes the data being processed: customer personal data, financial transaction details, medical records, authentication tokens, API keys, and other sensitive information. Each checkpoint is a snapshot of this data at a point in time. Data protection regulations — GDPR, CCPA, HIPAA — require that personal data be retained only as long as necessary for the purpose for which it was collected. Checkpoint data is typically collected for the purpose of enabling workflow recovery. Once the workflow is complete, the recovery purpose is fulfilled, and continued retention requires a separate legal basis. Many organisations fail to recognise checkpoints as a data processing activity subject to retention limitations, creating systemic non-compliance that scales with the volume of checkpoint data accumulated.

The execution safety dimension is the most dangerous. Stale checkpoints are not inert — they are resumable execution contexts. An orchestration system that discovers a checkpoint may attempt to resume execution from that point. If the checkpoint is stale — from a retired workflow version, an outdated data model, or an obsolete operational context — resumption can cause the agent to execute actions that are inconsistent with the current operational state. This is not a hypothetical risk; it is a well-documented failure pattern in distributed systems where orphaned state leads to zombie processes executing unintended actions.

The intersection with AG-379 (Workflow State-Machine Integrity Governance) is direct: AG-379 governs the integrity of state transitions during execution; AG-380 governs the integrity of state artefacts after execution phases complete. Together, they ensure that workflow state is correct during execution and properly retired after execution. The intersection with AG-006 (Tamper-Evident Record Integrity) is equally direct: checkpoint deletion must be logged with the same tamper-evidence guarantees as checkpoint creation, ensuring a complete forensic trail of the checkpoint's lifecycle.

The intersection with AG-016 (Data Retention & Right to Erasure) creates a regulatory obligation: checkpoints containing personal data must be subject to the same retention and erasure policies as any other personal data store. The practical challenge is that checkpoints are often created by infrastructure layers that are invisible to the organisation's data governance team. AG-380 makes checkpoint data lifecycle visible and governable.

6. Implementation Guidance

AG-380 establishes the checkpoint retention policy as the central governance artefact for checkpoint lifecycle management. A checkpoint retention policy is a versioned, formally defined specification that determines when checkpoints are created, how long they are retained, under what conditions they become eligible for garbage collection, and how they are deleted. The policy is not optional — it is a structural requirement that prevents unbounded state accumulation, ensures data protection compliance, and eliminates the risk of stale checkpoint resumption.

The retention policy should be stored as structured configuration data in a persistent layer independent of the agent's runtime. The garbage-collection mechanism should operate as an autonomous process with its own resource allocation, monitoring, and alerting. Garbage collection should never depend on the agent's cooperation — it is an infrastructure concern, not an agent concern.

Recommended patterns:

- Event-driven collection: trigger garbage collection on workflow lifecycle events (completion, retirement, supersession) as well as on a schedule, so eligible checkpoints do not accumulate between runs.
- Reference counting: track every active dependency on a checkpoint (rollback capability, regulatory hold, unresolved audit request, pending data-subject access request) and collect only when the count reaches zero.
- Registry-first deletion: record the lifecycle transition and the tamper-evident deletion entry before reclaiming storage, so the forensic trail survives an interrupted deletion.
- Generational tiering: move ageing checkpoints through progressively cheaper storage tiers ahead of final deletion, reserving hot storage for active recovery points.

Anti-patterns to avoid:

- Best-effort background collection that competes with production workloads and silently falls behind the checkpoint creation rate (Scenario C).
- Treating every discovered checkpoint as a valid resumption point, regardless of the age, version, or status of the associated workflow (Scenario A).
- Checkpoint stores that are invisible to the data retention policy engine, leaving embedded personal data outside data lifecycle management (Scenario B).
- Tolerating silent checkpoint write failures, which removes recovery capability without producing any operational signal.

Industry Considerations

Financial Services. Checkpoint retention in financial services must account for multiple overlapping regulatory retention requirements: MiFID II requires retention of records relating to investment services for a minimum of 5 years (7 years in some jurisdictions). SOX requires retention of audit-relevant records for 7 years. Anti-money laundering regulations require retention of transaction records for 5 years after the business relationship ends. Checkpoints that capture transaction state during processing may fall under any or all of these requirements. Financial services firms should implement a checkpoint classification scheme that maps checkpoint content to applicable retention schedules, ensuring that garbage collection respects the longest applicable retention period while not over-retaining beyond it.

Healthcare. Checkpoints in healthcare workflows frequently contain protected health information (PHI) subject to HIPAA retention requirements. The HIPAA Privacy Rule requires that covered entities retain PHI for 6 years from the date of creation or the date when it was last in effect, whichever is later. Checkpoints from clinical workflow agents — capturing patient data during triage, assessment, or treatment planning — must respect this retention floor. Additionally, the HIPAA Security Rule requires that disposal of PHI render it unrecoverable, mapping directly to the secure deletion requirements of AG-380. State-level medical records retention laws may impose longer retention periods.

Safety-Critical Systems. In safety-critical deployments, checkpoint garbage collection must never compromise the system's ability to recover from failures. Garbage-collection processes for CPS agents should be formally verified to ensure that at least the minimum required number of recovery checkpoints is always available. The garbage-collection process itself must be fail-safe: if garbage collection fails mid-execution, it must not leave the checkpoint store in an inconsistent state where some checkpoints are partially deleted. IEC 61508 safety integrity levels inform the rigour required for garbage-collection process verification.
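One way to make the availability invariant concrete is to filter eligibility against a protected set that is never collected. A minimal sketch; the figure of three retained recovery checkpoints is an assumption, since the real minimum comes from the safety case:

```python
def safe_to_collect(eligible_ids, valid_newest_first, min_recovery: int = 3) -> set:
    """Fail-safe filter for safety-critical deployments: no collection pass
    may reduce the set of valid recovery checkpoints below `min_recovery`.
    `valid_newest_first` lists valid checkpoint ids, newest first."""
    protected = set(valid_newest_first[:min_recovery])
    return set(eligible_ids) - protected

# Even if every checkpoint is nominally eligible, the newest three survive:
assert safe_to_collect({"a", "b", "c", "d"}, ["a", "b", "c", "d"]) == {"d"}
```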

Crypto/Web3. Checkpoint state for agents interacting with blockchain systems may include private keys, unsigned transaction data, or intermediate state proofs. Garbage collection of such checkpoints requires cryptographic-grade secure deletion to prevent key material recovery. Additionally, checkpoint retention must account for blockchain finality periods — a checkpoint associated with a pending transaction must not be garbage-collected until the transaction has achieved sufficient confirmation depth.
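A finality-aware eligibility check might look like the following sketch, assuming a hypothetical `chain` client exposing current block height and per-transaction inclusion height; the required depth of 12 confirmations is an illustrative, chain-specific assumption:

```python
def eligible_for_collection(checkpoint, chain, required_depth: int = 12) -> bool:
    """Never collect a checkpoint tied to a transaction that has not yet
    reached sufficient confirmation depth."""
    if checkpoint.tx_hash is None:
        return True                      # no on-chain dependency
    included_at = chain.tx_block_height(checkpoint.tx_hash)
    if included_at is None:
        return False                     # still pending: must be retained
    confirmations = chain.current_height() - included_at + 1
    return confirmations >= required_depth
```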

Maturity Model

Basic Implementation — The organisation has defined checkpoint retention policies for each workflow type specifying maximum age and maximum count. Garbage collection is implemented as a scheduled process (e.g., daily cron job) that scans the checkpoint store, identifies eligible checkpoints, and deletes them. Deletion is logged. Stale checkpoint resumption is prevented by checking workflow status before restoration. This level meets the minimum mandatory requirements but has operational weaknesses: the scheduled nature of garbage collection means checkpoints accumulate between runs, storage utilisation monitoring may not catch rapid accumulation, and the garbage-collection process competes with production workloads for resources.

Intermediate Implementation — Garbage collection operates as a dedicated process with guaranteed resource allocation, triggered by both schedule and workflow lifecycle events. Checkpoints are classified by sensitivity tier, with appropriate deletion methods per tier. A checkpoint registry tracks lifecycle state and is queryable for compliance reporting. Storage utilisation monitoring generates alerts at configurable thresholds. Legal hold capability allows compliance teams to suspend garbage collection for specific entities or time periods. Reference counting prevents garbage collection of checkpoints with active dependencies (rollback, audit, regulatory hold). Deletion logging is tamper-evident per AG-006.

Advanced Implementation — All intermediate capabilities plus: cryptographic erasure is used for sensitive checkpoint tiers, eliminating the need for secure overwrite and enabling near-instantaneous deletion. Generational storage management moves checkpoints through tiers based on age, optimising storage cost. Garbage-collection throughput is automatically scaled based on checkpoint creation rate, ensuring the collection process always keeps pace with creation. Formal verification has confirmed that the minimum required recovery checkpoints are always available even during garbage-collection operations. The organisation can demonstrate to regulators a complete, tamper-evident lifecycle trail for every checkpoint from creation through deletion, including the retention policy that governed each lifecycle transition.

7. Evidence Requirements

Required artefacts:

- The versioned checkpoint retention policy for each workflow type (4.1), including its change history.
- Tamper-evident deletion logs recording checkpoint identifier, deletion timestamp, triggering retention policy, and executing process for every deletion (4.4).
- The checkpoint registry, queryable by lifecycle state, workflow instance, and sensitivity tier (4.10).
- Storage utilisation monitoring configuration and alert history (4.6).
- Records of legal holds and their resolution (4.12).

Retention requirements:

Deletion logs and registry lifecycle records must themselves be retained for at least the longest regulatory schedule applicable to the checkpoints they describe (for example, seven years under SOX, five to seven years under MiFID II), so that the lifecycle trail outlives the checkpoints it documents.

Access requirements:

The checkpoint registry and deletion logs must be queryable for compliance reporting (4.10) and retrievable on regulatory or audit request; deletion capability must be restricted to the authorised garbage-collection process, with every deletion attributable to that process (4.4).

8. Test Specification

Testing AG-380 compliance requires validation that checkpoint garbage collection operates reliably, respects retention policies, prevents stale resumption, and maintains forensic integrity. A comprehensive test programme should include the following tests.

Test 8.1: Retention Policy Enforcement
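A self-contained sketch of what this test might assert, reduced to the eligibility predicate; a real harness would exercise the full collection pipeline:

```python
from datetime import datetime, timedelta, timezone

def is_eligible(created_at: datetime, now: datetime, max_age: timedelta) -> bool:
    """Eligibility under a max-age retention policy (see 4.1)."""
    return now - created_at > max_age

def test_retention_policy_enforcement():
    max_age = timedelta(days=30)
    now = datetime(2026, 4, 1, tzinfo=timezone.utc)
    assert is_eligible(now - timedelta(days=31), now, max_age)      # collected
    assert not is_eligible(now - timedelta(days=29), now, max_age)  # retained
```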

Test 8.2: Active Dependency Protection

Test 8.3: Stale Checkpoint Resumption Prevention

Test 8.4: Deletion Irreversibility and Logging
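This test pairs naturally with the hash-chained log from the 4.4 sketch: verification recomputes every hash and predecessor link, so any altered or removed entry is detected. A sketch of the verifier:

```python
import hashlib
import json

def verify_deletion_log(log: list) -> bool:
    """Recompute each entry's hash and predecessor link in a hash-chained
    deletion log (see the 4.4 sketch); any tampering breaks verification."""
    prev = "0" * 64
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if entry["prev_hash"] != prev or entry["entry_hash"] != expected:
            return False
        prev = entry["entry_hash"]
    return True
```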

Test 8.5: Storage Utilisation Monitoring and Alerting

Test 8.6: Data Protection Compliance for Personal Data Checkpoints

Test 8.7: Garbage-Collection Resource Guarantee Under Load

Conformance Scoring

9. Regulatory Mapping

Regulation | Provision | Relationship Type
EU AI Act | Article 12 (Record-Keeping) | Direct requirement
EU AI Act | Article 9 (Risk Management System) | Supports compliance
GDPR | Article 5(1)(e) (Storage Limitation), Article 17 (Right to Erasure) | Direct requirement
SOX | Section 802 (Criminal Penalties for Altering Documents) | Supports compliance
FCA SYSC | 9.1.1R (Record-Keeping Requirements) | Direct requirement
NIST AI RMF | GOVERN 1.5, MANAGE 2.4 | Supports compliance
ISO 42001 | Clause 6.1 (Actions to Address Risks), Annex B (AI Data Management) | Supports compliance
DORA | Article 12 (Backup Policies and Recovery), Article 9 (ICT Risk Management) | Supports compliance

EU AI Act — Article 12 (Record-Keeping)

Article 12 requires that high-risk AI systems be designed and developed with capabilities enabling the automatic recording of events (logs). This requirement extends to the lifecycle management of those records. Checkpoints, as persistent records of AI system execution state, fall within the scope of Article 12. The regulation requires that logs be "kept for a period that is appropriate in view of the intended purpose of the high-risk AI system and applicable legal obligations." AG-380 implements this requirement by defining formal retention policies that determine checkpoint lifecycle duration based on purpose and legal obligation. The regulation also requires that logging capabilities "ensure a level of traceability of the AI system's functioning throughout its lifecycle" — checkpoint lifecycle logging (creation, retention, holds, deletion) contributes to this traceability requirement.

EU AI Act — Article 9 (Risk Management System)

Article 9 requires risk management systems for high-risk AI systems. Uncontrolled checkpoint accumulation creates risks across multiple dimensions: storage exhaustion (operational risk), data over-retention (compliance risk), and stale resumption (safety risk). AG-380 implements risk mitigation measures for all three risk categories. The risk management system required by Article 9 must identify and address these risks; AG-380 provides the structured control.

GDPR — Article 5(1)(e) and Article 17

Article 5(1)(e) establishes the storage limitation principle: personal data must be "kept in a form which permits identification of data subjects for no longer than is necessary for the purposes for which the personal data are processed." Checkpoints containing personal data are subject to this principle. Once the workflow for which the checkpoint was created is complete, continued retention of personal data within the checkpoint requires a separate legal basis. AG-380 ensures that checkpoint garbage collection is integrated with data retention policies, preventing systematic violation of the storage limitation principle. Article 17 establishes the right to erasure. A data subject exercising this right is entitled to have their personal data deleted from all systems, including checkpoint stores. AG-380's checkpoint registry and sensitivity classification enable the organisation to identify and delete checkpoints containing a specific data subject's personal data in response to erasure requests. Failure to include checkpoint stores in erasure processing is a common compliance gap that AG-380 directly addresses.

SOX — Section 802

Section 802 establishes criminal penalties for the destruction, alteration, or falsification of records. While AG-380 governs the legitimate, policy-driven deletion of checkpoints, the tamper-evident logging requirement ensures that all deletions are attributable and auditable. This protects the organisation from allegations of improper record destruction by demonstrating that every deletion followed the defined retention policy and was executed by an authorised process. SOX auditors examining AI agent operations will ask how the organisation manages the lifecycle of execution records — AG-380 provides the structured answer.

FCA SYSC — 9.1.1R (Record-Keeping Requirements)

SYSC 9.1.1R requires FCA-regulated firms to maintain orderly records of their business and internal organisation. For firms deploying AI agents, checkpoints are business records that document the agent's execution state during business-critical processes. The FCA expects that record-keeping systems have defined retention policies, that records are retrievable for regulatory purposes, and that record disposal follows documented procedures. AG-380 directly implements these requirements for checkpoint records. The FCA's approach to AI governance, articulated through supervisory statements, emphasises that firms must demonstrate control over all data artefacts produced by AI systems, including intermediate execution state.

NIST AI RMF — GOVERN 1.5, MANAGE 2.4

GOVERN 1.5 addresses ongoing monitoring and review of AI governance processes. MANAGE 2.4 addresses mechanisms for managing AI system data throughout its lifecycle. AG-380 supports compliance by establishing governed lifecycle management for checkpoint data, ensuring that AI system execution artefacts are subject to the same data management rigour as other organisational data assets.

ISO 42001 — Clause 6.1, Annex B

Clause 6.1 requires organisations to address risks within the AI management system. Annex B provides guidance on AI data management. Checkpoint accumulation is a data management risk within the AI management system that AG-380 directly mitigates. The formal retention policy and garbage-collection mechanism satisfy ISO 42001's requirement for systematic data lifecycle management within the AI management system.

DORA — Article 12, Article 9

Article 12 requires financial entities to establish backup policies and procedures and to determine recovery methods. Checkpoint management is directly relevant: checkpoints enable recovery, and their governance ensures that recovery capability is maintained (not degraded by storage exhaustion) while resources are managed efficiently. Article 9 requires ICT risk management frameworks. Uncontrolled checkpoint accumulation is an ICT risk — storage exhaustion, data over-retention, and stale state resumption are all ICT risk events that AG-380's governed garbage collection mitigates.

10. Failure Severity

Field | Value
Severity Rating | Critical
Blast Radius | Organisation-wide — extends to regulatory relationships, data subjects, and dependent systems that rely on checkpoint-based recovery capabilities

Consequence chain: Without governed checkpoint garbage collection, the failure manifests through three independent but compounding paths.

First, storage exhaustion: checkpoints accumulate until storage capacity is reached, at which point new checkpoint creation fails. If the failure is silent — as it frequently is in systems that treat checkpoint storage as a non-critical resource — the agent continues operating without recovery capability. The next failure requiring rollback discovers that no valid checkpoint exists, transforming a recoverable incident into an unrecoverable one. The operational impact depends on the agent's domain: in financial services, inability to roll back a transaction sequence; in healthcare, inability to recover a clinical workflow to a safe state; in manufacturing, inability to revert a physical process to a known-good configuration.

Second, data protection violation: checkpoints containing personal data persist beyond their lawful retention period. The violation scales with the number of data subjects processed and the duration of over-retention. A single agent processing 2,500 customer records monthly accumulates 30,000 over-retained records annually, each representing an individual regulatory violation. GDPR fines for systematic storage limitation breaches can reach €10 million or 2% of global annual turnover.

Third, stale resumption: an orphaned checkpoint from a retired workflow, outdated data model, or superseded strategy is accidentally resumed, causing the agent to execute actions based on obsolete state. In financial services, this can result in trades executed against outdated market models; in healthcare, treatments based on superseded clinical data; in manufacturing, production runs using recalled process parameters.

The severity compounds because all three failure paths can activate simultaneously — an organisation with no checkpoint governance simultaneously accumulates storage risk, data protection liability, and stale resumption exposure, with the combined consequence exceeding the sum of individual risks.

Cite this protocol
AgentGoverning. (2026). AG-380: Checkpoint Garbage-Collection Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-380