AG-231: Legal Hold and Preservation Governance

2. Summary

Legal Hold and Preservation Governance requires that when litigation, investigation, regulatory inquiry, or other legal proceedings are reasonably anticipated, all relevant AI agent artefacts — logs, prompts, model versions, configuration snapshots, training data references, decision traces, and output records — are immediately identified, preserved, and protected from routine deletion, modification, or overwriting. AI agents generate vast quantities of ephemeral data that standard retention policies may delete before the data becomes legally critical. This dimension ensures that the organisation can produce complete, unaltered AI agent records when legally compelled to do so, preventing spoliation sanctions, adverse inference rulings, and obstruction findings.

3. Example

Scenario A — Routine Log Rotation Destroys Litigation-Critical Evidence: A customer-facing financial advisory agent provides investment recommendations. A client suffers significant losses and files a negligence claim alleging that the agent's recommendation was unsuitable. The claim is filed 14 months after the recommendation. The organisation's standard log retention policy retains detailed agent interaction logs for 90 days, after which they are rotated to compressed archives retaining only summary records. The detailed prompt history, reasoning trace, and model version that generated the recommendation have been deleted. The organisation cannot reconstruct what information the agent had at decision time, what reasoning it applied, or whether it considered the client's risk profile. The court draws an adverse inference from the missing records: "The defendant's failure to preserve records of the AI system's decision-making process supports the inference that those records would have been unfavourable to the defendant."

What went wrong: The organisation had no mechanism to trigger preservation of AI agent records when litigation became reasonably foreseeable. The standard 90-day retention policy was designed for operational debugging, not legal preservation. When the litigation trigger occurred (client complaint at 6 months), no legal hold was placed on the relevant records. By the time formal discovery commenced, the records had been routinely destroyed. Consequence: Adverse inference ruling, significantly weakened defence, settlement at GBP 2.3 million (vs. estimated GBP 400,000 had the records been available), and regulatory inquiry from the FCA regarding record-keeping obligations.

Scenario B — Model Version Destruction Prevents Forensic Replay: A public-sector benefits determination agent is challenged in judicial review for alleged discriminatory outcomes. The challenge is filed 10 months after the decisions in question. During that period, the agent's model has been retrained 4 times. The organisation preserved the decision logs (inputs and outputs) but did not preserve the model versions that produced those outputs. Without the exact model version, the decisions cannot be forensically replayed (per AG-066). The court cannot verify whether the model was discriminatory because the model no longer exists. The court orders a full re-determination of all 12,400 benefits decisions made during the challenged period using a newly validated model — at a cost of GBP 4.1 million and a 9-month timeline.

What went wrong: The legal hold preserved logs but not the complete set of artefacts required for forensic replay. The model versions were treated as infrastructure (routinely replaced) rather than evidence (requiring preservation). The preservation scope was defined by the legal team's understanding of "relevant records" — which did not include the AI model itself. Consequence: GBP 4.1 million in re-determination costs, 9-month delay affecting 12,400 benefit recipients, judicial criticism of the agency's AI governance practices, and policy requirement for model version retention imposed by the court.

Scenario C — Cross-Jurisdictional Preservation Conflict: An enterprise workflow agent operating across the EU and US is subject to simultaneous legal holds: a US litigation hold requiring preservation of all communications and data related to a contract dispute, and a GDPR data subject deletion request from an EU individual whose data appears in the preserved records. The organisation faces a conflict: US preservation obligations require retention; GDPR requires deletion. Without a governance framework for resolving preservation conflicts, a junior data protection officer processes the deletion request, removing records that are subject to the US litigation hold. The US court imposes spoliation sanctions for destruction of evidence.

What went wrong: No conflict resolution mechanism existed for competing preservation and deletion obligations. The legal hold was not communicated to the data protection team. The deletion workflow did not check for active legal holds before processing. Consequence: Spoliation sanctions in the US proceeding (adverse inference instruction to the jury) and internal disciplinary proceedings for the process failure. Had the erasure request instead been refused or ignored without a documented legal determination, the organisation would have faced GDPR compliance exposure, so neither path was safe without a conflict resolution process.

4. Requirement Statement

Scope: This dimension applies to every AI agent that generates, processes, or retains data that could become relevant to litigation, regulatory investigation, arbitration, internal investigation, or other legal proceedings. In practice, this means all agents that interact with external parties (customers, counterparties, regulators, the public), all agents that make or contribute to decisions affecting individuals (benefits determinations, credit decisions, hiring recommendations), and all agents operating in regulated sectors. The scope covers the full range of AI agent artefacts: interaction logs (prompts, responses, context windows), model versions (weights, architecture, configuration), training data references (not necessarily the training data itself, but sufficient metadata to identify what data was used), governance configuration snapshots (mandate limits, access controls, operational parameters), and decision traces (reasoning chains, tool calls, intermediate outputs). The scope extends to metadata: timestamps, user identifiers, session identifiers, and version identifiers that are necessary to locate and contextualise preserved artefacts.

4.1. A conforming system MUST implement a legal hold mechanism that, upon activation, immediately suspends all routine deletion, rotation, archiving, anonymisation, and modification of identified artefact categories for the duration of the hold.

4.2. A conforming system MUST preserve the complete set of artefacts necessary for forensic replay of agent decisions within the hold scope, including the model version, configuration snapshot, input data, and output data at the time of each decision.

4.3. A conforming system MUST ensure that legal hold activation propagates to all storage systems, backup systems, and archival systems where relevant artefacts may exist, within 4 hours of hold activation.

4.4. A conforming system MUST prevent any process — automated or manual — from modifying or deleting artefacts subject to an active legal hold, including routine log rotation, data retention policy enforcement, GDPR erasure processing, and storage optimisation.

4.5. A conforming system MUST maintain a hold registry recording: the hold identifier, activation date, scope definition (which agents, which artefact types, which time period), the legal matter reference, the activating authority, and the release date (when the hold is lifted).

4.6. A conforming system MUST support hold scope definition at multiple granularities: per-agent, per-user, per-time-period, per-artefact-type, and per-matter, allowing holds to be precisely targeted.

4.7. A conforming system SHOULD implement conflict detection between legal holds and competing data obligations (e.g., GDPR deletion requests) and escalate conflicts to legal counsel for resolution before any action is taken.

4.8. A conforming system SHOULD verify hold effectiveness through periodic integrity checks confirming that artefacts within hold scope remain present, unmodified, and accessible.

4.9. A conforming system SHOULD support pre-litigation preservation triggers — automated or semi-automated identification of events that create a reasonable anticipation of litigation (e.g., formal complaints, regulatory inquiries, demand letters) — that activate holds proactively.

4.10. A conforming system MAY implement custodian notification workflows that inform relevant personnel when their agent interactions are subject to a legal hold and their obligations regarding those records.
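
The registry fields in 4.5 and the scoping granularities in 4.6 can be sketched as a small data model. This is a minimal illustration under stated assumptions, not a prescribed schema: the class names `Artefact`, `HoldScope`, and `HoldRecord`, and all field names, are hypothetical. Treating an empty scope field as "no restriction" is also an assumption made here for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Artefact:
    """Hypothetical artefact descriptor; field names are illustrative."""
    agent_id: str
    user_id: str
    artefact_type: str          # e.g. "log", "model_version", "config"
    created_at: datetime


@dataclass(frozen=True)
class HoldScope:
    """Scope per 4.6. An empty set or a None bound means 'no restriction'."""
    agent_ids: FrozenSet[str] = frozenset()
    user_ids: FrozenSet[str] = frozenset()
    artefact_types: FrozenSet[str] = frozenset()
    period_start: Optional[datetime] = None
    period_end: Optional[datetime] = None

    def covers(self, a: Artefact) -> bool:
        if self.agent_ids and a.agent_id not in self.agent_ids:
            return False
        if self.user_ids and a.user_id not in self.user_ids:
            return False
        if self.artefact_types and a.artefact_type not in self.artefact_types:
            return False
        if self.period_start is not None and a.created_at < self.period_start:
            return False
        if self.period_end is not None and a.created_at > self.period_end:
            return False
        return True


@dataclass
class HoldRecord:
    """Registry entry carrying the fields listed in 4.5."""
    hold_id: str
    matter_reference: str
    activating_authority: str
    scope: HoldScope
    activated_at: datetime
    released_at: Optional[datetime] = None   # None while the hold is in force

    def is_active(self, now: datetime) -> bool:
        return self.released_at is None or now < self.released_at

    def applies_to(self, a: Artefact, now: datetime) -> bool:
        return self.is_active(now) and self.scope.covers(a)
```

Per-matter granularity falls out of keeping one `HoldRecord` per legal matter; an artefact is held if any active record applies to it.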

5. Rationale

AI agents generate a distinctive category of evidence. Unlike traditional software systems where the code is static and deterministic, AI agents produce outputs that depend on a specific model version, specific input context, specific configuration parameters, and potentially stochastic sampling. To reconstruct why an agent produced a particular output, you need not just the input and output logs but the exact model version, the exact configuration, and the exact context window at the time of the decision. If any of these artefacts is destroyed, the decision cannot be forensically replayed — and in legal proceedings, the inability to explain a decision can be as damaging as the decision itself.

The legal obligation to preserve evidence arises when litigation is "reasonably anticipated" — not when a lawsuit is filed. For AI agents, reasonable anticipation may arise from customer complaints, regulatory inquiries, internal audit findings, media coverage, or even the agent's own error logs. The gap between reasonable anticipation and actual legal proceedings can be months or years. During that gap, standard operational processes (log rotation, model retraining, storage optimisation) routinely destroy the very evidence that litigation will require.

Spoliation — the destruction of evidence — carries severe legal consequences. In the US, Federal Rules of Civil Procedure Rule 37(e) permits courts to impose sanctions including adverse inference instructions (telling the jury to assume the destroyed evidence was unfavourable), case-dispositive sanctions (dismissal or default judgment), and monetary penalties. In the UK, CPR Part 31 imposes similar obligations. In the EU, the procedural laws of each member state impose preservation obligations. For AI-specific evidence — model versions, reasoning traces, configuration snapshots — the spoliation risk is acute because these artefacts are not traditionally considered "documents" and may not be captured by standard legal hold processes designed for email and file servers.

6. Implementation Guidance

The legal hold mechanism for AI agents must be broader than traditional e-discovery holds because AI agent artefacts are more diverse, more distributed, and more ephemeral than traditional business records.

Recommended patterns:

Artefact-aware hold taxonomy. Define the complete taxonomy of AI agent artefacts subject to preservation: interaction logs, model versions (weights + architecture + tokeniser), training data manifests, configuration snapshots, governance mandate versions, tool call records, reasoning traces, and output records. For each artefact type, define: the storage location, the retention policy absent a hold, the preservation mechanism (suspend deletion, copy to immutable store, or snapshot), and the verification method (hash comparison, existence check, integrity seal per AG-006).
Hold propagation engine. Implement an automated propagation mechanism that, upon hold activation, identifies all storage systems containing in-scope artefacts and applies preservation locks. For distributed systems, this requires a registry of all artefact storage locations (databases, object stores, model registries, log aggregators, backup systems). The propagation must complete within the 4-hour window specified in requirement 4.3. Each propagation target should confirm receipt and application of the hold.
Immutable preservation store. Copy in-scope artefacts to a write-once, read-many (WORM) store at hold activation time. This provides defence-in-depth: even if the operational system's hold lock fails (e.g., due to a misconfigured retention policy), the preservation copy exists. The WORM store should be tamper-evident per AG-006, with cryptographic integrity seals on each preserved artefact.
Conflict resolution workflow. Implement a workflow that detects conflicts between legal holds and competing data obligations. When a GDPR erasure request targets data subject to an active legal hold, the workflow pauses the erasure, notifies legal counsel, and requires a documented legal determination before proceeding. The determination should reference the specific legal basis for prioritising preservation over erasure (or vice versa).
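
The hold propagation pattern above can be sketched as follows. This is an illustrative outline only: `propagate_hold`, `PropagationReport`, and the idea that each storage system exposes a callable returning `True` once its lock is applied are all assumptions, not part of any named product or of AG-231 itself. The 4-hour window is taken from requirement 4.3.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, List

PROPAGATION_WINDOW = timedelta(hours=4)  # window stated in requirement 4.3


@dataclass
class PropagationReport:
    hold_id: str
    deadline: datetime
    confirmed: Dict[str, datetime] = field(default_factory=dict)  # target -> time
    failed: List[str] = field(default_factory=list)

    @property
    def complete(self) -> bool:
        """True only if every target confirmed on or before the deadline."""
        on_time = all(t <= self.deadline for t in self.confirmed.values())
        return not self.failed and on_time


def _utc_now() -> datetime:
    return datetime.now(timezone.utc)


def propagate_hold(
    hold_id: str,
    activated_at: datetime,
    targets: Dict[str, Callable[[str], bool]],
    clock: Callable[[], datetime] = _utc_now,
) -> PropagationReport:
    """Apply the hold to each registered storage target and record confirmations.

    `targets` maps a storage system name to a hypothetical callable that applies
    a preservation lock and returns True once the lock is in place.
    """
    report = PropagationReport(hold_id, activated_at + PROPAGATION_WINDOW)
    for name, apply_lock in targets.items():
        try:
            locked = apply_lock(hold_id)
        except Exception:
            locked = False          # an unreachable target counts as unconfirmed
        if locked:
            report.confirmed[name] = clock()
        else:
            report.failed.append(name)
    return report
```

A real engine would also retry failed targets and alert before the deadline passes; those steps are left out to keep the sketch short. The report doubles as the propagation confirmation record that an auditor would ask for.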

Anti-patterns to avoid:

Preserving logs but not models. The most common AI-specific preservation failure. Without the model version, logs show inputs and outputs but cannot explain why the agent produced a given output. Forensic replay (AG-066) requires the complete computational state, not just the input/output record.
Relying on backups as preservation. Backup systems are designed for disaster recovery, not legal hold. Backups are routinely overwritten, rotated, and pruned. A legal hold on a production database does not automatically extend to backup tapes. Backups may also lack the granularity needed — a nightly backup cannot preserve an artefact that was created and deleted within the same day.
Manual hold processes at AI-agent scale. Traditional legal hold involves emailing custodians with preservation instructions. AI agent artefacts are not controlled by custodians — they are generated and managed by automated pipelines. A manual process cannot reliably hold artefacts across model registries, log aggregators, configuration management systems, and training data stores. Automated propagation is essential.
Indefinite holds without periodic review. Legal holds that are never released accumulate artefacts indefinitely, creating storage cost, data protection risk (retaining personal data beyond necessity), and operational burden. Holds should be reviewed periodically (at least quarterly) and released promptly when the legal basis for preservation no longer exists.
Treating anonymisation as preservation. Anonymising records subject to a legal hold may constitute spoliation if the anonymisation destroys information relevant to the legal matter. Preservation means retaining the artefact in its original, unaltered form.
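
To avoid the indefinite-hold anti-pattern, a scheduled job can flag holds whose last review is overdue. The sketch below is a minimal illustration; the function name, the input mapping of hold ID to last review date, and the approximation of "quarterly" as 90 days are all assumptions.

```python
from datetime import date, timedelta
from typing import Dict, List

REVIEW_INTERVAL = timedelta(days=90)   # assumption: "quarterly" taken as 90 days


def holds_due_for_review(
    last_reviewed: Dict[str, date],
    today: date,
    interval: timedelta = REVIEW_INTERVAL,
) -> List[str]:
    """Return IDs of active holds last reviewed at least `interval` ago."""
    return sorted(
        hold_id
        for hold_id, reviewed in last_reviewed.items()
        if today - reviewed >= interval
    )
```

Flagging a hold only prompts a review; the decision to release it remains with the authority that activated it.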

Industry Considerations

Financial Services. FCA rules (SYSC 9) and MiFID II Article 16(6-7) require retention of records of all services, activities, and transactions for at least 5 years, rising to up to 7 years for MiFID II records where the competent authority requests it. For AI agents providing investment advice, this includes the complete interaction record and the basis for each recommendation. Legal holds may extend these periods further. The FCA has indicated that it expects firms to be able to reconstruct the basis for AI-generated advice at any historical point.

Healthcare. Medical records retention periods are typically 10+ years (30 years for some categories). AI-assisted clinical decisions must be reconstructable for the full retention period. Legal holds arising from malpractice claims must preserve the complete AI decision support record, including the model version and clinical guidelines the model was trained on.

Public Sector. Freedom of information requests, judicial reviews, and public inquiries can create preservation obligations with broad scope. Public sector agencies must preserve AI agent records that could be relevant to government accountability, including records of how AI was used in decisions affecting individuals' rights or entitlements.

Maturity Model

Basic Implementation — The organisation has a documented legal hold process that extends to AI agent interaction logs. Upon hold activation, a manual process identifies the relevant log stores and suspends rotation for the specified time period and agent. Model versions are preserved in the model registry but not explicitly linked to the legal hold. Configuration snapshots are preserved if they exist but are not routinely generated. This level meets minimum preservation obligations for interaction records but creates forensic replay gaps due to incomplete artefact coverage.

Intermediate Implementation — The organisation has an automated hold propagation engine that, upon activation, identifies and locks all in-scope artefact types (logs, models, configurations, training data manifests) across all storage systems. A hold registry tracks all active holds with scope definitions and activation/release dates. Conflict detection identifies competing obligations (e.g., GDPR erasure vs. litigation hold) and escalates to legal counsel. Periodic integrity checks verify that held artefacts remain present and unmodified. Hold scoping supports per-agent, per-user, per-period, per-artefact-type, and per-matter granularity, matching requirement 4.6.

Advanced Implementation — All intermediate capabilities plus: pre-litigation preservation triggers automatically identify events that create reasonable anticipation of legal proceedings and activate holds proactively. Artefacts are copied to a WORM preservation store with tamper-evident integrity seals per AG-006. The system supports forensic replay (AG-066) directly from preserved artefacts, verifying that the preserved model version, configuration, and input data reproduce the original output. Cross-jurisdictional hold management resolves conflicts between jurisdictions with different preservation and deletion obligations. The organisation can demonstrate to any court or regulator that no relevant artefact was destroyed, modified, or made inaccessible after the preservation obligation arose.

7. Evidence Requirements

Required artefacts:

Legal hold policy. The organisation's documented policy defining: when legal holds are triggered, who has authority to activate and release holds, what artefact types are in scope, and how conflicts with competing obligations are resolved.
Hold registry. The complete registry of all active and historical legal holds, including hold ID, activation date, scope, legal matter reference, activating authority, and release date.
Propagation confirmation records. Records demonstrating that each hold was propagated to all relevant storage systems within the required timeframe, with confirmation from each storage system.
Integrity verification records. Periodic verification records confirming that held artefacts remain present, unmodified, and accessible throughout the hold period.
Conflict resolution records. Records of all conflicts between legal holds and competing obligations, including the legal determination made and the basis for it.

Retention requirements:

Hold registry and propagation records: minimum 7 years after hold release for regulated sectors; minimum 3 years otherwise. Preserved artefacts: retained for the duration of the hold plus the applicable retention period after release.
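
The minimum retention date for registry and propagation records follows directly from the release date. The helper below is a small sketch of that arithmetic; the function names are hypothetical, and mapping 29 February to 28 February in a non-leap target year is a choice made here, not something the protocol specifies.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping 29 Feb to 28 Feb when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:              # 29 Feb in a non-leap target year
        return d.replace(year=d.year + years, day=28)


def registry_retention_until(release_date: date, regulated: bool) -> date:
    """Earliest date on which hold registry records may be disposed of."""
    return add_years(release_date, 7 if regulated else 3)
```

For example, a hold released on 1 June 2025 in an unregulated sector yields a minimum retention date of 1 June 2028.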

Access requirements:

Producible to courts, regulators, or opposing counsel (pursuant to appropriate legal process) within 48 hours. Must include chain-of-custody evidence demonstrating that artefacts have not been modified since preservation.

8. Test Specification

Test 8.1: Hold Activation and Propagation

Stimulus: Activate a legal hold specifying a particular agent, artefact types (logs, model version, configuration), and time period (last 6 months). Verify propagation to all relevant storage systems.
Expected behaviour: Within 4 hours, all identified storage systems confirm that deletion, rotation, and modification of in-scope artefacts are suspended.
Pass criteria: All storage systems holding in-scope artefacts confirm the hold. A subsequent automated retention policy execution does not delete held artefacts.
Fail criteria: Any storage system fails to apply the hold, or any held artefact is deleted by routine processes after hold activation.

Test 8.2: Deletion Prevention Under Active Hold

Stimulus: With an active legal hold in place, trigger the routine log rotation process, a GDPR erasure request for in-scope data, and a manual deletion request for in-scope artefacts.
Expected behaviour: All three deletion attempts are blocked. The log rotation skips held artefacts. The GDPR erasure request is escalated to the conflict resolution workflow. The manual deletion is rejected with a reference to the active hold.
Pass criteria: No held artefact is deleted, modified, or anonymised through any mechanism while the hold is active.
Fail criteria: Any held artefact is deleted, modified, or anonymised through any channel.
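
The three expected behaviours in Test 8.2 amount to a single guard that every deletion path consults before acting. The following is a hedged sketch: `guard_deletion`, the `Source` and `Outcome` enums, and the mapping of artefact ID to hold ID are illustrative names, not an interface defined by AG-231.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Mapping


class Source(Enum):
    LOG_ROTATION = "log_rotation"
    GDPR_ERASURE = "gdpr_erasure"
    MANUAL = "manual"


class Outcome(Enum):
    PROCEED = "proceed"      # artefact is not held; deletion may go ahead
    SKIP = "skip"            # routine rotation passes over the held artefact
    ESCALATE = "escalate"    # erasure is paused and sent to conflict resolution
    REJECT = "reject"        # manual deletion is refused, citing the hold


@dataclass(frozen=True)
class Decision:
    outcome: Outcome
    may_delete: bool
    reason: str


# How each deletion path reacts when the target artefact is under hold.
_HELD_OUTCOME = {
    Source.LOG_ROTATION: Outcome.SKIP,
    Source.GDPR_ERASURE: Outcome.ESCALATE,
    Source.MANUAL: Outcome.REJECT,
}


def guard_deletion(artefact_id: str, source: Source, held: Mapping[str, str]) -> Decision:
    """Decide whether a deletion may run. `held` maps artefact ID to hold ID."""
    hold_id = held.get(artefact_id)
    if hold_id is None:
        return Decision(Outcome.PROCEED, True, "no active hold")
    return Decision(_HELD_OUTCOME[source], False, f"active legal hold {hold_id}")
```

Routing all three paths through one function keeps the "no held artefact is deleted through any channel" property in a single place that can be tested directly.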

Test 8.3: Artefact Completeness for Forensic Replay

Stimulus: Activate a hold on a specific agent decision. Verify that the preserved artefacts include: the interaction log (prompt and response), the model version (weights, architecture, tokeniser), the configuration snapshot, and any tool call records.
Expected behaviour: All artefact types necessary for forensic replay are identified, preserved, and accessible.
Pass criteria: A forensic replay using only the preserved artefacts reproduces the original agent output (or, for stochastic models, produces an output within the documented variance range).
Fail criteria: Any artefact type necessary for forensic replay is missing from the preserved set.
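
Before attempting a replay, a completeness check can report which artefact types are absent from the preserved set. This sketch assumes artefact types are identified by simple string labels; the labels and the function name are hypothetical. Presence is necessary but not sufficient: the pass criterion still requires an actual replay.

```python
from typing import AbstractSet, List

# Artefact types Test 8.3 lists for every decision (labels are illustrative).
BASE_REPLAY_ARTEFACTS = frozenset({
    "interaction_log",
    "model_weights",
    "model_architecture",
    "tokeniser",
    "configuration_snapshot",
})


def missing_for_replay(preserved: AbstractSet[str], used_tools: bool) -> List[str]:
    """Return the artefact types still needed before replay can be attempted."""
    required = set(BASE_REPLAY_ARTEFACTS)
    if used_tools:
        # Tool call records matter only when the decision invoked tools.
        required.add("tool_call_records")
    return sorted(required - set(preserved))
```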

Test 8.4: Hold Scope Precision

Stimulus: Activate a hold on Agent-A for a specific 3-month period. Verify that artefacts from Agent-B and artefacts from Agent-A outside the specified period are not affected.
Expected behaviour: Only artefacts matching the hold scope (Agent-A, specified period) are preserved. Artefacts outside scope continue to be subject to routine retention policies.
Pass criteria: In-scope artefacts are preserved; out-of-scope artefacts are unaffected by the hold.
Fail criteria: The hold is applied too broadly (affecting out-of-scope artefacts) or too narrowly (missing in-scope artefacts).

Test 8.5: Conflict Detection and Escalation

Stimulus: With an active legal hold on records including a specific data subject's interactions, submit a GDPR Article 17 erasure request for that data subject.
Expected behaviour: The system detects the conflict between the hold and the erasure request, pauses the erasure, and escalates to legal counsel with both the hold details and the erasure request.
Pass criteria: The erasure is not processed. The conflict is escalated with sufficient information for legal determination. The data subject is not told that the erasure has been completed until the conflict is resolved.
Fail criteria: The erasure is processed (destroying held evidence), or the conflict is not detected.
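
The escalation behaviour in Test 8.5 can be sketched as a small workflow in which an erasure touching held data stays pending until counsel records a determination and its legal basis. The class and method names below are assumptions for illustration, and the in-memory lists stand in for real erasure and case-management systems.

```python
from dataclasses import dataclass
from typing import Dict, List, Mapping, Optional


@dataclass
class Conflict:
    request_id: str
    data_subject: str
    hold_ids: List[str]
    determination: Optional[str] = None   # "preserve" or "erase", set by counsel
    legal_basis: Optional[str] = None


class ErasureWorkflow:
    """Pauses erasure requests that touch held data until counsel decides."""

    def __init__(self, holds_by_subject: Mapping[str, List[str]]):
        self._holds = holds_by_subject
        self.conflicts: Dict[str, Conflict] = {}
        self.erased: List[str] = []       # data subjects whose data was erased

    def submit(self, request_id: str, data_subject: str) -> str:
        hold_ids = list(self._holds.get(data_subject, []))
        if not hold_ids:
            self.erased.append(data_subject)
            return "erased"
        # Conflict detected: record it and leave the data untouched.
        self.conflicts[request_id] = Conflict(request_id, data_subject, hold_ids)
        return "escalated"

    def resolve(self, request_id: str, determination: str, legal_basis: str) -> str:
        if determination not in ("preserve", "erase"):
            raise ValueError("determination must be 'preserve' or 'erase'")
        if not legal_basis.strip():
            raise ValueError("a documented legal basis is required")
        conflict = self.conflicts[request_id]
        conflict.determination = determination
        conflict.legal_basis = legal_basis
        if determination == "erase":
            self.erased.append(conflict.data_subject)
            return "erased"
        return "preserved"
```

The retained `Conflict` objects serve as the conflict resolution records listed under the evidence requirements.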

Test 8.6: Integrity Verification

Stimulus: After a hold has been active for 30 days, run the periodic integrity check. Then, attempt to modify a held artefact through direct database access (bypassing application controls).
Expected behaviour: The periodic integrity check confirms all held artefacts are present and unmodified. The direct modification attempt is either blocked (if WORM storage is used) or detected by the next integrity check (if integrity seals per AG-006 are used).
Pass criteria: Integrity verification detects any modification to held artefacts. The modification is logged and alerted.
Fail criteria: A modification to a held artefact goes undetected.
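
A periodic integrity check of the kind Test 8.6 exercises can be approximated by comparing each held artefact against a digest recorded at preservation time. The sketch below uses plain SHA-256 digests from Python's standard `hashlib`; it is a stand-in for, not an implementation of, the AG-006 integrity seals, whose format is not defined in this protocol. The function names and the `read` callable are assumptions.

```python
import hashlib
from typing import Callable, Dict, List, Mapping, Optional


def seal(content: bytes) -> str:
    """Digest recorded when the artefact is first preserved."""
    return hashlib.sha256(content).hexdigest()


def verify_held(
    manifest: Mapping[str, str],
    read: Callable[[str], Optional[bytes]],
) -> Dict[str, List[str]]:
    """Compare held artefacts with their recorded digests.

    `manifest` maps artefact ID to the digest taken at preservation time.
    `read` returns the artefact's current bytes, or None if it is absent.
    """
    missing: List[str] = []
    modified: List[str] = []
    for artefact_id, expected in manifest.items():
        content = read(artefact_id)
        if content is None:
            missing.append(artefact_id)
        elif seal(content) != expected:
            modified.append(artefact_id)
    return {"missing": sorted(missing), "modified": sorted(modified)}
```

The manifest itself must be stored where the same attacker or faulty process cannot rewrite it, otherwise a modification and its digest could be changed together and go undetected.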

Conformance Scoring

Score 0: No legal hold mechanism exists for AI agent artefacts — routine retention policies apply regardless of legal proceedings.
Score 1: Manual legal hold process exists for interaction logs, but model versions, configurations, and training data manifests are not covered — partial preservation with forensic replay gaps.
Score 2: Automated hold propagation covers all artefact types across all storage systems, with deletion prevention and conflict detection — complete preservation with structural enforcement.
Score 3: Verified by independent testing, with WORM preservation, tamper-evident integrity seals, pre-litigation triggers, and demonstrated forensic replay capability from preserved artefacts — court-ready preservation.

9. Regulatory Mapping

Regulation	Provision	Relationship Type
US FRCP	Rule 37(e) (Failure to Preserve ESI)	Direct requirement
UK CPR	Part 31 (Disclosure and Inspection), Practice Direction 31B	Direct requirement
EU AI Act	Article 12 (Record-Keeping), Article 20 (Corrective Actions)	Supports compliance
GDPR	Article 17 (Right to Erasure) — in tension	Conflict management
MiFID II	Article 16(6-7) (Record-Keeping)	Direct requirement
FCA SYSC	9.1 (General Rules on Record-Keeping)	Direct requirement
SOX	Section 802 (Criminal Penalties for Document Destruction)	Direct requirement
NIST AI RMF	GOVERN 1.5, MANAGE 4.2	Supports compliance

US FRCP — Rule 37(e)

Rule 37(e) governs the consequences of failing to preserve electronically stored information (ESI) that should have been preserved in anticipation of litigation. Where the loss prejudices another party, the court may order measures no greater than necessary to cure the prejudice. Where the party acted with intent to deprive another party of the information's use in the litigation, the court may impose severe sanctions, including adverse inference instructions and case-dispositive sanctions. For AI agent artefacts, the rule applies to all ESI: logs, model files, configuration data, and reasoning traces all qualify. The rule is engaged only where a party failed to take reasonable steps to preserve the information, so affirmative preservation measures matter, not merely the absence of intentional destruction. AG-231's automated hold mechanism is intended to help demonstrate those reasonable steps.

UK CPR — Part 31 and Practice Direction 31B

Practice Direction 31B specifically addresses the disclosure of electronic documents and imposes obligations to preserve documents that may be relevant to proceedings that are reasonably anticipated. The definition of "document" in CPR 31.4 is broad and includes "anything in which information of any description is recorded." AI model versions, configuration files, and reasoning traces all fall within this definition. AG-231's hold mechanism implements the preservation obligations that PD31B requires.

GDPR — Article 17

Article 17 grants data subjects the right to erasure of their personal data. However, Article 17(3)(e) provides an exemption where processing is necessary for the establishment, exercise, or defence of legal claims. This exemption provides the legal basis for retaining personal data subject to a legal hold despite an erasure request, but only for data that is genuinely relevant to the legal claim. AG-231's conflict resolution workflow implements the balancing required between preservation and erasure obligations.

SOX — Section 802

Section 802 imposes criminal penalties (up to 20 years imprisonment) for the destruction, alteration, or falsification of records with the intent to impede a federal investigation or bankruptcy proceeding. For AI agent artefacts that constitute records of financial reporting processes, this creates a criminal dimension to preservation obligations. AG-231's tamper-evident preservation mechanisms provide evidence that records were not altered after preservation.

MiFID II — Article 16(6-7) and FCA SYSC 9

MiFID II and FCA SYSC 9 require firms to retain records of all services, activities, and transactions sufficient to enable the competent authority to fulfil its supervisory tasks. For AI agents providing investment services, this includes the complete basis for each recommendation or decision. The records must be retained in a medium that allows them to be stored in a way accessible for future reference. AG-231's preservation mechanisms ensure that legal holds extend the standard retention period when legal proceedings require it.

10. Failure Severity

Field	Value
Severity Rating	Critical
Blast Radius	Case-specific, but with potential precedent-setting consequences

Consequence chain: Failure to preserve AI agent artefacts when legally required creates spoliation exposure. In US litigation, spoliation can result in adverse inference instructions (the jury is told to assume the destroyed evidence was unfavourable), monetary sanctions, or case-dispositive sanctions (dismissal or default judgment). In UK proceedings, failure to preserve disclosable documents can result in costs sanctions, adverse inferences, and contempt of court. The financial impact is case-specific but can be substantial: in a claim where the organisation's defence depends on demonstrating that the agent's reasoning was sound, the inability to produce the agent's reasoning record eliminates the defence entirely. The organisation is left to settle on the claimant's terms. For regulated entities, preservation failures also trigger regulatory consequences: FCA enforcement for inadequate record-keeping (SYSC 9 breach), SOX criminal exposure for obstruction (Section 802), and reputational damage that extends beyond the individual case to the organisation's credibility as a deployer of AI systems.

Cross-references: AG-006 (Tamper-Evident Record Integrity) provides the integrity mechanisms that ensure preserved artefacts have not been modified since preservation. AG-066 (Forensic Replay and Evidence Preservation) defines the capability that legal hold enables — the ability to replay an agent's decision using the preserved artefacts. AG-232 (Privilege and Confidential Review Segregation Governance) addresses the segregation of privileged material within preserved records. AG-235 (Evidence Admissibility Governance) addresses the requirements for preserved records to be admissible in legal proceedings. AG-229 (Jurisdictional Applicability Mapping Governance) determines which jurisdiction's preservation obligations apply to a given set of records.

Cite this protocol

AgentGoverning. (2026). AG-231: Legal Hold and Preservation Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-231
