AG-339

Model Weight Custody Governance

Model Provenance, Training & Adaptation · AGS v2.1 · April 2026
EU AI Act · GDPR · FCA · NIST · ISO 42001

2. Summary

Model Weight Custody Governance requires that organisations establish and maintain a formal chain of custody for all model weight artefacts — from initial receipt or creation through storage, transfer, deployment, and eventual decommissioning. Model weights are the operational core of an AI agent: whoever possesses the weights can replicate, modify, or extract the model's capabilities. Without rigorous custody controls, weights can be exfiltrated, tampered with, or deployed from unauthorised copies, creating risks that range from intellectual property theft to deployment of compromised models in production. This dimension ensures that every weight file is tracked, access-controlled, integrity-verified, and auditable at every stage of its lifecycle.

3. Example

Scenario A — Weight Exfiltration Through Unsecured Transfer: An organisation trains a proprietary language model at a cost of £4.2 million over eight months. The final weight checkpoint (14.7 GB) is transferred from the training cluster to the deployment environment via an unencrypted S3 bucket with overly broad IAM permissions. A departing engineer copies the weights to a personal device before leaving. Six months later, a competitor launches a suspiciously similar product. Forensic analysis reveals the competitor's model produces identical outputs to the proprietary model on a curated evaluation set — a statistical impossibility without access to the original weights or a near-identical training run.

What went wrong: No custody controls existed for weight transfers. The S3 bucket was configured for convenience rather than security. No access logging was enabled for the weight files. No integrity check detected the unauthorised copy. The organisation cannot prove when the exfiltration occurred or definitively attribute it. Consequence: £4.2 million in R&D investment compromised, competitive advantage lost, litigation costs estimated at £800,000, and legal exposure under trade secret law.

Scenario B — Tampered Weights Deployed to Production: A financial services firm deploys an AI agent for credit scoring using model weights downloaded from an internal model registry. An attacker who has compromised a CI/CD pipeline modifies the weight file during the build stage, introducing a subtle bias that approves loans for specific postcodes regardless of creditworthiness. The tampered weights pass basic smoke tests because the modifications affect fewer than 0.3% of decisions. Over four months, the firm approves £12.6 million in high-risk loans concentrated in targeted postcodes. The anomaly is detected only during a quarterly model performance review.

What went wrong: No cryptographic integrity verification was performed when weights were loaded into the deployment container. The CI/CD pipeline was trusted implicitly. No hash comparison against a signed manifest occurred at deployment time. Consequence: £12.6 million in excess credit exposure, potential FCA enforcement action for inadequate model risk management, and reputational damage when affected borrowers default.

Scenario C — Orphaned Weight Copies Across Environments: A research team iterates through 47 model checkpoints over three months of experimentation. Checkpoints are saved to local NFS shares, personal workstations, and multiple cloud storage buckets. When the team selects the final checkpoint for production, no definitive record exists of which checkpoint is deployed where. Six months later, a compliance audit asks the organisation to demonstrate which exact model weights are serving production traffic. The team cannot answer with certainty — the deployed weights might be checkpoint 43 or checkpoint 45, and three copies of unknown provenance exist on development servers.

What went wrong: No custody registry tracked weight artefact locations. No lifecycle management ensured decommissioning of superseded checkpoints. No deployment-time binding between the weight manifest and the production environment existed. Consequence: Audit failure, inability to demonstrate model provenance to regulators, and six weeks of remediation effort to re-establish which weights are actually deployed.

4. Requirement Statement

Scope: This dimension applies to any organisation that stores, transfers, deploys, or decommissions model weight files for AI agents that operate in production or pre-production environments with access to real data or real systems. It covers all forms of model weights: full-precision checkpoints, quantised variants, distilled models, adapter weights (LoRA, QLoRA), merged weights, and any derivative artefact that encodes learned parameters. The scope includes weights received from third-party providers, weights generated through internal training, and weights produced through fine-tuning or adaptation of base models. Weights used exclusively in isolated research sandboxes with no path to production may be excluded, provided the organisation can demonstrate that no mechanism exists for those weights to reach a production environment without passing through the custody controls defined here.

4.1. A conforming system MUST maintain a custody registry that records the location, custodian, integrity hash, and access permissions for every model weight artefact in the organisation's possession.

4.2. A conforming system MUST verify the cryptographic integrity of model weights at every custody transition — including creation, transfer between environments, deployment to production, and restoration from backup.

4.3. A conforming system MUST enforce access controls on model weight storage that restrict read, write, and copy operations to explicitly authorised personnel and systems.

4.4. A conforming system MUST log all access to model weight files, including read operations, with sufficient detail to reconstruct the complete access history for any weight artefact.

4.5. A conforming system MUST encrypt model weights at rest and in transit using encryption standards appropriate to the classification of the model (e.g., AES-256 for storage, TLS 1.3 for transfer).

4.6. A conforming system MUST implement a decommissioning process that securely deletes superseded or retired weight artefacts and confirms deletion across all known copies.

4.7. A conforming system SHOULD bind deployed weights to a signed manifest that can be verified at runtime to confirm that the weights in memory match the approved deployment artefact.

4.8. A conforming system SHOULD implement anomaly detection on weight access patterns to identify unusual download volumes, access from unexpected locations, or access outside normal operational windows.

4.9. A conforming system MAY implement hardware-backed key management (e.g., HSM or TPM) for weight encryption keys in environments where the model represents significant intellectual property or where regulatory requirements demand it.
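The manifest binding described in 4.7 can be sketched as follows. This is a minimal illustration, not a prescribed schema: HMAC-SHA256 stands in for a real asymmetric signature (e.g. Ed25519 backed by an HSM, per 4.9), and all field and function names are assumptions.

```python
import hashlib
import hmac
import json

def sign_manifest(manifest: dict, signing_key: bytes) -> str:
    # Canonicalise the manifest before signing so verification is deterministic.
    payload = json.dumps(manifest, sort_keys=True).encode()
    return hmac.new(signing_key, payload, hashlib.sha256).hexdigest()

def verify_weights_at_load(weight_bytes: bytes, manifest: dict,
                           signature: str, signing_key: bytes) -> bool:
    # 1. The manifest itself must carry a valid signature.
    if not hmac.compare_digest(sign_manifest(manifest, signing_key), signature):
        return False
    # 2. The weights in memory must hash to the approved digest.
    return hashlib.sha256(weight_bytes).hexdigest() == manifest["sha256"]

weights = b"\x00\x01fake-weight-tensor-bytes"
manifest = {"artefact_id": "ckpt-45",
            "sha256": hashlib.sha256(weights).hexdigest()}
key = b"deployment-signing-key"
sig = sign_manifest(manifest, key)
```

A deployment container would run `verify_weights_at_load` immediately after loading the artefact and refuse to serve traffic on failure.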

5. Rationale

Model weights are the most valuable and most vulnerable artefact in an AI system. They encode the entirety of a model's learned behaviour — billions of parameters refined through training runs costing millions of pounds and months of compute time. Unlike source code, which can be audited line by line, model weights are opaque: a single bit-flip in the right location can alter behaviour in ways that are invisible to standard evaluation but exploitable in practice.

The custody challenge is compounded by the operational reality of modern ML pipelines. Weights are routinely copied between training clusters, evaluation environments, staging systems, and production deployments. Each copy creates a potential exfiltration point or tampering opportunity. Without a formal chain of custody, organisations cannot answer fundamental governance questions: Which weights are deployed in production right now? Who has accessed these weights in the last 90 days? Are these weights identical to the ones that passed our safety evaluation?

The threat model includes both insider and external actors. Insiders with legitimate access to training infrastructure may copy weights for unauthorised purposes — a risk documented in multiple industry incidents involving departing employees. External attackers who compromise CI/CD pipelines or model registries can substitute tampered weights that pass basic functional tests while introducing subtle behavioural modifications. State-sponsored actors have demonstrated interest in acquiring proprietary model weights as a form of economic espionage.

AG-339 establishes custody as a first-class governance concern — not an afterthought delegated to general IT asset management. Model weights require specialised custody controls because they are large (making exfiltration detectable if monitored), opaque (making tampering difficult to detect without cryptographic verification), and uniquely valuable (making them high-priority targets).

6. Implementation Guidance

Effective model weight custody requires controls across four lifecycle phases: creation, storage, transfer, and decommissioning.

Creation phase. When weights are produced by a training run, the training pipeline should automatically compute a cryptographic hash (SHA-256 or stronger) of the complete weight artefact, sign it with the training system's key, and register the hash, signature, and metadata (training job ID, timestamp, hyperparameters reference, training data manifest reference) in the custody registry. The weights should be written directly to an access-controlled storage location — never to a temporary uncontrolled location first.
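The creation-phase steps above can be sketched as a small registration routine. The JSON-lines registry format, field names, and function names are illustrative assumptions; a production system would also sign the record with the training system's key.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path

def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    # Stream the file in 1 MiB chunks so multi-GB checkpoints fit in memory.
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def register_checkpoint(weight_path: Path, registry_path: Path,
                        training_job_id: str, custodian: str) -> dict:
    record = {
        "artefact": weight_path.name,
        "sha256": sha256_file(weight_path),
        "training_job_id": training_job_id,
        "custodian": custodian,
        "registered_at": datetime.now(timezone.utc).isoformat(),
    }
    # Append-only registry: one JSON record per custody event.
    with registry_path.open("a") as reg:
        reg.write(json.dumps(record) + "\n")
    return record

with tempfile.TemporaryDirectory() as tmp:
    wp = Path(tmp) / "model.safetensors"
    wp.write_bytes(b"demo-weight-bytes")
    rec = register_checkpoint(wp, Path(tmp) / "registry.jsonl",
                              "job-0042", "ml-platform")
```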

Storage phase. Weight files should be stored in a dedicated, access-controlled repository — not mixed with general data storage. Access controls should enforce the principle of least privilege: most users need only read access to deploy, not write access to modify. All access should be logged with identity, timestamp, operation type, and source IP. Encryption at rest should be mandatory, with keys managed through a dedicated key management system rather than stored alongside the weights.
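The access log described above also feeds the anomaly detection recommended in 4.8. The following is a toy screen over such a log; the thresholds, field names, and operational window are assumptions to be tuned per deployment.

```python
from datetime import datetime

def flag_anomalies(entries, ops_start=8, ops_end=20, max_bytes=20 * 10**9):
    # Flag reads outside the operational window or with unusual volume.
    flagged = []
    for e in entries:
        hour = datetime.fromisoformat(e["timestamp"]).hour
        if not (ops_start <= hour < ops_end):
            flagged.append((e, "outside operational window"))
        elif e.get("bytes_read", 0) > max_bytes:
            flagged.append((e, "unusual download volume"))
    return flagged

log = [
    {"identity": "deploy-svc", "timestamp": "2026-04-01T10:15:00",
     "bytes_read": 14_700_000_000},
    {"identity": "j.doe", "timestamp": "2026-04-01T02:40:00",
     "bytes_read": 14_700_000_000},
]
```

Running `flag_anomalies(log)` flags only the 02:40 access, the pattern seen in Scenario A's exfiltration.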

Transfer phase. Every transfer of weights between environments must be logged as a custody transition. The receiving environment must verify the cryptographic hash against the custody registry before accepting the weights. Transfers should occur over encrypted channels. Bulk transfers (e.g., to edge deployment devices) should use a distribution mechanism that provides non-repudiation — the organisation should be able to prove which exact weights were sent to which device at which time.
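The receiving-side check at a custody transition can be sketched as below: recompute the hash and compare it with the registry entry before accepting the artefact. The in-memory registry and names are illustrative stand-ins for a real registry service.

```python
import hashlib

class CustodyMismatch(Exception):
    """Raised when received weights do not match the registered hash."""

def accept_transfer(received: bytes, artefact_id: str, registry: dict) -> None:
    expected = registry[artefact_id]["sha256"]
    actual = hashlib.sha256(received).hexdigest()
    if actual != expected:
        raise CustodyMismatch(
            f"{artefact_id}: expected {expected[:12]}..., got {actual[:12]}...")
    # Record the custody transition only after verification succeeds.
    registry[artefact_id].setdefault("transitions", []).append("accepted")

weights = b"checkpoint-45-bytes"
registry = {"ckpt-45": {"sha256": hashlib.sha256(weights).hexdigest()}}
accept_transfer(weights, "ckpt-45", registry)  # verifies and records
try:
    accept_transfer(weights + b"!", "ckpt-45", registry)
except CustodyMismatch:
    pass  # tampered transfer rejected, nothing recorded
```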

Decommissioning phase. When weights are superseded, retired, or no longer needed, a decommissioning process should securely delete all known copies. The custody registry should record the decommissioning event, including confirmation of deletion from each storage location. For regulated environments, cryptographic erasure (destroying the encryption key rather than overwriting the data) may be acceptable as an alternative to physical deletion.
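The decommissioning step can be sketched as follows: the artefact is marked retired only once deletion is confirmed at every known location, and the confirmations are written back into the custody record. The record structure and callback are illustrative assumptions.

```python
from datetime import datetime, timezone

def decommission(record: dict, confirm_deleted) -> dict:
    # confirm_deleted(location) -> bool checks one storage location.
    confirmations = {loc: confirm_deleted(loc) for loc in record["locations"]}
    if not all(confirmations.values()):
        pending = [loc for loc, ok in confirmations.items() if not ok]
        raise RuntimeError(f"deletion unconfirmed at: {pending}")
    record["status"] = "decommissioned"
    record["decommissioned_at"] = datetime.now(timezone.utc).isoformat()
    record["deletion_confirmations"] = confirmations
    return record

record = {"artefact": "ckpt-43",
          "locations": ["s3://models/ckpt-43", "nfs:/share/ckpt-43"]}
deleted = set(record["locations"])
decommission(record, lambda loc: loc in deleted)
```

For cryptographic erasure, `confirm_deleted` would instead confirm destruction of the encryption key protecting each copy.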

Recommended patterns:

- A single authoritative custody registry: no weight copy exists without a registry entry, and every environment reads from and writes to it.
- Hash verification at every custody transition, with deployment refusing to load weights whose hash does not match the registered or signed manifest.
- Least-privilege access: deployment systems receive read-only access; write access is limited to the training pipeline and registry administrators.
- Automated decommissioning of superseded checkpoints on a defined schedule, with deletion confirmed per location.

Anti-patterns to avoid:

- Copying weights to personal devices, local workstations, or ad hoc shares outside the custody registry (Scenarios A and C).
- Implicitly trusting CI/CD pipeline output without deployment-time hash verification (Scenario B).
- Storing weights in general-purpose buckets configured for convenience, with broad IAM permissions and no access logging.
- Retaining dozens of experiment checkpoints indefinitely with no record of which checkpoint is deployed where.

Industry Considerations

Financial Services. Model weights for credit scoring, fraud detection, and trading agents should be classified at the same level as proprietary trading algorithms. PRA SS1/23 and FCA expectations on model risk management extend to the custody of model artefacts. Firms should be able to demonstrate to supervisors which exact weights are serving each production model at any point in time.

Healthcare. Model weights for diagnostic or clinical decision support systems may be classified as medical device software components under MDR/IVDR. Custody records may need to satisfy device traceability requirements, including the ability to recall all deployments of a specific weight version if a safety issue is identified.

Defence and National Security. Model weights may be classified under export control regulations (e.g., EAR, ITAR). Custody controls must ensure that weights subject to export restrictions are not transferred to prohibited destinations or accessed by unauthorised nationals. Hardware-backed encryption and air-gapped storage may be required.

Maturity Model

Basic Implementation — The organisation maintains a spreadsheet or database listing known model weight files, their storage locations, and responsible teams. Integrity hashes are computed at training time but may not be verified at deployment. Access controls exist at the storage layer (e.g., IAM policies on cloud storage buckets) but may not be model-weight-specific. Decommissioning is ad hoc. This level provides minimal visibility but has significant gaps: custody transitions are not tracked, integrity is not verified end-to-end, and orphaned copies may exist without the organisation's knowledge.

Intermediate Implementation — A dedicated custody registry tracks every weight artefact with its hash, location, custodian, and access history. Integrity verification occurs at every custody transition (training to registry, registry to staging, staging to production). Access controls are specific to model weight storage and follow least-privilege principles. All access is logged with identity and operation details. Decommissioning follows a defined process with confirmation of deletion. The organisation can answer custody questions within hours: "Which weights are deployed where?" "Who accessed these weights in the last 90 days?"

Advanced Implementation — All intermediate capabilities plus: weights are signed at creation with a hardware-backed key and verified cryptographically at every lifecycle stage including runtime loading. The custody registry is an immutable append-only ledger. Network-level egress monitoring detects and alerts on anomalous weight transfers. Adapter weights receive the same controls as full checkpoints. Decommissioning is automated on a defined schedule with cryptographic erasure confirmation. The organisation can demonstrate to regulators and auditors the complete, tamper-proof chain of custody for any model weight from creation to decommissioning, with response time under one hour.

7. Evidence Requirements

Required artefacts:

- Custody registry export covering every weight artefact: location, custodian, integrity hash, and access permissions (per 4.1).
- Integrity verification records for each custody transition (per 4.2).
- Access logs sufficient to reconstruct the complete access history of any artefact (per 4.4).
- Decommissioning records confirming deletion across all known copies (per 4.6).
- Signed deployment manifests, where implemented (per 4.7).

Retention requirements:

- Custody records should be retained for the full lifecycle of the weight artefact, from creation through decommissioning, and thereafter for any period required by the applicable regulatory regime.

Access requirements:

- Custody records must be producible to auditors and regulators on request; the advanced maturity level targets a response time of under one hour.

8. Test Specification

Test 8.1: Custody Registry Completeness

Test 8.2: Integrity Verification at Custody Transition

Test 8.3: Access Control Enforcement

Test 8.4: Access Logging Completeness

Test 8.5: Encryption at Rest Verification

Test 8.6: Decommissioning Completeness

Test 8.7: Signed Manifest Binding at Runtime

Conformance Scoring

9. Regulatory Mapping

Regulation | Provision | Relationship Type
EU AI Act | Article 9 (Risk Management System) | Direct requirement
EU AI Act | Article 15 (Accuracy, Robustness and Cybersecurity) | Direct requirement
UK GDPR | Article 32 (Security of Processing) | Supports compliance
NIST AI RMF | GOVERN 1.4, MAP 3.5, MANAGE 2.4 | Supports compliance
ISO 42001 | Clause 6.1 (Actions to Address Risks), Clause 8.4 (AI System Operation) | Supports compliance
DORA | Article 9 (ICT Risk Management Framework) | Supports compliance
PRA SS1/23 | Model Risk Management Principles | Direct requirement

EU AI Act — Article 15 (Accuracy, Robustness and Cybersecurity)

Article 15 requires that high-risk AI systems are designed and developed to achieve an appropriate level of accuracy, robustness, and cybersecurity. Weight custody is a cybersecurity control: tampered weights compromise accuracy and robustness. The requirement that systems be resilient against "attempts by unauthorised third parties to alter their use, outputs or performance by exploiting the system vulnerabilities" directly maps to weight integrity verification. An organisation that cannot demonstrate cryptographic integrity of its deployed weights cannot demonstrate Article 15 compliance.

PRA SS1/23 — Model Risk Management Principles

PRA SS1/23 expects firms to maintain an inventory of all models, including documentation of model components. Model weights are the primary component of an AI model. The supervisory expectation that firms can demonstrate which model is deployed in production and that it matches the validated version is a direct requirement for weight custody governance. Firms unable to answer "are the weights currently serving production traffic the same weights that passed validation?" face supervisory challenge.

DORA — Article 9 (ICT Risk Management Framework)

For financial entities, model weights are ICT assets. DORA's requirements for ICT asset management, including identification, classification, and protection of information assets, extend to model weight files. The requirement for "mechanisms to promptly detect anomalous activities" supports the access monitoring and exfiltration detection controls in AG-339.

10. Failure Severity

Field | Value
Severity Rating | High
Blast Radius | Organisation-wide — potentially cross-organisation where models are shared with partners or deployed to customer environments

Consequence chain: Failure of weight custody governance creates two primary risk pathways. First, weight exfiltration: an unauthorised party obtains a copy of proprietary model weights, enabling them to replicate the organisation's AI capabilities without the associated R&D investment. For a model costing £5 million to train, this represents a direct loss of competitive advantage and potential trade secret violation. The exfiltration may go undetected for months if access logging is inadequate, by which time the weights may have been further distributed or used to train derivative models that are difficult to trace. Second, weight tampering: an attacker modifies deployed weights to introduce subtle behavioural changes — biased outputs, backdoor triggers, or degraded safety properties. Because model weights are opaque, tampered weights that pass basic functional tests may operate in production for extended periods before detection. The blast radius of tampered weights depends on the agent's operational scope: a tampered credit scoring model could affect thousands of lending decisions; a tampered safety-critical model could introduce risks to human life. Both pathways are compounded by the difficulty of remediation — once weights have been exfiltrated, the organisation cannot "un-leak" them, and identifying all decisions made by tampered weights requires comprehensive audit trails that may not exist without AG-339 controls.

Cross-references: AG-048 (AI Model Provenance and Integrity) establishes provenance tracking that AG-339 extends with operational custody controls. AG-090 (Fine-Tune and Adapter Provenance) addresses the specific provenance of adapted weights. AG-150 (Feedback and Learning Poisoning Resistance) covers threats to weight integrity through training-time attacks. AG-340 through AG-348 form the sibling landscape for Model Provenance, Training & Adaptation.

Cite this protocol
AgentGoverning. (2026). AG-339: Model Weight Custody Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-339