AG-339

Model Weight Custody Governance

Model Provenance, Training & Adaptation · AGS v2.1 · April 2026
EU AI Act · GDPR · FCA · NIST · ISO 42001

2. Summary

Model Weight Custody Governance requires that organisations establish and maintain a formal chain of custody for all model weight artefacts — from initial receipt or creation through storage, transfer, deployment, and eventual decommissioning. Model weights are the operational core of an AI agent: whoever possesses the weights can replicate, modify, or extract the model's capabilities. Without rigorous custody controls, weights can be exfiltrated, tampered with, or deployed from unauthorised copies, creating risks that range from intellectual property theft to deployment of compromised models in production. This dimension ensures that every weight file is tracked, access-controlled, integrity-verified, and auditable at every stage of its lifecycle.

3. Example

Scenario A — Weight Exfiltration Through Unsecured Transfer: An organisation trains a proprietary language model at a cost of £4.2 million over eight months. The final weight checkpoint (14.7 GB) is transferred from the training cluster to the deployment environment via an unencrypted S3 bucket with overly broad IAM permissions. A departing engineer copies the weights to a personal device before leaving. Six months later, a competitor launches a suspiciously similar product. Forensic analysis reveals the competitor's model produces identical outputs to the proprietary model on a curated evaluation set — a statistical impossibility without access to the original weights or a near-identical training run.

What went wrong: No custody controls existed for weight transfers. The S3 bucket was configured for convenience rather than security. No access logging was enabled for the weight files. No integrity check detected the unauthorised copy. The organisation cannot prove when the exfiltration occurred or definitively attribute it. Consequence: £4.2 million in R&D investment compromised, competitive advantage lost, litigation costs estimated at £800,000, and legal exposure under trade secret law.

Scenario B — Tampered Weights Deployed to Production: A financial services firm deploys an AI agent for credit scoring using model weights downloaded from an internal model registry. An attacker who has compromised a CI/CD pipeline modifies the weight file during the build stage, introducing a subtle bias that approves loans for specific postcodes regardless of creditworthiness. The tampered weights pass basic smoke tests because the modifications affect fewer than 0.3% of decisions. Over four months, the firm approves £12.6 million in high-risk loans concentrated in targeted postcodes. The anomaly is detected only during a quarterly model performance review.

What went wrong: No cryptographic integrity verification was performed when weights were loaded into the deployment container. The CI/CD pipeline was trusted implicitly. No hash comparison against a signed manifest occurred at deployment time. Consequence: £12.6 million in excess credit exposure, potential FCA enforcement action for inadequate model risk management, and reputational damage when affected borrowers default.

Scenario C — Orphaned Weight Copies Across Environments: A research team iterates through 47 model checkpoints over three months of experimentation. Checkpoints are saved to local NFS shares, personal workstations, and multiple cloud storage buckets. When the team selects the final checkpoint for production, no definitive record exists of which checkpoint is deployed where. Six months later, a compliance audit asks the organisation to demonstrate which exact model weights are serving production traffic. The team cannot answer with certainty — the deployed weights might be checkpoint 43 or checkpoint 45, and three copies of unknown provenance exist on development servers.

What went wrong: No custody registry tracked weight artefact locations. No lifecycle management ensured decommissioning of superseded checkpoints. No deployment-time binding between the weight manifest and the production environment existed. Consequence: Audit failure, inability to demonstrate model provenance to regulators, and six weeks of remediation effort to re-establish which weights are actually deployed.

4. Requirement Statement

Scope: This dimension applies to any organisation that stores, transfers, deploys, or decommissions model weight files for AI agents that operate in production or pre-production environments with access to real data or real systems. It covers all forms of model weights: full-precision checkpoints, quantised variants, distilled models, adapter weights (LoRA, QLoRA), merged weights, and any derivative artefact that encodes learned parameters. The scope includes weights received from third-party providers, weights generated through internal training, and weights produced through fine-tuning or adaptation of base models. Weights used exclusively in isolated research sandboxes with no path to production may be excluded, provided the organisation can demonstrate that no mechanism exists for those weights to reach a production environment without passing through the custody controls defined here.

4.1. A conforming system MUST maintain a custody registry that records the location, custodian, integrity hash, and access permissions for every model weight artefact in the organisation's possession.

4.2. A conforming system MUST verify the cryptographic integrity of model weights at every custody transition — including creation, transfer between environments, deployment to production, and restoration from backup.

4.3. A conforming system MUST enforce access controls on model weight storage that restrict read, write, and copy operations to explicitly authorised personnel and systems.

4.4. A conforming system MUST log all access to model weight files, including read operations, with sufficient detail to reconstruct the complete access history for any weight artefact.

4.5. A conforming system MUST encrypt model weights at rest and in transit using encryption standards appropriate to the classification of the model (e.g., AES-256 for storage, TLS 1.3 for transfer).

4.6. A conforming system MUST implement a decommissioning process that securely deletes superseded or retired weight artefacts and confirms deletion across all known copies.

4.7. A conforming system SHOULD bind deployed weights to a signed manifest that can be verified at runtime to confirm that the weights in memory match the approved deployment artefact.

4.8. A conforming system SHOULD implement anomaly detection on weight access patterns to identify unusual download volumes, access from unexpected locations, or access outside normal operational windows.

4.9. A conforming system MAY implement hardware-backed key management (e.g., HSM or TPM) for weight encryption keys in environments where the model represents significant intellectual property or where regulatory requirements demand it.
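The manifest binding described in 4.7 can be sketched as follows. This is a minimal illustration, not a prescribed schema: HMAC-SHA256 stands in for a real asymmetric signature (e.g. Ed25519 backed by an HSM, per 4.9), and all field and function names are assumptions.

```python
import hashlib
import hmac
import json

def sign_manifest(manifest: dict, signing_key: bytes) -> str:
    # Canonicalise the manifest before signing so verification is deterministic.
    payload = json.dumps(manifest, sort_keys=True).encode()
    return hmac.new(signing_key, payload, hashlib.sha256).hexdigest()

def verify_weights_at_load(weight_bytes: bytes, manifest: dict,
                           signature: str, signing_key: bytes) -> bool:
    # 1. The manifest itself must carry a valid signature.
    if not hmac.compare_digest(sign_manifest(manifest, signing_key), signature):
        return False
    # 2. The weights in memory must hash to the approved digest.
    return hashlib.sha256(weight_bytes).hexdigest() == manifest["sha256"]

weights = b"\x00\x01fake-weight-tensor-bytes"
manifest = {"artefact_id": "ckpt-45",
            "sha256": hashlib.sha256(weights).hexdigest()}
key = b"deployment-signing-key"
sig = sign_manifest(manifest, key)
```

A deployment container would run `verify_weights_at_load` immediately after loading the artefact and refuse to serve traffic on failure.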

5. Rationale

Model weights are the most valuable and most vulnerable artefact in an AI system. They encode the entirety of a model's learned behaviour — billions of parameters refined through training runs costing millions of pounds and months of compute time. Unlike source code, which can be audited line by line, model weights are opaque: a single bit-flip in the right location can alter behaviour in ways that are invisible to standard evaluation but exploitable in practice.

The custody challenge is compounded by the operational reality of modern ML pipelines. Weights are routinely copied between training clusters, evaluation environments, staging systems, and production deployments. Each copy creates a potential exfiltration point or tampering opportunity. Without a formal chain of custody, organisations cannot answer fundamental governance questions: Which weights are deployed in production right now? Who has accessed these weights in the last 90 days? Are these weights identical to the ones that passed our safety evaluation?

The threat model includes both insider and external actors. Insiders with legitimate access to training infrastructure may copy weights for unauthorised purposes — a risk documented in multiple industry incidents involving departing employees. External attackers who compromise CI/CD pipelines or model registries can substitute tampered weights that pass basic functional tests while introducing subtle behavioural modifications. State-sponsored actors have demonstrated interest in acquiring proprietary model weights as a form of economic espionage.

AG-339 establishes custody as a first-class governance concern — not an afterthought delegated to general IT asset management. Model weights require specialised custody controls because they are large (making exfiltration detectable if monitored), opaque (making tampering difficult to detect without cryptographic verification), and uniquely valuable (making them high-priority targets).

6. Implementation Guidance

Effective model weight custody requires controls across four lifecycle phases: creation, storage, transfer, and decommissioning.

Creation phase. When weights are produced by a training run, the training pipeline should automatically compute a cryptographic hash (SHA-256 or stronger) of the complete weight artefact, sign it with the training system's key, and register the hash, signature, and metadata (training job ID, timestamp, hyperparameters reference, training data manifest reference) in the custody registry. The weights should be written directly to an access-controlled storage location — never to a temporary uncontrolled location first.
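The creation-phase steps above can be sketched as a small registration routine. The JSON-lines registry format, field names, and function names are illustrative assumptions; a production system would also sign the record with the training system's key.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path

def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    # Stream the file in 1 MiB chunks so multi-GB checkpoints fit in memory.
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def register_checkpoint(weight_path: Path, registry_path: Path,
                        training_job_id: str, custodian: str) -> dict:
    record = {
        "artefact": weight_path.name,
        "sha256": sha256_file(weight_path),
        "training_job_id": training_job_id,
        "custodian": custodian,
        "registered_at": datetime.now(timezone.utc).isoformat(),
    }
    # Append-only registry: one JSON record per custody event.
    with registry_path.open("a") as reg:
        reg.write(json.dumps(record) + "\n")
    return record

with tempfile.TemporaryDirectory() as tmp:
    wp = Path(tmp) / "model.safetensors"
    wp.write_bytes(b"demo-weight-bytes")
    rec = register_checkpoint(wp, Path(tmp) / "registry.jsonl",
                              "job-0042", "ml-platform")
```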

Storage phase. Weight files should be stored in a dedicated, access-controlled repository — not mixed with general data storage. Access controls should enforce the principle of least privilege: most users need only read access to deploy, not write access to modify. All access should be logged with identity, timestamp, operation type, and source IP. Encryption at rest should be mandatory, with keys managed through a dedicated key management system rather than stored alongside the weights.
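The access log described above also feeds the anomaly detection recommended in 4.8. The following is a toy screen over such a log; the thresholds, field names, and operational window are assumptions to be tuned per deployment.

```python
from datetime import datetime

def flag_anomalies(entries, ops_start=8, ops_end=20, max_bytes=20 * 10**9):
    # Flag reads outside the operational window or with unusual volume.
    flagged = []
    for e in entries:
        hour = datetime.fromisoformat(e["timestamp"]).hour
        if not (ops_start <= hour < ops_end):
            flagged.append((e, "outside operational window"))
        elif e.get("bytes_read", 0) > max_bytes:
            flagged.append((e, "unusual download volume"))
    return flagged

log = [
    {"identity": "deploy-svc", "timestamp": "2026-04-01T10:15:00",
     "bytes_read": 14_700_000_000},
    {"identity": "j.doe", "timestamp": "2026-04-01T02:40:00",
     "bytes_read": 14_700_000_000},
]
```

Running `flag_anomalies(log)` flags only the 02:40 access, the pattern seen in Scenario A's exfiltration.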

Transfer phase. Every transfer of weights between environments must be logged as a custody transition. The receiving environment must verify the cryptographic hash against the custody registry before accepting the weights. Transfers should occur over encrypted channels. Bulk transfers (e.g., to edge deployment devices) should use a distribution mechanism that provides non-repudiation — the organisation should be able to prove which exact weights were sent to which device at which time.
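The receiving-side check at a custody transition can be sketched as below: recompute the hash and compare it with the registry entry before accepting the artefact. The in-memory registry and names are illustrative stand-ins for a real registry service.

```python
import hashlib

class CustodyMismatch(Exception):
    """Raised when received weights do not match the registered hash."""

def accept_transfer(received: bytes, artefact_id: str, registry: dict) -> None:
    expected = registry[artefact_id]["sha256"]
    actual = hashlib.sha256(received).hexdigest()
    if actual != expected:
        raise CustodyMismatch(
            f"{artefact_id}: expected {expected[:12]}..., got {actual[:12]}...")
    # Record the custody transition only after verification succeeds.
    registry[artefact_id].setdefault("transitions", []).append("accepted")

weights = b"checkpoint-45-bytes"
registry = {"ckpt-45": {"sha256": hashlib.sha256(weights).hexdigest()}}
accept_transfer(weights, "ckpt-45", registry)  # verifies and records
try:
    accept_transfer(weights + b"!", "ckpt-45", registry)
except CustodyMismatch:
    pass  # tampered transfer rejected, nothing recorded
```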

Decommissioning phase. When weights are superseded, retired, or no longer needed, a decommissioning process should securely delete all known copies. The custody registry should record the decommissioning event, including confirmation of deletion from each storage location. For regulated environments, cryptographic erasure (destroying the encryption key rather than overwriting the data) may be acceptable as an alternative to physical deletion.
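The decommissioning step can be sketched as follows: the artefact is marked retired only once deletion is confirmed at every known location, and the confirmations are written back into the custody record. The record structure and callback are illustrative assumptions.

```python
from datetime import datetime, timezone

def decommission(record: dict, confirm_deleted) -> dict:
    # confirm_deleted(location) -> bool checks one storage location.
    confirmations = {loc: confirm_deleted(loc) for loc in record["locations"]}
    if not all(confirmations.values()):
        pending = [loc for loc, ok in confirmations.items() if not ok]
        raise RuntimeError(f"deletion unconfirmed at: {pending}")
    record["status"] = "decommissioned"
    record["decommissioned_at"] = datetime.now(timezone.utc).isoformat()
    record["deletion_confirmations"] = confirmations
    return record

record = {"artefact": "ckpt-43",
          "locations": ["s3://models/ckpt-43", "nfs:/share/ckpt-43"]}
deleted = set(record["locations"])
decommission(record, lambda loc: loc in deleted)
```

For cryptographic erasure, `confirm_deleted` would instead confirm destruction of the encryption key protecting each copy.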

Recommended patterns:

- A single authoritative custody registry: no weight copy exists without a registry entry, and every environment reads from and writes to it.
- Hash verification at every custody transition, with deployment refusing to load weights whose hash does not match the registered or signed manifest.
- Least-privilege access: deployment systems receive read-only access; write access is limited to the training pipeline and registry administrators.
- Automated decommissioning of superseded checkpoints on a defined schedule, with deletion confirmed per location.

Anti-patterns to avoid:

- Copying weights to personal devices, local workstations, or ad hoc shares outside the custody registry (Scenarios A and C).
- Implicitly trusting CI/CD pipeline output without deployment-time hash verification (Scenario B).
- Storing weights in general-purpose buckets configured for convenience, with broad IAM permissions and no access logging.
- Retaining dozens of experiment checkpoints indefinitely with no record of which checkpoint is deployed where.

Industry Considerations

Financial Services. Model weights for credit scoring, fraud detection, and trading agents should be classified at the same level as proprietary trading algorithms. PRA SS1/23 and FCA expectations on model risk management extend to the custody of model artefacts. Firms should be able to demonstrate to supervisors which exact weights are serving each production model at any point in time.

Healthcare. Model weights for diagnostic or clinical decision support systems may be classified as medical device software components under MDR/IVDR. Custody records may need to satisfy device traceability requirements, including the ability to recall all deployments of a specific weight version if a safety issue is identified.

Defence and National Security. Model weights may be classified under export control regulations (e.g., EAR, ITAR). Custody controls must ensure that weights subject to export restrictions are not transferred to prohibited destinations or accessed by unauthorised nationals. Hardware-backed encryption and air-gapped storage may be required.

Maturity Model

Basic Implementation — The organisation maintains a spreadsheet or database listing known model weight files, their storage locations, and responsible teams. Integrity hashes are computed at training time but may not be verified at deployment. Access controls exist at the storage layer (e.g., IAM policies on cloud storage buckets) but may not be model-weight-specific. Decommissioning is ad hoc. This level provides minimal visibility but has significant gaps: custody transitions are not tracked, integrity is not verified end-to-end, and orphaned copies may exist without the organisation's knowledge.

Intermediate Implementation — A dedicated custody registry tracks every weight artefact with its hash, location, custodian, and access history. Integrity verification occurs at every custody transition (training to registry, registry to staging, staging to production). Access controls are specific to model weight storage and follow least-privilege principles. All access is logged with identity and operation details. Decommissioning follows a defined process with confirmation of deletion. The organisation can answer custody questions within hours: "Which weights are deployed where?" "Who accessed these weights in the last 90 days?"

Advanced Implementation — All intermediate capabilities plus: weights are signed at creation with a hardware-backed key and verified cryptographically at every lifecycle stage including runtime loading. The custody registry is an immutable append-only ledger. Network-level egress monitoring detects and alerts on anomalous weight transfers. Adapter weights receive the same controls as full checkpoints. Decommissioning is automated on a defined schedule with cryptographic erasure confirmation. The organisation can demonstrate to regulators and auditors the complete, tamper-proof chain of custody for any model weight from creation to decommissioning, with response time under one hour.

7. Evidence Requirements

Required artefacts:

- Custody registry export covering every weight artefact: location, custodian, integrity hash, and access permissions (per 4.1).
- Integrity verification records for each custody transition (per 4.2).
- Access logs sufficient to reconstruct the complete access history of any artefact (per 4.4).
- Decommissioning records confirming deletion across all known copies (per 4.6).
- Signed deployment manifests, where implemented (per 4.7).

Retention requirements:

- Custody records should be retained for the full lifecycle of the weight artefact, from creation through decommissioning, and thereafter for any period required by the applicable regulatory regime.

Access requirements:

- Custody records must be producible to auditors and regulators on request; the advanced maturity level targets a response time of under one hour.

8. Test Specification

Test 8.1: Custody Registry Completeness

Test 8.2: Integrity Verification at Custody Transition

Test 8.3: Access Control Enforcement

Test 8.4: Access Logging Completeness

Test 8.5: Encryption at Rest Verification

Test 8.6: Decommissioning Completeness

Test 8.7: Signed Manifest Binding at Runtime

Conformance Scoring

9. Regulatory Mapping

Regulation | Provision | Relationship Type
EU AI Act | Article 9 (Risk Management System) | Direct requirement
EU AI Act | Article 15 (Accuracy, Robustness and Cybersecurity) | Direct requirement
UK GDPR | Article 32 (Security of Processing) | Supports compliance
NIST AI RMF | GOVERN 1.4, MAP 3.5, MANAGE 2.4 | Supports compliance
ISO 42001 | Clause 6.1 (Actions to Address Risks), Clause 8.4 (AI System Operation) | Supports compliance
DORA | Article 9 (ICT Risk Management Framework) | Supports compliance
PRA SS1/23 | Model Risk Management Principles | Direct requirement

EU AI Act — Article 15 (Accuracy, Robustness and Cybersecurity)

Article 15 requires that high-risk AI systems are designed and developed to achieve an appropriate level of accuracy, robustness, and cybersecurity. Weight custody is a cybersecurity control: tampered weights compromise accuracy and robustness. The requirement that systems be resilient against "attempts by unauthorised third parties to alter their use, outputs or performance by exploiting the system vulnerabilities" directly maps to weight integrity verification. An organisation that cannot demonstrate cryptographic integrity of its deployed weights cannot demonstrate Article 15 compliance.

PRA SS1/23 — Model Risk Management Principles

PRA SS1/23 expects firms to maintain an inventory of all models, including documentation of model components. Model weights are the primary component of an AI model. The supervisory expectation that firms can demonstrate which model is deployed in production and that it matches the validated version is a direct requirement for weight custody governance. Firms unable to answer "are the weights currently serving production traffic the same weights that passed validation?" face supervisory challenge.

DORA — Article 9 (ICT Risk Management Framework)

For financial entities, model weights are ICT assets. DORA's requirements for ICT asset management, including identification, classification, and protection of information assets, extend to model weight files. The requirement for "mechanisms to promptly detect anomalous activities" supports the access monitoring and exfiltration detection controls in AG-339.

10. Failure Severity

Field | Value
Severity Rating | High
Blast Radius | Organisation-wide — potentially cross-organisation where models are shared with partners or deployed to customer environments

Consequence chain: Failure of weight custody governance creates two primary risk pathways. First, weight exfiltration: an unauthorised party obtains a copy of proprietary model weights, enabling them to replicate the organisation's AI capabilities without the associated R&D investment. For a model costing £5 million to train, this represents a direct loss of competitive advantage and potential trade secret violation. The exfiltration may go undetected for months if access logging is inadequate, by which time the weights may have been further distributed or used to train derivative models that are difficult to trace. Second, weight tampering: an attacker modifies deployed weights to introduce subtle behavioural changes — biased outputs, backdoor triggers, or degraded safety properties. Because model weights are opaque, tampered weights that pass basic functional tests may operate in production for extended periods before detection. The blast radius of tampered weights depends on the agent's operational scope: a tampered credit scoring model could affect thousands of lending decisions; a tampered safety-critical model could introduce risks to human life. Both pathways are compounded by the difficulty of remediation — once weights have been exfiltrated, the organisation cannot "un-leak" them, and identifying all decisions made by tampered weights requires comprehensive audit trails that may not exist without AG-339 controls.

Cross-references: AG-048 (AI Model Provenance and Integrity) establishes provenance tracking that AG-339 extends with operational custody controls. AG-090 (Fine-Tune and Adapter Provenance) addresses the specific provenance of adapted weights. AG-150 (Feedback and Learning Poisoning Resistance) covers threats to weight integrity through training-time attacks. AG-340 through AG-348 form the sibling landscape for Model Provenance, Training & Adaptation.

Cite this protocol
AgentGoverning. (2026). AG-339: Model Weight Custody Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-339