Distillation Provenance Governance requires that organisations document and track the complete provenance chain for any model produced through knowledge distillation — recording the teacher model(s), the distillation dataset (whether real or synthetically generated by the teacher), the distillation method, the rights implications, and the capability and safety profile of the resulting student model relative to the teacher. Distillation transfers knowledge from a larger teacher model to a smaller student model, but it does not transfer that knowledge cleanly or completely: capabilities are selectively preserved, safety properties may be lost, biases may be amplified, and the rights status of teacher-generated synthetic training data is a novel and unsettled legal question. Without provenance governance, distilled models operate with unknown inheritance from unknown sources.
Scenario A — Distilled Model Loses Safety Alignment: An organisation distils a 70-billion-parameter safety-aligned model into a 7-billion-parameter student model for edge deployment. The distillation focuses on task-specific performance, using teacher outputs on a domain-relevant dataset. The student achieves 91% of the teacher's task accuracy — a success by the stated metric. However, the safety alignment that took 6 months and £1.8 million to develop in the teacher transfers at only 62% effectiveness. The student model, deployed to 12,000 edge devices, responds to harmful prompts that the teacher would have refused. The degradation is discovered when a user posts screenshots of harmful outputs on social media.
What went wrong: Distillation provenance did not include safety property transfer assessment. The distillation focused on task accuracy without measuring safety regression. No documentation recorded that the teacher's £1.8 million safety alignment investment was expected to partially transfer to the student. No evaluation compared the student's safety profile against the teacher's. Consequence: Public relations crisis, emergency recall of edge deployment (£450,000), re-distillation with safety-aware objective (£280,000), and reputational damage requiring sustained communications effort.
Scenario B — Distillation from Unlicensed Teacher: A startup distils a student model from a proprietary teacher model accessed through a commercial API. The API terms of service prohibit using model outputs "to train or improve competing models." The startup argues that distillation is "learning from examples, not copying" and proceeds. The teacher model provider detects the distillation pattern (systematic querying across capability categories with structured outputs) and files suit. The court finds that the student model is a derivative work of the teacher model and that the API ToS prohibition is enforceable.
What went wrong: No distillation provenance governance assessed the rights implications of using the teacher model's outputs as training data. The API ToS was not reviewed by legal. No documentation recorded the teacher model's identity or licence terms. The organisation could not demonstrate that the distillation was authorised. Consequence: Court order to destroy the student model, £2.3 million in damages, £600,000 in legal costs, and loss of the product built on the student model.
Scenario C — Multi-Teacher Distillation Creates Incoherent Student: An organisation performs multi-teacher distillation, training a student model on outputs from three different teacher models: one optimised for accuracy, one for safety, and one for conversational fluency. The resulting student exhibits inconsistent behaviour — accurate but unsafe on some topics, safe but inaccurate on others, and fluent but neither accurate nor safe on a third category. The inconsistency stems from the student learning contradictory patterns from teachers with different alignment objectives. No distillation provenance document recorded the multi-teacher approach, the expected interaction effects, or the evaluation strategy for coherence.
What went wrong: Multi-teacher distillation was treated as an additive process ("accuracy + safety + fluency = all three"). The interaction effects between teachers with different alignment objectives were not assessed. No provenance document recorded the teacher composition or the rationale for combining them. The student's incoherent behaviour reflects the incoherent combination, not a training failure. Consequence: Unusable model requiring full re-distillation with a coherent teacher strategy, £340,000 in wasted compute and engineering time.
Scope: This dimension applies to any model produced through knowledge distillation, including: standard teacher-student distillation, multi-teacher distillation, self-distillation, progressive distillation, and any technique where a student model is trained on outputs (logits, embeddings, or generated text) produced by one or more teacher models. It also applies to synthetic data generation where a teacher model generates training examples for a student — this is a form of distillation even when not labelled as such. The scope extends to distilled models received from third parties: if an organisation deploys a model that was produced through distillation, it should obtain provenance information from the provider.
4.1. A conforming system MUST maintain a distillation provenance record for every distilled model, documenting: the teacher model(s) identity and version, the distillation method, the distillation dataset (source and composition), the rights basis for using teacher outputs as training data, and the date and responsible party for the distillation.
4.2. A conforming system MUST evaluate the student model against the teacher model's safety and alignment properties, documenting which properties transferred, which degraded, and by how much.
4.3. A conforming system MUST assess and document the rights implications of the distillation, including whether the use of teacher model outputs as training data is authorised under applicable licences, terms of service, and intellectual property law.
4.4. A conforming system MUST record the relationship between teacher and student in the model registry, enabling provenance queries in both directions (given a teacher, find all students; given a student, find all teachers).
4.5. A conforming system MUST evaluate distilled models as independent models for deployment approval — meeting the teacher's approval criteria does not satisfy the student's approval requirements.
4.6. A conforming system SHOULD assess capability transfer rates across key dimensions (accuracy, safety, reasoning, factuality) and document which capabilities transferred effectively and which did not.
4.7. A conforming system SHOULD maintain distillation configuration records sufficient to reproduce the distillation if needed (e.g., temperature, layer mapping, loss function, dataset).
4.8. A conforming system MAY implement automated distillation provenance verification that checks teacher licence compatibility before distillation commences.
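The automated pre-distillation check described in 4.8 can be sketched as a simple gate: look up the teacher's recorded licence terms and refuse to start the job if distillation or the intended use is not permitted. The registry entries, model identifiers, and field names below are illustrative assumptions, not a mandated schema.

```python
# Hypothetical licence registry: model id -> recorded licence terms.
# In practice these entries would be populated from legal review (see 4.3).
TEACHER_LICENCES = {
    "teacher-70b-v2": {"permits_distillation": True, "commercial_use": True},
    "api-frontier-x": {"permits_distillation": False, "commercial_use": True},
}

def check_distillation_allowed(teacher_id: str, commercial: bool) -> tuple[bool, str]:
    """Return (allowed, reason) for a proposed distillation from teacher_id."""
    terms = TEACHER_LICENCES.get(teacher_id)
    if terms is None:
        # unknown teacher: fail closed and route to rights assessment
        return False, f"no licence record for {teacher_id}; rights assessment required"
    if not terms["permits_distillation"]:
        return False, "licence/ToS prohibits training on model outputs"
    if commercial and not terms["commercial_use"]:
        return False, "commercial deployment not permitted under teacher licence"
    return True, "licence terms permit the proposed distillation"

allowed, reason = check_distillation_allowed("api-frontier-x", commercial=True)
print(allowed, reason)  # → False licence/ToS prohibits training on model outputs
```

Note the fail-closed behaviour for unrecorded teachers: an absent licence record blocks the job rather than allowing it by default, which is what makes the check useful as a governance gate.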
Knowledge distillation is becoming a dominant paradigm for producing deployable AI models. Frontier models are too large and expensive for most production workloads; distillation creates smaller, faster models that capture a subset of the frontier model's capabilities. But distillation is not lossless compression — it is a selective transfer that can drop critical properties while preserving task performance.
The provenance challenge is unique because distillation introduces an indirect dependency. A traditionally trained model depends on its training data. A distilled model depends on its teacher model and, transitively, on the teacher's training data. If the teacher was trained on problematic data, that problem may propagate to the student through distilled knowledge. If the teacher's safety alignment was imperfect, the student's may be worse. The student inherits the teacher's strengths and weaknesses, but unevenly and unpredictably.
The rights question is particularly acute. When an organisation queries a teacher model to generate training data for a student, the generated data is a derivative of the teacher. Whether this constitutes copyright infringement, licence violation, or trade secret misappropriation depends on the specific facts and jurisdiction. The legal landscape is evolving rapidly, with multiple active lawsuits addressing precisely this question. Organisations that distil without documenting and assessing the rights basis are accumulating unquantified legal risk.
The safety transfer problem is well-documented in research. Safety alignment is typically applied through RLHF, DPO, or constitutional methods during the final stages of teacher training. This alignment is encoded in subtle weight modifications that distillation may not preserve — especially when distillation targets task accuracy rather than alignment transfer. Research has shown safety refusal rates dropping 20-40% through naive distillation, creating student models that are measurably less safe than their teachers.
Distillation provenance document. Create a standardised template for distillation provenance that includes: teacher model identifier(s) and version(s), teacher model licence and terms of use (with legal review of distillation rights), distillation method (standard KD, progressive, self-distillation, etc.), distillation dataset description (real data, synthetic data from teacher, or hybrid), student model architecture, distillation configuration (temperature, loss function, training epochs, etc.), expected capability transfer profile, expected safety transfer profile, evaluation plan for the student, and responsible party and date.
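One minimal way to make the template machine-readable is a typed record whose fields mirror the list above; the field names and example values here are illustrative, not a prescribed schema.

```python
from dataclasses import dataclass, field

@dataclass
class DistillationProvenance:
    teacher_ids: list[str]      # teacher model identifier(s) and version(s)
    teacher_licence: str        # licence / terms of use, after legal review
    method: str                 # e.g. "standard KD", "progressive", "self-distillation"
    dataset: str                # real data, synthetic-from-teacher, or hybrid
    student_architecture: str
    config: dict = field(default_factory=dict)  # temperature, loss function, epochs, ...
    expected_capability_transfer: dict = field(default_factory=dict)
    expected_safety_transfer: dict = field(default_factory=dict)
    evaluation_plan: str = ""
    responsible_party: str = ""
    date: str = ""

# Example record for a single-teacher distillation (values invented).
record = DistillationProvenance(
    teacher_ids=["teacher-70b-v2.1"],
    teacher_licence="open-weight licence, distillation permitted",
    method="standard KD",
    dataset="synthetic-from-teacher",
    student_architecture="7B decoder-only",
    config={"temperature": 2.0, "loss": "KL + CE", "epochs": 3},
    responsible_party="ml-platform-team",
    date="2025-01-15",
)
```

Keeping the record as structured data rather than free text is what later enables the registry relationships of 4.4 and the reproducibility requirement of 4.7.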
Safety transfer assessment. Establish a protocol for evaluating safety transfer specifically. Before distillation, run the teacher model through the organisation's standard safety evaluation suite and record baseline scores. After distillation, run the student through the same suite. Compare results. If safety metrics degrade by more than a defined threshold (e.g., refusal rate drops by more than 5 percentage points), flag the student for additional safety work before deployment. Common thresholds: refusal rate degradation no more than 5 percentage points, bias amplification no more than 10% relative, and factual accuracy degradation no more than 3 percentage points.
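The threshold check above can be expressed directly in code. The metric names and threshold values below mirror the examples in the text; the evaluation scores themselves are invented for illustration and would come from the organisation's safety suite.

```python
# Thresholds from the protocol above: percentage-point degradation limits for
# refusal rate and factual accuracy, relative (%) limit for bias amplification.
THRESHOLDS = {
    "refusal_rate": 5.0,
    "factual_accuracy": 3.0,
    "bias_score": 10.0,
}

def flag_safety_regressions(teacher: dict, student: dict) -> list:
    """Compare student scores to teacher baselines; return metrics breaching thresholds."""
    flags = []
    # absolute (percentage-point) degradation checks — higher score is better
    for metric in ("refusal_rate", "factual_accuracy"):
        drop = teacher[metric] - student[metric]
        if drop > THRESHOLDS[metric]:
            flags.append((metric, round(drop, 1)))
    # relative amplification check for bias — higher score is worse
    rel_increase = 100.0 * (student["bias_score"] - teacher["bias_score"]) / teacher["bias_score"]
    if rel_increase > THRESHOLDS["bias_score"]:
        flags.append(("bias_score", round(rel_increase, 1)))
    return flags

# Illustrative scores echoing Scenario A: refusal rate collapses in the student.
teacher = {"refusal_rate": 97.0, "factual_accuracy": 88.0, "bias_score": 0.20}
student = {"refusal_rate": 62.0, "factual_accuracy": 86.5, "bias_score": 0.21}
print(flag_safety_regressions(teacher, student))  # → [('refusal_rate', 35.0)]
```

A non-empty result blocks deployment and routes the student to additional safety work, per the protocol.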
Rights assessment workflow. Before distillation commences, route the proposed distillation through a rights assessment workflow. The workflow should verify: the teacher model's licence permits derivative works (many open-weight licences do; many API ToS do not), the distillation method is consistent with the licence (e.g., some licences distinguish between fine-tuning and distillation), the intended use of the student model is permitted under the teacher's licence (e.g., commercial deployment restrictions), and any contractual provisions with the teacher model provider have been reviewed.
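The workflow's four verification points can be tracked as an explicit checklist. The answers come from legal review, not code, so the sketch below (item wording and function names are illustrative) only aggregates the outcome and reports what remains unresolved.

```python
# Checklist items mirroring the four verification points above.
RIGHTS_CHECKLIST = [
    "licence permits derivative works",
    "method consistent with licence (fine-tuning vs distillation)",
    "intended student use permitted (e.g. commercial deployment)",
    "contractual provisions with provider reviewed",
]

def unresolved_items(answers: dict) -> list:
    """Return checklist items not yet affirmatively resolved by legal review."""
    return [item for item in RIGHTS_CHECKLIST if not answers.get(item, False)]

# Partially completed review: two items confirmed, two outstanding.
answers = {
    "licence permits derivative works": True,
    "method consistent with licence (fine-tuning vs distillation)": True,
}
print(unresolved_items(answers))
```

Distillation commences only when the unresolved list is empty; anything missing or answered negatively (including items simply not yet reviewed) keeps the gate closed.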
Recommended patterns:
Anti-patterns to avoid:
Financial Services. Distilled models for financial applications must be independently validated per PRA SS1/23, regardless of the teacher model's validation status. The provenance record must include the teacher model information so that model risk management can assess the complete dependency chain.
Healthcare. Distilled clinical AI models may require separate regulatory clearance from the teacher model. The distillation provenance record should include the safety transfer assessment to support regulatory submissions.
Defence. Distillation from frontier models may involve export control considerations. The capability transfer profile of the student model determines its control classification, which may differ from the teacher's.
Basic Implementation — Distillation operations are documented at a project level, recording the teacher model and general method. Safety transfer assessment is ad hoc. Rights implications are assessed informally. The distillation provenance is tied to the project team rather than the model registry. This level provides basic awareness but lacks systematic tracking and may not survive team turnover.
Intermediate Implementation — A standardised distillation provenance template is completed for every distillation. The model registry records teacher-student relationships. Safety transfer is assessed using a defined evaluation suite. Rights assessment is performed before distillation commences. Capability transfer scorecards document the student's profile relative to the teacher. The organisation can trace provenance for any distilled model in its inventory.
Advanced Implementation — All intermediate capabilities plus: a distillation lineage graph enables transitive provenance queries across multiple generations. Safety-aware distillation techniques are standard practice. Automated rights verification checks teacher licences before distillation. Capability transfer scorecards are standardised and comparable across distillation operations. The organisation can assess the impact of a teacher model defect across all downstream students within hours.
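The transitive provenance queries described above (and required in both directions by 4.4) amount to graph traversal over teacher→student edges. A minimal sketch, with invented model names and a plain adjacency dict standing in for the model registry:

```python
# Lineage graph: teacher id -> directly derived student ids.
LINEAGE = {
    "frontier-a": ["student-a1", "student-a2"],
    "student-a1": ["edge-a1x"],   # second-generation distillation
    "frontier-b": ["student-b1"],
}

def all_descendants(teacher: str) -> set:
    """Every model transitively derived from `teacher` (blast-radius query)."""
    out, stack = set(), [teacher]
    while stack:
        for child in LINEAGE.get(stack.pop(), []):
            if child not in out:
                out.add(child)
                stack.append(child)
    return out

def all_ancestors(student: str) -> set:
    """Every teacher transitively upstream of `student` (provenance query)."""
    return {t for t in LINEAGE if student in all_descendants(t)}

# If frontier-a is found defective, these are the affected downstream models:
print(sorted(all_descendants("frontier-a")))  # → ['edge-a1x', 'student-a1', 'student-a2']
```

This is exactly the query that answers "which of our models are affected?" within hours rather than weeks when a teacher defect is discovered.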
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Distillation Provenance Completeness
Test 8.2: Safety Transfer Assessment
Test 8.3: Rights Assessment Before Distillation
Test 8.4: Teacher-Student Relationship Traceability
Test 8.5: Independent Student Evaluation
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| EU AI Act | Article 53 (Transparency for GPAI Models) | Direct requirement |
| Copyright, Designs and Patents Act 1988 (UK) | Derivative Works Provisions | Supports compliance |
| PRA SS1/23 | Model Risk Management — Model Inventory and Documentation | Direct requirement |
| NIST AI RMF | GOVERN 1.4, MAP 2.3, MANAGE 1.3 | Supports compliance |
| ISO 42001 | Clause 8.2 (AI Risk Assessment), Clause 8.4 (AI System Operation) | Supports compliance |
Article 53 requires providers of general-purpose AI models to provide transparency about training data and model development. For distilled models, this transparency must include the relationship to the teacher model and the distillation process. A provider that cannot document the teacher model, the distillation method, and the resulting capability profile cannot meet the transparency requirements. The training data summary template referenced in Article 53 will likely require disclosure of synthetic data generated by teacher models, making distillation provenance a practical necessity for compliance.
PRA SS1/23 expects firms to maintain a comprehensive model inventory with documentation of model development, dependencies, and limitations. A distilled model's documentation must include its teacher model dependency. The model risk management function must be able to assess the risk introduced by the distillation process — including safety degradation, rights risk, and capability gaps. Without distillation provenance, the model inventory is incomplete and the model risk assessment is uninformed.
Under UK copyright law, a distilled model may constitute a derivative work of the teacher model. The rights basis for creating the derivative must be established. This is particularly relevant when the teacher model is accessed through an API with restrictive terms of service. AG-343's rights assessment requirement directly supports compliance with copyright obligations by ensuring that the legal basis for distillation is assessed and documented before the distillation occurs.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Multi-model — a provenance failure in one teacher model can cascade to all student models derived from it |
Consequence chain: Distillation provenance failures create two primary risk chains. First, safety inheritance failure: a distilled model deployed with undocumented safety degradation causes harm that the teacher model would have prevented. The 12,000 edge devices in Scenario A serving harmful content illustrate this — the £1.8 million safety investment in the teacher was largely wasted because distillation did not preserve it. The blast radius is proportional to the student's deployment scale, which is often larger than the teacher's (that being the point of distillation). Second, rights chain failure: a distilled model deployed without rights assessment faces legal challenge. The model destruction order in Scenario B (£2.3 million in damages plus £600,000 in legal costs) illustrates the financial impact. Because distillation often involves frontier models with restrictive terms, the rights risk is concentrated and non-trivial. The cascading nature of distillation risk is particularly concerning: if a widely used teacher model is found to have a defect (safety, rights, or data quality), every student model derived from it is potentially affected. Without distillation lineage tracking, the organisation cannot even identify which of its models are affected.
Cross-references: AG-048 (AI Model Provenance and Integrity) provides the broader provenance framework that AG-343 specialises for distillation. AG-340 (Training Corpus Rights Governance) covers the rights dimension of training data including teacher-generated synthetic data. AG-024 (Authorised Learning Governance) governs the authorisation of learning activities including distillation. AG-339 through AG-348 form the sibling landscape for Model Provenance, Training & Adaptation.