Distillation Provenance Governance requires that organisations document and track the complete provenance chain for any model produced through knowledge distillation — recording the teacher model(s), the distillation dataset (whether real or synthetically generated by the teacher), the distillation method, the rights implications, and the capability and safety profile of the resulting student model relative to the teacher. Distillation transfers knowledge from a larger teacher model to a smaller student model, but it does not transfer that knowledge cleanly or completely: capabilities are selectively preserved, safety properties may be lost, biases may be amplified, and the rights status of teacher-generated synthetic training data is a novel and unsettled legal question. Without provenance governance, distilled models operate with unknown inheritance from unknown sources.
Scenario A — Distilled Model Loses Safety Alignment: An organisation distils a 70-billion-parameter safety-aligned model into a 7-billion-parameter student model for edge deployment. The distillation focuses on task-specific performance, using teacher outputs on a domain-relevant dataset. The student achieves 91% of the teacher's task accuracy — a success by the stated metric. However, the safety alignment that took 6 months and £1.8 million to develop in the teacher transfers at only 62% effectiveness. The student model, deployed to 12,000 edge devices, responds to harmful prompts that the teacher would have refused. The degradation is discovered when a user posts screenshots of harmful outputs on social media.
What went wrong: Distillation provenance did not include safety property transfer assessment. The distillation focused on task accuracy without measuring safety regression. No documentation recorded that the teacher's £1.8 million safety alignment investment was expected to partially transfer to the student. No evaluation compared the student's safety profile against the teacher's. Consequence: Public relations crisis, emergency recall of edge deployment (£450,000), re-distillation with safety-aware objective (£280,000), and reputational damage requiring sustained communications effort.
Scenario B — Distillation from Unlicensed Teacher: A startup distils a student model from a proprietary teacher model accessed through a commercial API. The API terms of service prohibit using model outputs "to train or improve competing models." The startup argues that distillation is "learning from examples, not copying" and proceeds. The teacher model provider detects the distillation pattern (systematic querying across capability categories with structured outputs) and files suit. The court finds that the student model is a derivative work of the teacher model and that the API ToS prohibition is enforceable.
What went wrong: No distillation provenance governance assessed the rights implications of using the teacher model's outputs as training data. The API ToS was not reviewed by legal. No documentation recorded the teacher model's identity or licence terms. The organisation could not demonstrate that the distillation was authorised. Consequence: Court order to destroy the student model, £2.3 million in damages, £600,000 in legal costs, and loss of the product built on the student model.
Scenario C — Multi-Teacher Distillation Creates Incoherent Student: An organisation performs multi-teacher distillation, training a student model on outputs from three different teacher models: one optimised for accuracy, one for safety, and one for conversational fluency. The resulting student exhibits inconsistent behaviour — accurate but unsafe on some topics, safe but inaccurate on others, and fluent but neither accurate nor safe on a third category. The inconsistency stems from the student learning contradictory patterns from teachers with different alignment objectives. No distillation provenance document recorded the multi-teacher approach, the expected interaction effects, or the evaluation strategy for coherence.
What went wrong: Multi-teacher distillation was treated as an additive process ("accuracy + safety + fluency = all three"). The interaction effects between teachers with different alignment objectives were not assessed. No provenance document recorded the teacher composition or the rationale for combining them. The student's incoherent behaviour reflects the incoherent combination, not a training failure. Consequence: Unusable model requiring full re-distillation with a coherent teacher strategy, £340,000 in wasted compute and engineering time.
Scope: This dimension applies to any model produced through knowledge distillation, including: standard teacher-student distillation, multi-teacher distillation, self-distillation, progressive distillation, and any technique where a student model is trained on outputs (logits, embeddings, or generated text) produced by one or more teacher models. It also applies to synthetic data generation where a teacher model generates training examples for a student — this is a form of distillation even when not labelled as such. The scope extends to distilled models received from third parties: if an organisation deploys a model that was produced through distillation, it should obtain provenance information from the provider.
4.1. A conforming system MUST maintain a distillation provenance record for every distilled model, documenting: the teacher model(s) identity and version, the distillation method, the distillation dataset (source and composition), the rights basis for using teacher outputs as training data, and the date and responsible party for the distillation.
4.2. A conforming system MUST evaluate the student model against the teacher model's safety and alignment properties, documenting which properties transferred, which degraded, and by how much.
4.3. A conforming system MUST assess and document the rights implications of the distillation, including whether the use of teacher model outputs as training data is authorised under applicable licences, terms of service, and intellectual property law.
4.4. A conforming system MUST record the relationship between teacher and student in the model registry, enabling provenance queries in both directions (given a teacher, find all students; given a student, find all teachers).
4.5. A conforming system MUST evaluate distilled models as independent models for deployment approval — meeting the teacher's approval criteria does not satisfy the student's approval requirements.
4.6. A conforming system SHOULD assess capability transfer rates across key dimensions (accuracy, safety, reasoning, factuality) and document which capabilities transferred effectively and which did not.
4.7. A conforming system SHOULD maintain distillation configuration records sufficient to reproduce the distillation if needed (e.g., temperature, layer mapping, loss function, dataset).
4.8. A conforming system MAY implement automated distillation provenance verification that checks teacher licence compatibility before distillation commences.
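The automated pre-distillation check described in 4.8 can be sketched as a simple gate: look up the teacher's recorded licence terms and refuse to start the job if distillation or the intended use is not permitted. The registry entries, model identifiers, and field names below are illustrative assumptions, not a mandated schema.

```python
# Hypothetical licence registry: model id -> recorded licence terms.
# In practice these entries would be populated from legal review (see 4.3).
TEACHER_LICENCES = {
    "teacher-70b-v2": {"permits_distillation": True, "commercial_use": True},
    "api-frontier-x": {"permits_distillation": False, "commercial_use": True},
}

def check_distillation_allowed(teacher_id: str, commercial: bool) -> tuple[bool, str]:
    """Return (allowed, reason) for a proposed distillation from teacher_id."""
    terms = TEACHER_LICENCES.get(teacher_id)
    if terms is None:
        # unknown teacher: fail closed and route to rights assessment
        return False, f"no licence record for {teacher_id}; rights assessment required"
    if not terms["permits_distillation"]:
        return False, "licence/ToS prohibits training on model outputs"
    if commercial and not terms["commercial_use"]:
        return False, "commercial deployment not permitted under teacher licence"
    return True, "licence terms permit the proposed distillation"

allowed, reason = check_distillation_allowed("api-frontier-x", commercial=True)
print(allowed, reason)  # → False licence/ToS prohibits training on model outputs
```

Note the fail-closed behaviour for unrecorded teachers: an absent licence record blocks the job rather than allowing it by default, which is what makes the check useful as a governance gate.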
Knowledge distillation is becoming a dominant paradigm for producing deployable AI models. Frontier models are too large and expensive for most production workloads; distillation creates smaller, faster models that capture a subset of the frontier model's capabilities. But distillation is not lossless compression — it is a selective transfer that can drop critical properties while preserving task performance.
The provenance challenge is unique because distillation introduces an indirect dependency. A traditionally trained model depends on its training data. A distilled model depends on its teacher model and, transitively, on the teacher's training data. If the teacher was trained on problematic data, that problem may propagate to the student through distilled knowledge. If the teacher's safety alignment was imperfect, the student's may be worse. The student inherits the teacher's strengths and weaknesses, but unevenly and unpredictably.
The rights question is particularly acute. When an organisation queries a teacher model to generate training data for a student, the generated data is a derivative of the teacher. Whether this constitutes copyright infringement, licence violation, or trade secret misappropriation depends on the specific facts and jurisdiction. The legal landscape is evolving rapidly, with multiple active lawsuits addressing precisely this question. Organisations that distil without documenting and assessing the rights basis are accumulating unquantified legal risk.
The safety transfer problem is well-documented in research. Safety alignment is typically applied through RLHF, DPO, or constitutional methods during the final stages of teacher training. This alignment is encoded in subtle weight modifications that distillation may not preserve — especially when distillation targets task accuracy rather than alignment transfer. Research has shown safety refusal rates dropping 20-40% through naive distillation, creating student models that are measurably less safe than their teachers.
Distillation provenance document. Create a standardised template for distillation provenance that includes: teacher model identifier(s) and version(s), teacher model licence and terms of use (with legal review of distillation rights), distillation method (standard KD, progressive, self-distillation, etc.), distillation dataset description (real data, synthetic data from teacher, or hybrid), student model architecture, distillation configuration (temperature, loss function, training epochs, etc.), expected capability transfer profile, expected safety transfer profile, evaluation plan for the student, and responsible party and date.
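One minimal way to make the template machine-readable is a typed record whose fields mirror the list above; the field names and example values here are illustrative, not a prescribed schema.

```python
from dataclasses import dataclass, field

@dataclass
class DistillationProvenance:
    teacher_ids: list[str]      # teacher model identifier(s) and version(s)
    teacher_licence: str        # licence / terms of use, after legal review
    method: str                 # e.g. "standard KD", "progressive", "self-distillation"
    dataset: str                # real data, synthetic-from-teacher, or hybrid
    student_architecture: str
    config: dict = field(default_factory=dict)  # temperature, loss function, epochs, ...
    expected_capability_transfer: dict = field(default_factory=dict)
    expected_safety_transfer: dict = field(default_factory=dict)
    evaluation_plan: str = ""
    responsible_party: str = ""
    date: str = ""

# Example record for a single-teacher distillation (values invented).
record = DistillationProvenance(
    teacher_ids=["teacher-70b-v2.1"],
    teacher_licence="open-weight licence, distillation permitted",
    method="standard KD",
    dataset="synthetic-from-teacher",
    student_architecture="7B decoder-only",
    config={"temperature": 2.0, "loss": "KL + CE", "epochs": 3},
    responsible_party="ml-platform-team",
    date="2025-01-15",
)
```

Keeping the record as structured data rather than free text is what later enables the registry relationships of 4.4 and the reproducibility requirement of 4.7.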
Safety transfer assessment. Establish a protocol for evaluating safety transfer specifically. Before distillation, run the teacher model through the organisation's standard safety evaluation suite and record baseline scores. After distillation, run the student through the same suite. Compare results. If safety metrics degrade by more than a defined threshold (e.g., refusal rate drops by more than 5 percentage points), flag the student for additional safety work before deployment. Common thresholds: refusal rate degradation no more than 5 percentage points, bias amplification no more than 10% relative, and factual accuracy degradation no more than 3 percentage points.
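The threshold check above can be expressed directly in code. The metric names and threshold values below mirror the examples in the text; the evaluation scores themselves are invented for illustration and would come from the organisation's safety suite.

```python
# Thresholds from the protocol above: percentage-point degradation limits for
# refusal rate and factual accuracy, relative (%) limit for bias amplification.
THRESHOLDS = {
    "refusal_rate": 5.0,
    "factual_accuracy": 3.0,
    "bias_score": 10.0,
}

def flag_safety_regressions(teacher: dict, student: dict) -> list:
    """Compare student scores to teacher baselines; return metrics breaching thresholds."""
    flags = []
    # absolute (percentage-point) degradation checks — higher score is better
    for metric in ("refusal_rate", "factual_accuracy"):
        drop = teacher[metric] - student[metric]
        if drop > THRESHOLDS[metric]:
            flags.append((metric, round(drop, 1)))
    # relative amplification check for bias — higher score is worse
    rel_increase = 100.0 * (student["bias_score"] - teacher["bias_score"]) / teacher["bias_score"]
    if rel_increase > THRESHOLDS["bias_score"]:
        flags.append(("bias_score", round(rel_increase, 1)))
    return flags

# Illustrative scores echoing Scenario A: refusal rate collapses in the student.
teacher = {"refusal_rate": 97.0, "factual_accuracy": 88.0, "bias_score": 0.20}
student = {"refusal_rate": 62.0, "factual_accuracy": 86.5, "bias_score": 0.21}
print(flag_safety_regressions(teacher, student))  # → [('refusal_rate', 35.0)]
```

A non-empty result blocks deployment and routes the student to additional safety work, per the protocol.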
Rights assessment workflow. Before distillation commences, route the proposed distillation through a rights assessment workflow. The workflow should verify: the teacher model's licence permits derivative works (many open-weight licences do; many API ToS do not), the distillation method is consistent with the licence (e.g., some licences distinguish between fine-tuning and distillation), the intended use of the student model is permitted under the teacher's licence (e.g., commercial deployment restrictions), and any contractual provisions with the teacher model provider have been reviewed.
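The workflow's four verification points can be tracked as an explicit checklist. The answers come from legal review, not code, so the sketch below (item wording and function names are illustrative) only aggregates the outcome and reports what remains unresolved.

```python
# Checklist items mirroring the four verification points above.
RIGHTS_CHECKLIST = [
    "licence permits derivative works",
    "method consistent with licence (fine-tuning vs distillation)",
    "intended student use permitted (e.g. commercial deployment)",
    "contractual provisions with provider reviewed",
]

def unresolved_items(answers: dict) -> list:
    """Return checklist items not yet affirmatively resolved by legal review."""
    return [item for item in RIGHTS_CHECKLIST if not answers.get(item, False)]

# Partially completed review: two items confirmed, two outstanding.
answers = {
    "licence permits derivative works": True,
    "method consistent with licence (fine-tuning vs distillation)": True,
}
print(unresolved_items(answers))
```

Distillation commences only when the unresolved list is empty; anything missing or answered negatively (including items simply not yet reviewed) keeps the gate closed.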
Recommended patterns:
Anti-patterns to avoid:
Financial Services. Distilled models for financial applications must be independently validated per PRA SS1/23, regardless of the teacher model's validation status. The provenance record must include the teacher model information so that model risk management can assess the complete dependency chain.
Healthcare. Distilled clinical AI models may require separate regulatory clearance from the teacher model. The distillation provenance record should include the safety transfer assessment to support regulatory submissions.
Defence. Distillation from frontier models may involve export control considerations. The capability transfer profile of the student model determines its control classification, which may differ from the teacher's.
Basic Implementation — Distillation operations are documented at a project level, recording the teacher model and general method. Safety transfer assessment is ad hoc. Rights implications are assessed informally. The distillation provenance is tied to the project team rather than the model registry. This level provides basic awareness but lacks systematic tracking and may not survive team turnover.
Intermediate Implementation — A standardised distillation provenance template is completed for every distillation. The model registry records teacher-student relationships. Safety transfer is assessed using a defined evaluation suite. Rights assessment is performed before distillation commences. Capability transfer scorecards document the student's profile relative to the teacher. The organisation can trace provenance for any distilled model in its inventory.
Advanced Implementation — All intermediate capabilities plus: a distillation lineage graph enables transitive provenance queries across multiple generations. Safety-aware distillation techniques are standard practice. Automated rights verification checks teacher licences before distillation. Capability transfer scorecards are standardised and comparable across distillation operations. The organisation can assess the impact of a teacher model defect across all downstream students within hours.
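The transitive provenance queries described above (and required in both directions by 4.4) amount to graph traversal over teacher→student edges. A minimal sketch, with invented model names and a plain adjacency dict standing in for the model registry:

```python
# Lineage graph: teacher id -> directly derived student ids.
LINEAGE = {
    "frontier-a": ["student-a1", "student-a2"],
    "student-a1": ["edge-a1x"],   # second-generation distillation
    "frontier-b": ["student-b1"],
}

def all_descendants(teacher: str) -> set:
    """Every model transitively derived from `teacher` (blast-radius query)."""
    out, stack = set(), [teacher]
    while stack:
        for child in LINEAGE.get(stack.pop(), []):
            if child not in out:
                out.add(child)
                stack.append(child)
    return out

def all_ancestors(student: str) -> set:
    """Every teacher transitively upstream of `student` (provenance query)."""
    return {t for t in LINEAGE if student in all_descendants(t)}

# If frontier-a is found defective, these are the affected downstream models:
print(sorted(all_descendants("frontier-a")))  # → ['edge-a1x', 'student-a1', 'student-a2']
```

This is exactly the query that answers "which of our models are affected?" within hours rather than weeks when a teacher defect is discovered.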
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Distillation Provenance Completeness
Test 8.2: Safety Transfer Assessment
Test 8.3: Rights Assessment Before Distillation
Test 8.4: Teacher-Student Relationship Traceability
Test 8.5: Independent Student Evaluation
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| EU AI Act | Article 53 (Transparency for GPAI Models) | Direct requirement |
| Copyright, Designs and Patents Act 1988 (UK) | Derivative Works Provisions | Supports compliance |
| PRA SS1/23 | Model Risk Management — Model Inventory and Documentation | Direct requirement |
| NIST AI RMF | GOVERN 1.4, MAP 2.3, MANAGE 1.3 | Supports compliance |
| ISO 42001 | Clause 8.2 (AI Risk Assessment), Clause 8.4 (AI System Operation) | Supports compliance |
Article 53 requires providers of general-purpose AI models to provide transparency about training data and model development. For distilled models, this transparency must include the relationship to the teacher model and the distillation process. A provider that cannot document the teacher model, the distillation method, and the resulting capability profile cannot meet the transparency requirements. The training data summary template referenced in Article 53 will likely require disclosure of synthetic data generated by teacher models, making distillation provenance a practical necessity for compliance.
PRA SS1/23 expects firms to maintain a comprehensive model inventory with documentation of model development, dependencies, and limitations. A distilled model's documentation must include its teacher model dependency. The model risk management function must be able to assess the risk introduced by the distillation process — including safety degradation, rights risk, and capability gaps. Without distillation provenance, the model inventory is incomplete and the model risk assessment is uninformed.
Under UK copyright law, a distilled model may constitute a derivative work of the teacher model. The rights basis for creating the derivative must be established. This is particularly relevant when the teacher model is accessed through an API with restrictive terms of service. AG-343's rights assessment requirement directly supports compliance with copyright obligations by ensuring that the legal basis for distillation is assessed and documented before the distillation occurs.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Multi-model — a provenance failure in one teacher model can cascade to all student models derived from it |
Consequence chain: Distillation provenance failures create two primary risk chains. First, safety inheritance failure: a distilled model deployed with undocumented safety degradation causes harm that the teacher model would have prevented. The 12,000 edge devices in Scenario A serving harmful content illustrate this — the £1.8 million safety investment in the teacher was largely wasted because distillation did not preserve it. The blast radius is proportional to the student's deployment scale, which is often larger than the teacher's (that being the point of distillation). Second, rights chain failure: a distilled model deployed without rights assessment faces legal challenge. The model destruction order in Scenario B (£2.3 million in damages plus £600,000 in legal costs) illustrates the financial impact. Because distillation often involves frontier models with restrictive terms, the rights risk is concentrated and non-trivial. The cascading nature of distillation risk is particularly concerning: if a widely used teacher model is found to have a defect (safety, rights, or data quality), every student model derived from it is potentially affected. Without distillation lineage tracking, the organisation cannot even identify which of its models are affected.
Cross-references: AG-048 (AI Model Provenance and Integrity) provides the broader provenance framework that AG-343 specialises for distillation. AG-340 (Training Corpus Rights Governance) covers the rights dimension of training data including teacher-generated synthetic data. AG-024 (Authorised Learning Governance) governs the authorisation of learning activities including distillation. AG-339 through AG-348 form the sibling landscape for Model Provenance, Training & Adaptation.