Fine-Tune Objective Documentation Governance requires that every fine-tuning operation be preceded by a formal, documented statement of intent. That statement must specify the business objective, the behavioural hypothesis being tested, the expected capability trade-offs, the evaluation criteria for success and failure, and the rollback conditions. Fine-tuning is not a routine operational task — it is a deliberate modification of model behaviour that can introduce regressions, amplify biases, degrade safety properties, or create capabilities the organisation did not intend. Without documented objectives, organisations cannot evaluate whether a fine-tune achieved its purpose, cannot detect unintended behavioural changes, and cannot make principled decisions about whether to deploy, iterate, or abandon a fine-tuned model.
Scenario A — Fine-Tuning Degrades Safety Without Detection: A customer-facing agent is fine-tuned to be "more helpful" based on 50,000 human preference labels that reward longer, more detailed responses. The fine-tuning succeeds on its stated metrics: average response length increases by 40% and helpfulness ratings improve by 12%. However, no documented objective specified which safety properties must be preserved. Post-fine-tuning evaluation reveals that the model's refusal rate for harmful requests dropped from 97.3% to 84.1% — because the preference data implicitly rewarded compliance over refusal. The degradation is discovered three weeks after deployment, during which the agent has served 1.2 million customer interactions.
What went wrong: The fine-tuning objective was defined in terms of a single metric (helpfulness) without specifying trade-off boundaries for safety properties. No documentation stated which capabilities must not degrade. No evaluation criteria for safety refusal rates were defined before fine-tuning commenced. Consequence: 1.2 million interactions served by a model with degraded safety properties, potential regulatory scrutiny for deploying a model that fails to refuse harmful requests, and remediation cost of £340,000 for emergency rollback, re-evaluation, and re-fine-tuning with safety-aware objectives.
Scenario B — Objective Drift Across Iterative Fine-Tuning: A financial-services agent undergoes five sequential fine-tuning rounds over four months. Each round is documented individually, but no overarching objective document tracks the cumulative intent. Round 1 targets accuracy on structured financial data. Round 2 improves conversational fluency. Round 3 adds regulatory compliance phrasing. Round 4 adjusts tone for enterprise customers. Round 5 optimises for response latency. By round 5, the model's accuracy on structured financial data — the original priority — has degraded by 8.7 percentage points because rounds 2 through 5 each slightly eroded the gains from round 1. No one noticed because each round was evaluated against its own objective, not against the cumulative objective stack.
What went wrong: No cumulative objective document tracked the expected behaviour across all fine-tuning rounds. Each round's evaluation criteria were local to that round. No regression testing against prior round objectives was mandated. The cumulative effect of five optimisations was a net degradation of the most critical capability. Consequence: Deployed agent provides inaccurate financial data to enterprise customers for six weeks before discovery, resulting in two client complaints, one regulatory inquiry, and £180,000 in remediation costs.
Scenario C — Undocumented Fine-Tune Makes Audit Impossible: A regulator asks an organisation to explain why its AI credit scoring agent rejects loan applications from a specific demographic group at a rate 2.3x higher than the population average. The agent was fine-tuned eight months ago on a proprietary dataset. No fine-tune objective document exists. The data scientist who performed the fine-tuning has left the organisation. The training configuration files are on a decommissioned server. The organisation cannot explain: what the fine-tuning intended to achieve, what data was used, what evaluation criteria were applied, or whether the observed demographic disparity was an anticipated trade-off, an unintended consequence, or a pre-existing bias amplified by fine-tuning.
What went wrong: No documented objective existed. No evaluation criteria were specified before fine-tuning. No artefact persisted beyond the individual who performed the work. The organisation has no institutional memory of why this fine-tune was done. Consequence: Regulatory finding for inability to explain model behaviour, potential enforcement action under equality legislation, and £2.4 million remediation programme including full model re-evaluation, retraining, and external audit.
Scope: This dimension applies to every fine-tuning, instruction-tuning, preference-tuning (RLHF, DPO, etc.), and continuous learning operation performed on a model that is intended for production deployment or pre-production evaluation with real data. It covers full fine-tuning (all parameters updated), parameter-efficient fine-tuning (LoRA, QLoRA, prefix tuning, etc.), and any operation that modifies model behaviour through gradient updates on new data. It does not apply to prompt engineering or in-context learning, which do not modify model weights. It does apply to operations performed by third parties on the organisation's behalf — the obligation to document objectives cannot be delegated to a vendor without the organisation retaining the documentation.
4.1. A conforming system MUST require a documented fine-tune objective before any fine-tuning operation commences, specifying: the business rationale, the behavioural hypothesis, the expected capability gains, the capability trade-offs accepted, the evaluation metrics and thresholds for success, and the conditions under which the fine-tuned model will be rolled back.
4.2. A conforming system MUST specify, in the fine-tune objective, which existing capabilities and safety properties MUST NOT degrade beyond defined thresholds, and include regression tests for those capabilities in the post-fine-tuning evaluation.
4.3. A conforming system MUST record the actual fine-tuning configuration (learning rate, number of epochs, batch size, data composition, parameter-efficiency method, and any regularisation techniques) alongside the objective, enabling reproducibility.
4.4. A conforming system MUST evaluate the fine-tuned model against all criteria specified in the objective document before any deployment decision, and record the evaluation results as a persistent artefact linked to the objective.
4.5. A conforming system MUST maintain fine-tune objective documents for the operational lifetime of the model plus the applicable retention period, independent of the individuals who performed the fine-tuning.
4.6. A conforming system SHOULD maintain a cumulative objective register for models that undergo multiple sequential fine-tuning rounds, tracking the intended capability stack and evaluating each round against the full cumulative objective set, not just the current round's objective.
4.7. A conforming system SHOULD require sign-off on the fine-tune objective from both a technical lead and a risk/governance representative before fine-tuning commences.
4.8. A conforming system MAY implement automated pre-fine-tuning checks that validate the objective document's completeness against a required schema before permitting the training pipeline to execute.
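The automated check described in 4.8 can be sketched as a simple schema gate that runs before the training pipeline is permitted to execute. This is a minimal illustration, not a prescribed implementation: the field names below are drawn from the template described elsewhere in this document, and a real schema would be organisation-specific and more detailed.

```python
# Sketch of an automated pre-fine-tuning completeness check (clause 4.8).
# Field names are illustrative, taken from the objective document template
# in this standard; a production schema would be organisation-specific.

REQUIRED_FIELDS = {
    "business_rationale",
    "behavioural_hypothesis",
    "success_metrics",        # quantitative thresholds (4.1)
    "regression_boundaries",  # capabilities that must not degrade (4.2)
    "dataset_summary",
    "configuration_plan",     # hyperparameters, method, compute budget (4.3)
    "rollback_criteria",
    "approvals",              # technical lead + risk/governance sign-off (4.7)
}

def validate_objective(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the gate passes."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - doc.keys())]
    # Success metrics must carry explicit numeric thresholds, not prose.
    for metric in doc.get("success_metrics", []):
        if "threshold" not in metric:
            problems.append(f"metric without threshold: {metric.get('name', '?')}")
    return problems

def gate_training(doc: dict) -> None:
    """Refuse to launch the training pipeline on an incomplete objective."""
    problems = validate_objective(doc)
    if problems:
        raise RuntimeError("objective document incomplete: " + "; ".join(problems))
```

The key design choice is that the gate rejects, rather than warns: an incomplete objective document blocks pipeline execution entirely, which is what converts the MAY in 4.8 into an enforceable control rather than advisory tooling.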
Fine-tuning is the primary mechanism by which general-purpose models are adapted to specific organisational contexts. It is also the primary mechanism by which model behaviour is degraded unintentionally. The literature on fine-tuning consistently demonstrates that optimising for one capability frequently degrades others — a phenomenon known as catastrophic forgetting in its extreme form, and more subtly as capability trade-off in its common form. Safety alignment, in particular, is fragile under fine-tuning: multiple research studies have shown that as few as 100 carefully selected examples can substantially degrade a model's safety refusal behaviour.
Despite these known risks, fine-tuning in practice is often treated as a routine operational task — "we'll fine-tune it on our data and see if it gets better." This casual approach fails because it defines no baseline, specifies no trade-off boundaries, and establishes no criteria for success or failure. Without a documented objective, the organisation cannot answer the most basic governance question: "Did this fine-tune do what we intended?"
The documentation requirement serves three functions. First, it forces intentionality: writing down the objective before fine-tuning requires the team to think carefully about what they are trying to achieve, what they are willing to sacrifice, and how they will know if they succeeded. Second, it enables evaluation: post-fine-tuning assessment is meaningful only when measured against pre-defined criteria — otherwise, the team selects metrics post hoc that make the result look favourable. Third, it enables auditability: regulators, auditors, and future team members can understand why a model behaves the way it does by reading the documented chain of fine-tuning objectives.
Fine-tune objective document template. Establish a standardised template that every fine-tuning operation must complete. The template should include: business rationale (why this fine-tune is needed), behavioural hypothesis (what specific behaviour change is expected), success metrics (quantitative thresholds the fine-tuned model must meet), regression boundaries (which existing capabilities must not degrade beyond specified thresholds), dataset summary (what data will be used and its rights status per AG-340), configuration plan (proposed hyperparameters, method, compute budget), rollback criteria (conditions under which the fine-tuned model will be abandoned in favour of the prior version), and approval signatures (technical lead, risk/governance representative).
Cumulative objective register. For models undergoing iterative fine-tuning, maintain a register that stacks objectives. When round 3 of fine-tuning is proposed, the register shows objectives from rounds 1 and 2, and round 3's evaluation must include regression tests against rounds 1 and 2's success criteria. This prevents the silent erosion of earlier capabilities.
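The stacking behaviour of the register can be sketched as follows. This is an illustrative sketch under the assumption that all criteria are higher-is-better scores; the class and method names are hypothetical, not part of any named library.

```python
# Sketch of a cumulative objective register: each round's evaluation
# includes the success criteria of every prior round, so earlier gains
# cannot be silently eroded (the Scenario B failure mode).
# Assumes higher-is-better metrics; names are illustrative.

class ObjectiveRegister:
    def __init__(self) -> None:
        self.rounds: list[tuple[str, dict[str, float]]] = []

    def add_round(self, name: str, criteria: dict[str, float]) -> None:
        self.rounds.append((name, criteria))

    def cumulative_criteria(self) -> dict[str, float]:
        """All criteria from all rounds; a later round may tighten,
        but never relax, an earlier round's threshold."""
        merged: dict[str, float] = {}
        for _, criteria in self.rounds:
            for name, threshold in criteria.items():
                merged[name] = max(merged.get(name, threshold), threshold)
        return merged

    def evaluate(self, results: dict[str, float]) -> list[str]:
        """Return every criterion (from any round) the model now fails."""
        return [
            f"{name}: {results.get(name, float('-inf')):.3f} < {threshold}"
            for name, threshold in self.cumulative_criteria().items()
            if results.get(name, float("-inf")) < threshold
        ]
```

Under this scheme, a round-5 model that regresses on round 1's structured-data accuracy fails the evaluation even though round 5's own latency objective is met — exactly the failure that went undetected in Scenario B.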
Evaluation framework. Establish a standard evaluation framework that automatically runs when a fine-tuning pipeline completes. The framework should: load the objective document, run all specified success metrics, run all specified regression tests, compare results against thresholds, and produce a structured report showing pass/fail for each criterion. The deployment decision should be gated on this report.
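A minimal version of that gated report might look like the sketch below. The metric callables are stand-ins for the organisation's real evaluation suite, and the structure assumes the objective document reduces each criterion to a named threshold.

```python
# Sketch of the post-fine-tuning evaluation gate (clause 4.4): run each
# criterion from the objective document and produce a structured pass/fail
# report on which the deployment decision is gated. The metric functions
# are stand-ins for a real evaluation suite.

from dataclasses import dataclass
from typing import Callable

@dataclass
class CriterionResult:
    name: str
    observed: float
    threshold: float

    @property
    def passed(self) -> bool:
        return self.observed >= self.threshold

def evaluate_against_objective(
    objective: dict[str, float],
    metric_fns: dict[str, Callable[[], float]],
) -> dict:
    """`objective` maps criterion name -> threshold; `metric_fns` maps the
    same names -> callables that run the evaluation and return a score."""
    results = [
        CriterionResult(name, metric_fns[name](), threshold)
        for name, threshold in objective.items()
    ]
    return {
        "criteria": results,
        "deployable": all(r.passed for r in results),  # the 4.4 gate
    }
```

Note that `deployable` is all-or-nothing: a model that beats its helpfulness target but misses a safety regression boundary (as in Scenario A) is not deployable, which prevents post-hoc metric selection from rescuing a failed fine-tune.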
Recommended patterns:
Anti-patterns to avoid:
Financial Services. Fine-tuning objectives for financial models should explicitly address regulatory requirements: fair lending compliance (no prohibited discriminatory outcomes), explainability requirements (can the organisation explain why the fine-tuned model makes specific decisions?), and model risk management obligations (PRA SS1/23 expectations for documentation of model changes). The fine-tune objective document may need to be submitted to model risk management governance for review.
Healthcare. Fine-tuning of clinical AI models may constitute a significant change requiring regulatory notification under MDR/IVDR. The fine-tune objective document should assess whether the change requires notification to the relevant notified body and should document the clinical risk assessment for the behavioural changes introduced.
Safety-Critical Systems. Fine-tuning of models used in safety-critical contexts (autonomous vehicles, industrial control, aviation) should require formal safety assessment of the proposed behavioural changes before fine-tuning commences. The fine-tune objective should reference the relevant safety case and document how the proposed changes affect safety arguments.
Basic Implementation — Fine-tuning operations are accompanied by informal documentation (emails, tickets, wiki pages) describing the general intent. Evaluation is performed against ad hoc metrics selected by the data scientist performing the fine-tune. Regression testing is inconsistent. Documentation is often incomplete and difficult to locate after the fact. This level demonstrates awareness but lacks the rigour needed for governance: documentation may not precede fine-tuning, evaluation criteria may not be pre-specified, and regression boundaries are often absent.
Intermediate Implementation — A standardised fine-tune objective template is used for all fine-tuning operations. The template is completed and approved before fine-tuning commences. Post-fine-tuning evaluation runs against all specified metrics and regression boundaries. Results are recorded as persistent artefacts linked to the objective. A cumulative objective register tracks iterative fine-tuning. The organisation can retrieve the objective, configuration, and evaluation results for any fine-tuned model in its deployment inventory.
Advanced Implementation — All intermediate capabilities plus: the fine-tune objective is a required pipeline artefact that gates training execution. Automated evaluation suites run immediately upon fine-tuning completion, including a standing safety regression suite. A pre/post comparison dashboard visualises all behavioural changes, not just those in the objective. Deployment is gated on evaluation results meeting all thresholds. Independent review (e.g., model risk management committee) is required for fine-tunes affecting high-risk models. The organisation can demonstrate to regulators a complete, auditable chain of intent, execution, evaluation, and decision for every fine-tuning operation that contributed to any deployed model.
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Objective Document Completeness
Test 8.2: Regression Boundary Specification
Test 8.3: Evaluation Against Objective
Test 8.4: Cumulative Regression Detection
Test 8.5: Pipeline Gate Enforcement
Test 8.6: Objective Timestamp Integrity
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| EU AI Act | Article 10 (Data and Data Governance) | Supports compliance |
| PRA SS1/23 | Model Risk Management — Model Development | Direct requirement |
| FCA SYSC | 6.1.1R (Systems and Controls) | Supports compliance |
| NIST AI RMF | MAP 2.1, MAP 2.3, MANAGE 2.2 | Supports compliance |
| ISO 42001 | Clause 8.2 (AI Risk Assessment), Clause 8.4 (AI System Operation) | Supports compliance |
| FDA AI/ML SaMD | Predetermined Change Control Plan | Supports compliance |
PRA SS1/23 establishes supervisory expectations for model risk management at PRA-regulated firms. The guidance explicitly requires documentation of model development decisions, validation of model changes, and ongoing monitoring of model performance. Fine-tuning constitutes a model change. The expectation that firms document the rationale for model changes, validate changes before deployment, and monitor for performance degradation maps directly to AG-341's requirements for objective documentation, evaluation, and regression testing. A firm that fine-tunes a credit risk model without a documented objective and evaluation would face supervisory challenge under SS1/23.
The FDA's framework for AI/ML-based Software as a Medical Device includes the concept of a Predetermined Change Control Plan — a documented plan that specifies what changes the manufacturer intends to make and how those changes will be evaluated. Fine-tuning of clinical AI models falls within this framework. AG-341's fine-tune objective document is structurally aligned with the PCCP concept: it specifies the intended change, the evaluation criteria, and the conditions for rollback.
Article 9 requires a continuous, iterative risk management process. Fine-tuning that modifies model behaviour is a risk-relevant change that must be managed within the risk management system. AG-341 ensures that fine-tuning operations are documented, evaluated, and governed as risk management activities, not as routine engineering tasks that bypass governance.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Model-specific — affects all deployments of the fine-tuned model and all decisions made by those deployments |
Consequence chain: Undocumented fine-tuning creates two compounding risks. First, undetected behavioural degradation: without pre-specified regression boundaries, fine-tuning can silently erode safety properties, accuracy, fairness, or other critical capabilities. The degradation may not be detected until a downstream failure occurs — by which time the model may have made thousands or millions of decisions with degraded capability. A financial model with an 8.7-percentage-point accuracy degradation serving enterprise clients for six weeks could produce material financial harm. A customer-facing model with a 13-percentage-point drop in safety refusal rate serving 1.2 million interactions creates significant liability exposure. Second, audit impossibility: when a regulator or auditor asks "why does this model behave this way?", the absence of fine-tuning documentation makes it impossible to explain the model's development history. For regulated industries, inability to explain model behaviour is itself a compliance failure — distinct from whatever the actual behavioural issue is. The remediation cost depends on the severity: minor cases require re-evaluation and documentation reconstruction (£50,000-£200,000); major cases require full model retraining with proper documentation (£500,000-£5,000,000); worst cases involve regulatory enforcement action and service withdrawal.
Cross-references: AG-090 (Fine-Tune and Adapter Provenance) provides the technical provenance infrastructure within which fine-tune objectives are tracked. AG-048 (AI Model Provenance and Integrity) establishes the broader model provenance framework. AG-057 (Dataset Suitability and Bias Control) addresses the quality and bias properties of fine-tuning data. AG-340 (Training Corpus Rights Governance) covers the rights dimension of fine-tuning datasets. AG-339 through AG-348 form the sibling landscape for Model Provenance, Training & Adaptation.