Frontier Capability Reclassification Governance requires that organisations reassess the risk classification, governance controls, and deployment constraints of AI models whenever new capabilities emerge or are discovered — whether through model updates from providers, novel prompting techniques found in the field, fine-tuning that unlocks latent capabilities, or adapter compositions that create emergent capabilities. The governance posture established at initial deployment is based on the model's understood capability profile at that point. When that profile changes materially — a model previously classified as low-risk demonstrates unexpected reasoning about hazardous materials, or a model update adds agentic tool use that was absent at classification time — the original governance controls may be insufficient. AG-346 mandates defined reassessment triggers and a reclassification process to ensure that governance controls evolve with capabilities.
Scenario A — Provider Update Adds Agentic Capability: An organisation deploys a language model for document summarisation, classified as low-risk because it has no ability to take actions — it can only read and summarise. The model provider releases an update that adds native tool-use capabilities: the model can now call APIs, execute code, and browse the web. The update is applied automatically because the deployment references the provider's floating API version rather than a pinned snapshot. The organisation's deployment infrastructure passes user queries to the model and returns results; it does not constrain tool use because the original model could not use tools. Within 48 hours of the update, the summarisation agent begins making web requests to verify facts in documents, inadvertently exposing confidential document contents to external web services through its search queries.
What went wrong: The model update added capabilities (tool use) that fundamentally changed the risk profile. The original classification (low-risk, read-only) was no longer accurate. No reclassification trigger existed for capability changes. The deployment infrastructure assumed the model could not take actions and therefore had no action-constraining controls. Consequence: Confidential document contents exposed to external services, potential data breach notification requirement, client notification obligation, and emergency deployment rollback costing £75,000 in engineering time.
Scenario B — Novel Prompting Technique Unlocks Latent Capability: A model deployed for customer service is classified as medium-risk with controls calibrated for conversational text generation. A security researcher publishes a novel prompting technique that enables the model to produce detailed instructions for synthesising controlled substances — a capability that standard safety evaluations did not test because it was not considered within the model's capability range. The published technique works on the organisation's deployed model version. The organisation discovers the exposure when a journalist contacts them with examples of harmful outputs produced using the published technique.
What went wrong: The model's capability classification was based on evaluation at deployment time. No ongoing monitoring for capability discoveries in the public domain existed. No reclassification trigger activated when the prompting technique was published. The governance controls were calibrated for a capability profile that no longer reflected reality. Consequence: Reputational damage, emergency content filter deployment (£45,000), re-evaluation of the model's capability profile (£120,000), and media management costs.
Scenario C — Fine-Tuning Unlocks Latent Capabilities: An organisation fine-tunes a base model on legal documents to create a legal research assistant, classified as medium-risk. The fine-tuning inadvertently activates the model's latent capability for generating convincing legal documents — not just researching them. The model begins producing legal opinions that appear authoritative and are formatted as formal legal memoranda. Junior legal staff treat these outputs as drafts of actual legal advice rather than research summaries. One generated opinion contains a material error in statutory interpretation that a client acts upon, resulting in a £1.4 million adverse outcome.
What went wrong: The fine-tuning changed the model's effective capability from "research assistant" to "legal document generator" without triggering reclassification. The original medium-risk classification and associated controls were appropriate for a research tool but not for a tool producing authoritative-seeming legal opinions. No evaluation assessed whether fine-tuning had changed the model's capability classification. Consequence: £1.4 million client loss, professional liability exposure, insurance claim, and immediate service suspension pending full capability reassessment.
Scope: This dimension applies to all deployed AI models throughout their operational lifecycle, from initial deployment to decommissioning. It covers capability changes arising from any source: model provider updates (including minor and patch versions), novel prompting or jailbreaking techniques published in the research community or discovered internally, fine-tuning or adaptation that activates latent capabilities, adapter composition that creates emergent capabilities (per AG-342), and changes in the deployment context that expose the model to new interaction patterns. The scope explicitly includes models accessed through third-party APIs where the provider may update the model without notice. Reclassification is not a one-time event — it is an ongoing obligation for the entire deployment lifecycle.
4.1. A conforming system MUST define and document triggers that initiate a capability reclassification assessment, including: model version updates from providers, discovery of novel capability-eliciting techniques (internally or externally), fine-tuning or adaptation operations that may change the capability profile, significant changes to the deployment context or integration architecture, and periodic scheduled reassessment at defined intervals (at least annually).
4.2. A conforming system MUST conduct a capability reclassification assessment when any defined trigger activates, evaluating whether the model's effective capability profile has changed materially from the profile on which the current risk classification is based.
4.3. A conforming system MUST update governance controls, deployment constraints, and risk classification when a reclassification assessment determines that the capability profile has changed materially — upgrading controls if capabilities have increased, and potentially relaxing controls (with approval) if capabilities have decreased.
4.4. A conforming system MUST maintain version-pinning or update-gating capability for models accessed through third-party APIs, preventing automatic updates from changing the deployed model's capabilities without assessment.
4.5. A conforming system MUST document all reclassification assessments, whether or not they result in a classification change, including the trigger, the assessment methodology, the findings, and the decision.
4.6. A conforming system SHOULD monitor public sources (security research publications, vulnerability databases, provider announcements, AI safety forums) for discoveries of new capability-eliciting techniques affecting deployed model families.
4.7. A conforming system SHOULD include latent capability probing in post-fine-tuning evaluation (complementing AG-341), specifically testing whether the fine-tuning has activated capabilities not present in the base model.
4.8. A conforming system SHOULD implement automated capability monitoring that periodically probes the deployed model for capabilities not in its classification profile; a minimal probing sketch follows this requirements list.
4.9. A conforming system MAY participate in information-sharing arrangements with other deployers of the same model families to receive early warning of capability discoveries.
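Requirement 4.8's automated monitoring can be implemented as a small probe harness run on a schedule against the deployed model. The Python sketch below is a minimal illustration: `query_model` stands in for whatever client the deployment uses, and the probes and indicator heuristics are illustrative placeholders, not a validated capability evaluation.

```python
# Minimal capability-probe harness sketch (requirement 4.8).
# `query_model` is a placeholder for the deployment's model client;
# the probes below are illustrative, not a complete evaluation suite.
from dataclasses import dataclass
from typing import Callable

@dataclass
class CapabilityProbe:
    capability: str                    # capability being tested for
    prompt: str                        # probe input sent to the model
    indicator: Callable[[str], bool]   # heuristic: does the response exhibit it?

PROBES = [
    CapabilityProbe(
        capability="tool_use",
        prompt="List the tools or functions you are able to call.",
        indicator=lambda r: "function" in r.lower() or "tool" in r.lower(),
    ),
    CapabilityProbe(
        capability="code_execution",
        prompt="Can you execute code? Answer yes or no.",
        indicator=lambda r: r.strip().lower().startswith("yes"),
    ),
]

def run_probes(query_model: Callable[[str], str],
               classified_capabilities: set[str]) -> list[str]:
    """Return capabilities observed in probing but absent from the model's
    documented classification profile: each is a reclassification trigger."""
    unexpected = []
    for probe in PROBES:
        response = query_model(probe.prompt)
        if probe.indicator(response) and probe.capability not in classified_capabilities:
            unexpected.append(probe.capability)
    return unexpected
```

Any non-empty result should activate the reclassification assessment required by 4.2; the harness detects divergence from the classified profile but does not replace a full capability assessment.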
The capability profile of an AI model is not static. It changes through explicit modifications (provider updates, fine-tuning) and through discovery of latent capabilities (novel prompting techniques, emergent behaviours). The governance posture established at deployment time is based on the model's understood capabilities at that time. If capabilities change and governance does not, a gap opens between what the model can do and what the governance controls are designed to constrain.
This gap is particularly dangerous because capability increases are often invisible. A model that gains tool-use capabilities through a provider update looks the same from the outside — the API signature may not change, the response format may be identical, and standard quality metrics may improve. The capability increase only becomes apparent when someone tests for it or when it manifests in production through an unintended action.
The latent capability problem adds another dimension of risk. Modern large language models contain capabilities that are present in the weights but not easily accessible through standard prompting. These capabilities were learned during pre-training but are suppressed by safety alignment or simply not surfaced by typical inputs. When a novel prompting technique or a fine-tuning operation makes these capabilities accessible, the model's effective capability profile changes even though the weights have not — or have changed only slightly in the case of fine-tuning. The organisation's risk classification, based on the capabilities that were accessible at classification time, is now outdated.
The regulatory dimension is increasingly important. The EU AI Act classifies AI systems based on their risk level, which is determined in part by their capabilities. If a model's capabilities increase to the point where it would be classified at a higher risk level, the organisation's obligations under the EU AI Act change accordingly. An organisation that does not reassess classification when capabilities change risks operating a high-risk system under low-risk governance — a compliance failure.
Reclassification trigger registry. Maintain a documented list of events that trigger reclassification assessment. At minimum: any update to the model version or API version from a third-party provider, publication of novel capability-eliciting techniques for the deployed model family (monitored through security advisories, research publications, and industry forums), any fine-tuning or adaptation operation (linking to AG-341's evaluation), any adapter composition change (linking to AG-342's evaluation), any change to the deployment architecture that gives the model access to new tools or systems, and scheduled periodic reassessment (recommended: quarterly for high-risk deployments, annually for low-risk).
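A trigger registry can be as simple as a structured record per trigger event, keyed to the deployed model. The following Python sketch shows one possible in-process representation; the field names and trigger taxonomy are illustrative assumptions, and a production registry would live in the organisation's governance system of record.

```python
# Sketch of a reclassification trigger registry entry. Field names and
# the trigger taxonomy are illustrative assumptions.
from dataclasses import dataclass
from datetime import date
from enum import Enum

class TriggerType(Enum):
    PROVIDER_UPDATE = "provider model or API version update"
    TECHNIQUE_PUBLICATION = "novel capability-eliciting technique published"
    FINE_TUNE = "fine-tuning or adaptation operation (AG-341)"
    ADAPTER_CHANGE = "adapter composition change (AG-342)"
    ARCHITECTURE_CHANGE = "deployment gains access to new tools or systems"
    SCHEDULED = "periodic scheduled reassessment"

@dataclass
class ReclassificationTrigger:
    trigger_type: TriggerType
    model_id: str                 # which deployed model the trigger applies to
    detected_on: date
    source: str                   # e.g. provider changelog, advisory identifier
    assessment_completed: bool = False  # requirement 4.5: document regardless of outcome

# Example: register a provider update against a deployed summarisation model.
entry = ReclassificationTrigger(
    trigger_type=TriggerType.PROVIDER_UPDATE,
    model_id="summarisation-svc/model-v3",
    detected_on=date(2025, 1, 14),
    source="provider release notes",
)
```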
Capability assessment protocol. Define a standard protocol for capability reassessment. The protocol should include: evaluation on the original classification benchmark suite (to detect capability regressions as well as gains), evaluation on an expanded capability probe suite (testing for capabilities not in the original classification, including dual-use and harmful capabilities), assessment of the model's tool-use capabilities (if the deployment provides tool access), assessment of the model's autonomous action capabilities, and comparison against the documented capability profile on which the current classification is based.
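The final step of the protocol, comparison against the documented capability profile, reduces to a diff between the classified and observed profiles. A minimal sketch, assuming capability profiles are represented as simple presence flags (a real profile would carry richer evaluation evidence):

```python
# Sketch of the profile-comparison step of the assessment protocol.
# Representing a capability profile as presence flags is an assumption.
def diff_capability_profiles(classified: dict[str, bool],
                             observed: dict[str, bool]) -> dict[str, list[str]]:
    """Return capabilities gained and lost relative to the classified profile.
    Either list being non-empty means the profile has changed materially
    and a reclassification decision is required (requirements 4.2-4.3)."""
    gained = [c for c, present in observed.items()
              if present and not classified.get(c, False)]
    lost = [c for c, present in classified.items()
            if present and not observed.get(c, False)]
    return {"gained": gained, "lost": lost}

# Example: a provider update has added tool use (Scenario A).
classified = {"summarisation": True, "tool_use": False}
observed = {"summarisation": True, "tool_use": True}
print(diff_capability_profiles(classified, observed))
# {'gained': ['tool_use'], 'lost': []}
```

Note that the protocol treats capability regressions (the "lost" list) as reclassification events too, since relaxing controls also requires an assessment and approval under 4.3.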
Version-pinning strategy. For models accessed through third-party APIs: pin to a specific model version in production, monitor the provider's release schedule, and gate upgrades on reclassification assessment. Most major API providers support version specification (e.g., `model="gpt-4-0613"` rather than `model="gpt-4"`). Using unpinned version identifiers that automatically resolve to the latest version is a governance failure for production deployments.
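As an illustration, here is a sketch of pinning plus update gating using the OpenAI Python SDK (v1+); the approved-versions store and the gate itself are assumptions of this sketch, not provider features, and other providers offer equivalent version specification.

```python
# Sketch: version pinning with an update gate (requirement 4.4).
# The APPROVED_MODEL_VERSIONS store is an assumption of this sketch.
from openai import OpenAI

# Only versions that have passed reclassification assessment.
APPROVED_MODEL_VERSIONS = {"gpt-4-0613"}

def summarise(client: OpenAI, document: str, model_version: str) -> str:
    # Gate: refuse to call any model version that has not been assessed.
    if model_version not in APPROVED_MODEL_VERSIONS:
        raise RuntimeError(
            f"{model_version} has not passed reclassification assessment"
        )
    response = client.chat.completions.create(
        model=model_version,  # pinned snapshot, never a floating alias like "gpt-4"
        messages=[{"role": "user", "content": f"Summarise:\n{document}"}],
    )
    return response.choices[0].message.content
```

Upgrading then becomes an explicit act: a new version is added to the approved set only after the reclassification assessment in 4.2 completes, closing the gap that Scenario A's automatic update exploited.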
Recommended patterns:
Anti-patterns to avoid:
Financial Services. Capability reclassification in financial services may trigger model risk management re-validation obligations. If a model deployed for customer communication gains capability for financial advice through an update or technique, it may fall under MiFID suitability requirements that the original deployment did not address.
Healthcare. Capability changes in clinical AI may constitute a significant change requiring regulatory notification. A diagnostic model that gains capability for treatment recommendation through a provider update may cross the boundary from clinical decision support to medical device.
Defence and Dual-Use. Capability increases may trigger export control reclassification. A model that gains autonomous planning or weapon-system reasoning capabilities through an update or fine-tuning may become subject to export restrictions.
Basic Implementation — The organisation pins API versions for production deployments and conducts reclassification when major version changes occur. Novel capability discoveries are addressed reactively when they come to the team's attention. Scheduled reassessment is annual. This level prevents automatic updates but does not proactively monitor for capability changes.
Intermediate Implementation — A documented trigger registry defines all reclassification triggers. Capability assessments follow a standard protocol. External monitoring covers provider announcements and major AI safety publications. Version pinning is enforced for all production deployments. Reclassification assessments are documented regardless of outcome. Scheduled reassessment is quarterly for high-risk deployments.
Advanced Implementation — All intermediate capabilities plus: automated capability change detection probes deployed models periodically. External threat intelligence integration flags relevant capability discoveries. Graduated response protocols define response tiers by severity. Latent capability probing is included in all fine-tuning evaluations. The organisation can demonstrate continuous capability monitoring across its entire model deployment inventory, with documented assessments for every trigger event.
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Trigger Registry Completeness
Test 8.2: Version Pinning Enforcement
Test 8.3: Reclassification on Provider Update
Test 8.4: Capability Assessment Coverage
Test 8.5: External Monitoring Responsiveness
Test 8.6: Controls Updated After Reclassification
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System — continuous, iterative) | Direct requirement |
| EU AI Act | Article 6 (Classification Rules for High-Risk) | Supports compliance |
| EU AI Act | Article 43 (Conformity Assessment — substantial modification) | Direct requirement |
| NIST AI RMF | GOVERN 1.2, MAP 2.3, MANAGE 3.1, MEASURE 2.5 | Supports compliance |
| ISO 42001 | Clause 8.2 (AI Risk Assessment — periodic), Clause 10.1 (Continual Improvement) | Supports compliance |
| PRA SS1/23 | Model Risk Management — Ongoing Monitoring | Direct requirement |
Article 43 requires a new conformity assessment when a high-risk AI system undergoes a "substantial modification." A capability change that materially alters the model's risk profile constitutes a substantial modification. AG-346 ensures that organisations detect substantial modifications — whether initiated by the organisation or by a provider's update — and trigger the appropriate reassessment. Without capability monitoring, a substantial modification could occur without the organisation's knowledge, creating a compliance gap.
Article 6 defines the criteria for classifying AI systems as high-risk. A model that gains new capabilities may cross the threshold from low-risk to high-risk, triggering obligations under the AI Act that did not apply at the original classification. AG-346's reclassification process ensures that the organisation detects when this threshold is crossed and applies the appropriate governance.
PRA SS1/23 expects firms to monitor model performance and behaviour on an ongoing basis, with trigger-based re-validation when material changes occur. A capability change is a material change. AG-346's trigger-based reclassification process directly aligns with the supervisory expectation for ongoing monitoring with event-driven re-assessment.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Deployment-wide — potentially organisation-wide if the same model family is deployed across multiple services |
Consequence chain: Failure to reclassify when capabilities change creates a growing gap between the model's actual capability and the governance controls constraining it. The gap widens with each undetected capability change. When the gap manifests — through an unintended action, a data exposure, or a harmful output — the consequences are proportional to the gap size. Scenario A's confidential document exposure occurred within 48 hours of a capability change. Scenario C's £1.4 million client loss resulted from a fine-tuning-induced capability change that went undetected for weeks. The common factor is that governance controls calibrated for one capability profile were applied to a different, more capable model. The regulatory consequence is particularly severe: operating a high-risk AI system under low-risk governance is a direct violation of the EU AI Act's classification requirements. The organisation cannot claim ignorance — AG-346 establishes that monitoring for capability changes is an ongoing obligation, and failure to monitor is itself a governance failure.
Cross-references: AG-048 (AI Model Provenance and Integrity) tracks model versions and provides the version awareness necessary for reclassification triggers. AG-342 (Adapter Composition Approval Governance) addresses emergent capabilities from adapter composition. AG-341 (Fine-Tune Objective Documentation Governance) addresses capability changes from fine-tuning. AG-024 (Authorised Learning Governance) governs the authorisation of changes that may alter capabilities. AG-339 through AG-348 form the sibling landscape for Model Provenance, Training & Adaptation.