Embedding Model Migration Governance requires that when an organisation changes, upgrades, or fine-tunes the embedding model used to vectorise its knowledge base, the migration is governed through a structured process that ensures retrieval quality is maintained, compatibility issues are detected, and embedding drift is managed. Without this control, embedding model changes silently degrade retrieval quality as new embeddings become semantically incompatible with legacy embeddings, leading to missed retrievals, incorrect relevance rankings, and knowledge base fragmentation. This dimension ensures that embedding model changes are treated as governed infrastructure changes, not routine updates.
Scenario A -- Silent Retrieval Degradation After Model Upgrade: An organisation upgrades its embedding model from a general-purpose model (384 dimensions) to a newer, higher-performance model (768 dimensions). The upgrade is applied to new documents ingested after the change, but the 250,000 existing documents retain their 384-dimension embeddings. The retrieval system now searches a vector space containing two incompatible embedding types. Queries embedded with the new model produce low similarity scores against legacy documents because the vector spaces are not aligned. Over 3 weeks, users report that the agent "forgot" information that was previously accessible. Investigation reveals that 250,000 legacy documents have effectively become invisible to retrieval because cross-model similarity scores fall below the retrieval threshold.
What went wrong: The embedding model was changed without re-embedding existing documents. Incompatible embeddings coexisted in the same vector space. No compatibility check detected the retrieval degradation. Consequence: 250,000 documents effectively lost from retrieval for 3 weeks, 847 incorrect agent responses traced to missing legacy knowledge, £45,000 in investigation and remediation costs, user trust erosion.
Scenario B -- Semantic Drift After Fine-Tuning: An organisation fine-tunes its embedding model on domain-specific financial terminology to improve retrieval precision for financial queries. The fine-tuning shifts the model's semantic space: terms like "derivative" now cluster with financial instruments rather than calculus. The existing knowledge base includes both financial and engineering content. After fine-tuning, engineering documents about mathematical derivatives are no longer retrievable for engineering queries because their embeddings were computed with the pre-fine-tuning model and "derivative" in those embeddings sits in a different region of the semantic space. Engineering queries retrieve financial content instead.
What went wrong: Fine-tuning shifted the embedding model's semantic space without re-embedding existing content. Cross-domain retrieval quality degraded because the semantic alignment between old and new embeddings changed. No impact assessment evaluated cross-domain effects before deployment. Consequence: Engineering team unable to retrieve technical documentation for 10 days, 23 engineering decisions made without proper technical reference, one design error costing £32,000 in rework.
Scenario C -- Vendor Lock-in Through Embedding Dependency: An organisation uses a proprietary embedding model from a vendor. The vendor deprecates the model with 6 months' notice. The organisation's knowledge base contains 1.2 million documents embedded with the deprecated model. The replacement model uses a different architecture and produces embeddings in a different vector space. The organisation must re-embed all 1.2 million documents before the deprecation deadline. At 200 documents per minute, re-embedding takes approximately 100 hours of continuous processing. The organisation's infrastructure cannot support this within the deprecation window without significant additional compute expenditure (estimated £28,000 in cloud compute costs).
What went wrong: The organisation had no migration plan, no re-embedding budget, and no assessment of the re-embedding timeline. The dependency on a specific embedding model was not governed as an infrastructure dependency. Consequence: Emergency procurement of cloud compute (£28,000), 2-week project to re-embed and validate, potential retrieval degradation during the migration period, diversion of engineering resources from other priorities.
Scope: This dimension applies to every AI agent that uses embedding models to vectorise knowledge base content for retrieval. This includes vector databases, semantic search systems, and any RAG implementation that relies on embedding similarity for retrieval. The scope extends to all embedding model changes: upgrades to newer model versions, migration to different model providers, fine-tuning on domain-specific data, changes in embedding dimensions, and changes in tokenisation. The scope includes both the knowledge base embeddings (the stored vectors) and the query embeddings (the vectors computed at query time). The test is: does the agent's retrieval system depend on embedding models? If yes, any change to those models is within scope.
4.1. A conforming system MUST maintain a registry of all embedding models in use, including: model identifier, version, provider, dimensionality, the date the model was deployed, and the scope of content embedded with each model.
4.2. A conforming system MUST require a governed change process for any embedding model change, including impact assessment, compatibility verification, and rollback capability.
4.3. A conforming system MUST ensure embedding compatibility when model changes occur, either by re-embedding all existing content with the new model or by implementing a compatibility layer that aligns cross-model retrievals.
4.2. A conforming system MUST require a governed change process for any embedding model change, including impact assessment, compatibility verification, and rollback capability.
4.5. A conforming system MUST maintain the ability to roll back to the previous embedding model and its associated embeddings if quality verification fails.
4.6. A conforming system SHOULD implement a re-embedding pipeline capable of processing the full knowledge base within a defined time window (e.g., 48 hours for knowledge bases up to 1 million documents).
4.7. A conforming system SHOULD maintain an embedding compatibility matrix documenting which embedding models are compatible (can coexist in the same vector space with acceptable retrieval quality) and which are incompatible.
4.8. A conforming system SHOULD implement progressive migration, where re-embedding occurs in priority order (critical content first, archival content last) to minimise the period of degraded retrieval.
4.9. A conforming system MAY implement embedding versioning that stores multiple embedding versions per document, enabling parallel retrieval across model generations during the migration period.
Embedding models are the foundation of vector-based retrieval. Every document in the knowledge base is represented as a high-dimensional vector computed by the embedding model. Retrieval works by comparing the query vector (also computed by the embedding model) against document vectors using similarity metrics (typically cosine similarity). This architecture has a critical dependency: the query and document vectors must be in the same semantic space for similarity comparisons to be meaningful.
When the embedding model changes, the semantic space changes. A new model maps the same text to different vectors. If the query is embedded with Model B but the documents are embedded with Model A, the similarity scores become unreliable. In practice, cross-model similarity scores typically degrade by 20-40% compared to same-model scores, depending on the architectural distance between the models. For documents near the retrieval threshold, this degradation pushes them below the threshold, making them invisible to retrieval (Scenario A).
Fine-tuning introduces a subtler problem: the model's semantic space shifts selectively. Terms that the fine-tuning emphasised move in the space; terms that were not in the fine-tuning data remain approximately in place. This creates uneven retrieval quality across domains: the fine-tuned domain improves, other domains may degrade (Scenario B).
The operational consequence is significant. A knowledge base with 1 million documents represents a substantial investment in embedding computation. Re-embedding is computationally expensive (typically 0.5-2 seconds per document for production models), requiring significant compute resources and time. Without a governed migration process, organisations face a choice between living with degraded retrieval (unacceptable for production systems) and executing an unplanned, untested re-embedding (risky and expensive).
The embedding model registry and change governance process ensure that model changes are planned, impact-assessed, and verified. The re-embedding pipeline and compatibility matrix provide the operational capabilities needed to execute migrations safely. The rollback capability provides a safety net when quality verification fails.
Embedding model migration governance requires capabilities at three levels: registry and planning (knowing what models are in use and planning changes), execution (re-embedding and compatibility management), and verification (confirming that retrieval quality is maintained).
Recommended Patterns:
{model_id, model_version, provider, dimensions, deployed_at, deprecated_at, document_count, collection_scope}. Example entry: {model_id: "text-embedding-v3", version: "3.1.2", provider: "vendor-X", dimensions: 768, deployed_at: "2026-01-15", deprecated_at: null, document_count: 450000, collection_scope: "all_collections"}. The registry is the authoritative record of which model produced which embeddings. Every embedding stored in the vector database should carry metadata linking it to the model version that produced it.Anti-Patterns to Avoid:
Financial Services. Embedding model changes should be governed under the firm's model risk management framework (SR 11-7 or equivalent). The benchmark test suite should include financial terminology and regulatory concepts. MiFID II record-keeping requirements mean that the ability to retrieve historical documents must be maintained across model migrations.
Healthcare. Clinical terminology sensitivity requires that embedding model changes be validated against clinical query patterns. A model change that degrades retrieval for medical terminology could affect clinical decision support quality. The benchmark suite should include clinical queries with known-relevant clinical evidence.
Legal. Legal terminology is domain-specific and often counter-intuitive in embedding space (e.g., "consideration" in contract law versus general English). The benchmark suite should include legal domain queries. Fine-tuning for legal terminology should be impact-assessed for effects on non-legal content.
Basic Implementation -- An embedding model registry exists documenting all models in use. Any model change requires a change request and approval. A benchmark test suite of at least 200 queries with ground truth is maintained. Quality verification runs before and after model changes. Rollback to the previous model is possible within 24 hours. Re-embedding is performed as a single batch operation. This meets minimum mandatory requirements.
Intermediate Implementation -- All basic capabilities plus: progressive re-embedding processes critical content first. Dual-model retrieval maintains access during migration. Embedding compatibility matrix documents cross-model compatibility. Re-embedding pipeline can process 1 million documents within 48 hours. Automated quality monitoring detects retrieval degradation within 6 hours of deployment. Model changes are versioned and auditable.
Advanced Implementation -- All intermediate capabilities plus: embedding versioning stores multiple embedding generations per document. Predictive impact assessment estimates retrieval quality impact before migration using a representative sample. The migration pipeline has been independently tested for completeness and rollback reliability. Zero-downtime migration is achieved through progressive re-embedding with dual-model retrieval. The organisation can demonstrate to auditors the complete embedding history of any document.
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Cross-Model Retrieval Degradation Detection
Test 8.2: Benchmark Quality Gate Enforcement
Test 8.3: Rollback Execution
Test 8.4: Progressive Re-Embedding Priority
Test 8.5: Dual-Model Retrieval During Migration
Test 8.6: Registry Accuracy
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| EU AI Act | Article 15 (Accuracy, Robustness, Cybersecurity) | Direct requirement |
| EU AI Act | Article 12 (Record-Keeping) | Supports compliance |
| NIST AI RMF | MANAGE 2.2, MANAGE 4.1 | Supports compliance |
| ISO 42001 | Clause 6.1 (Actions to Address Risks), Clause 8.1 (Operational Planning and Control) | Supports compliance |
| DORA | Article 9 (ICT Risk Management Framework) | Supports compliance |
Article 15 requires high-risk AI systems to maintain appropriate levels of accuracy and robustness. An unmanaged embedding model change can silently degrade retrieval accuracy, which in turn degrades the accuracy of all agent outputs. AG-337 directly supports accuracy maintenance by ensuring that model changes are impact-assessed and quality-verified before deployment. The rollback capability supports robustness by providing recovery from quality failures.
Article 12 requires record-keeping for traceability. The embedding model registry and migration change records provide the audit trail for understanding which model produced which embeddings and when changes occurred. This traceability is essential for investigating retrieval quality issues and demonstrating due diligence in model management.
Embedding model changes are a risk to retrieval quality and, by extension, to agent output quality. AG-337's governed change process is a risk management control.
MANAGE 2.2 addresses risk mitigation. MANAGE 4.1 addresses post-deployment monitoring. Embedding model governance mitigates the risk of retrieval degradation, and quality verification provides post-deployment monitoring of retrieval quality.
Clause 6.1 requires actions to address risks. Clause 8.1 requires operational planning and control. Embedding model migration is an operational process that must be planned and controlled to avoid retrieval quality risks.
Article 9 requires financial entities to maintain an ICT risk management framework. Embedding models are ICT components whose changes must be governed within the risk management framework to prevent service degradation.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Organisation-wide -- affects all retrieval across the entire knowledge base |
Consequence chain: Without embedding model migration governance, model changes silently degrade retrieval quality across the entire knowledge base. The immediate failure is invisible document loss: documents embedded with an incompatible model become unretrievable despite being present in the database (Scenario A -- 250,000 documents invisible for 3 weeks, 847 incorrect responses). The secondary failure is cross-domain degradation from fine-tuning (Scenario B -- engineering team unable to retrieve technical documentation, £32,000 in rework). The operational failure is unplanned migration costs when vendor models are deprecated (Scenario C -- £28,000 emergency compute costs plus 2-week project diversion). The blast radius is organisation-wide because the embedding model underlies all retrieval across all collections. A single unmanaged model change can degrade every agent response that relies on the knowledge base.
Cross-references: AG-040 (Persistent Memory Governance) provides the foundational framework for the knowledge base that embeddings represent. AG-082 (Data Minimisation Enforcement) reduces the volume of content requiring re-embedding. AG-122 (Knowledge Integrity Verification) verifies the integrity of knowledge that must be preserved across model migrations. AG-132 (Memory Scope Boundary Enforcement) defines the scope boundaries that apply to re-embedding prioritisation. AG-179 (Memory Audit Trail Governance) captures migration events in the audit trail. AG-333 (Retrieved Evidence Confidence Governance) may need threshold recalibration after model changes. AG-336 (Knowledge Freshness Attestation Governance) is affected because re-embedding resets the embedding timestamp but does not reset content freshness. AG-338 (Retrieval Poisoning Quarantine Governance) should be re-evaluated after model changes as poisoning vectors may differ between models.