Open versus Closed Weight Exposure Governance requires that organisations formally assess, document, and manage the distinct risk profiles created by deploying AI agents built on open-weight models (where the model weights are publicly available), closed-weight models (where weights are held exclusively by the provider and accessed only through APIs), or hybrid strategies combining both. The choice between open and closed weights is not merely a technical or commercial decision: it fundamentally shapes the organisation's security posture, compliance obligations, supply chain dependencies, customisation capabilities, and exposure to adversarial attack. Open weights enable customisation and reduce vendor dependency but expose the model to adversarial analysis; closed weights provide custody security but create supply chain dependency and limit auditability. AG-348 ensures that organisations make this choice with a full understanding of the governance implications and implement controls appropriate to the chosen strategy.
Scenario A — Open Weights Enable Adversarial Safety Bypass: An organisation deploys a customer-facing agent using an open-weight model that has been fine-tuned for its specific domain. The model weights are publicly available (they were released under an open licence). An adversary downloads the same base weights, analyses the model's safety alignment layer by examining weight patterns, and develops a targeted jailbreaking technique that exploits a specific weakness in the alignment implementation. The technique is published online. Within 48 hours, users of the organisation's agent begin employing the technique, and the agent's safety refusal rate for targeted attacks drops from 94.7% to 31.2%. Because the technique targets the base model's alignment structure — which the organisation's fine-tuning did not modify — no amount of fine-tuning can address the vulnerability without changing the base model.
What went wrong: The organisation deployed an open-weight model without assessing the adversarial exposure created by public weight availability. The risk assessment did not consider that adversaries with access to the same weights could develop targeted attacks through white-box analysis. No compensating controls (e.g., enhanced output filtering, agent monitoring) were implemented to mitigate the elevated adversarial risk of open-weight deployment. Consequence: Safety degradation affecting all users, emergency deployment of compensating controls (£165,000), temporary service suspension for 72 hours during remediation, and reputational damage.
Scenario B — Closed Weight Provider Discontinues Model: A financial services firm builds its entire AI infrastructure on a closed-weight model from Provider X, accessed through API. The firm invests £2.8 million over 18 months in fine-tuning (through the provider's fine-tuning API), evaluation infrastructure calibrated to the model's characteristics, and downstream system integration. Provider X announces that the model version will be deprecated in 90 days, with successor model performance "generally improved" but with different output characteristics. The firm discovers that: fine-tuning investments are not portable to the successor (the provider does not offer weight migration), evaluation benchmarks must be recalibrated, and downstream parsing systems must be updated for the new output format. The 90-day migration timeline is insufficient for a regulated financial services deployment.
What went wrong: The organisation built critical infrastructure on a closed-weight model without assessing vendor dependency risk. No contractual protections addressed model discontinuation. No contingency plan existed for provider-initiated changes. The organisation's £2.8 million in model-specific investments was hostage to the provider's product decisions. Consequence: Emergency migration programme (£1.4 million), regulatory notification to the PRA of material operational risk, 6-month degraded capability during migration, and strategic review of AI vendor strategy.
Scenario C — Hybrid Strategy Creates Governance Gaps: An organisation uses a closed-weight model for customer-facing operations (prioritising custody security) and an open-weight model for internal analytics (prioritising customisation). The governance framework treats them as separate deployments with separate controls. However, the internal analytics model is fine-tuned on the same customer data used by the customer-facing model. When the open-weight internal model is compromised (an attacker gains access to the fine-tuned weights), the customer data influence in the weights is exposed — data that the organisation believed was protected by the closed-weight provider's custody controls. The breach notification analysis reveals that the data exposure through the open-weight model's fine-tuned weights constitutes a personal data breach under GDPR.
What went wrong: The hybrid strategy was not assessed as an integrated system. The closed-weight model's data protection was undermined by the open-weight model's vulnerability when both were fine-tuned on the same data. No governance process evaluated the cross-model data exposure risk. Consequence: GDPR breach notification, potential €6.3 million fine (3% of turnover), remediation of the hybrid strategy, and customer notification obligation.
Scope: This dimension applies to any organisation deploying AI agents, regardless of whether the models are open-weight, closed-weight, or a hybrid combination. The governance requirements apply to the decision process (choosing the weight exposure strategy), the deployment (implementing appropriate controls for the chosen strategy), and the ongoing management (monitoring for risks specific to the chosen strategy). For open-weight models, the scope includes models downloaded from public repositories, models received from partners, and models released under open licences. For closed-weight models, the scope includes models accessed through APIs, models accessed through managed inference services, and models where the provider retains exclusive custody of the weights. Hybrid strategies — using both open and closed models within the same organisation — are explicitly in scope and require assessment of cross-model risk interactions.
4.1. A conforming system MUST conduct and document a weight exposure risk assessment before deploying any AI model, evaluating: the adversarial risk profile (how weight availability affects the model's vulnerability to targeted attacks), the supply chain dependency risk (how weight custody affects the organisation's control over model availability and evolution), the customisation implications (how weight access affects the organisation's ability to adapt the model), the compliance implications (how weight exposure affects regulatory obligations including data protection and audit rights), and the data exposure risk (how weight availability or custody arrangement affects the confidentiality of data encoded in the weights through training or fine-tuning).
4.2. A conforming system MUST implement controls specific to the chosen weight exposure strategy: for open-weight deployments, enhanced adversarial defences, weight integrity monitoring, and fine-tuned weight custody (per AG-339); for closed-weight deployments, vendor dependency management, contractual protections, and audit rights.
4.3. A conforming system MUST assess the data exposure risk of open-weight models that have been fine-tuned on confidential or personal data, including the risk that training data can be extracted or inferred from publicly accessible or compromised weights.
4.4. A conforming system MUST establish contractual protections for closed-weight deployments covering: model availability commitments, deprecation notification periods, data processing terms, audit rights, and exit provisions.
4.5. A conforming system MUST assess cross-model risk interactions in hybrid strategies where both open-weight and closed-weight models are used, particularly when they share training data, serve the same users, or interact with the same downstream systems.
4.6. A conforming system SHOULD develop and maintain a vendor exit strategy for each closed-weight model dependency, documenting how the organisation would transition to an alternative if the provider changes terms, increases prices, or discontinues the model.
4.7. A conforming system SHOULD implement enhanced output monitoring for open-weight deployments to compensate for the elevated adversarial risk, including detection of known attack patterns and anomalous output distributions.
4.8. A conforming system SHOULD assess whether open-weight models used for fine-tuning on sensitive data require additional weight custody controls beyond those applied to the base weights, given that fine-tuned weights may encode sensitive information.
4.9. A conforming system MAY implement differential privacy or other privacy-preserving techniques during fine-tuning of open-weight models on sensitive data to reduce the risk of data extraction from the resulting weights.
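The privacy-preserving option in requirement 4.9 can be illustrated with a minimal DP-SGD-style update: clip each record's gradient so no single record dominates the batch, then add Gaussian noise calibrated to the clipping bound before averaging. This is a schematic sketch in plain NumPy, not a production mechanism; the clip norm and noise multiplier are illustrative assumptions, and a real deployment would also track the cumulative privacy budget with a privacy accountant.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One differentially private gradient step (DP-SGD sketch).

    per_example_grads: array of shape (batch, dim), one gradient per record.
    Each gradient is clipped to clip_norm, then Gaussian noise scaled to the
    clipping bound is added to the batch sum before averaging.
    """
    rng = rng or np.random.default_rng(0)
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale  # bound each record's influence
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=clipped.shape[1])
    return (clipped.sum(axis=0) + noise) / len(clipped)

# First record's gradient (norm 5.0) is clipped to norm 1.0; the second passes through.
update = dp_sgd_step(np.array([[3.0, 4.0], [0.3, 0.4]]))
```

The clipping step is what limits how much any one customer record can be encoded into the fine-tuned weights; the noise then masks the residual per-record signal that extraction attacks exploit.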
The choice between open-weight and closed-weight models is one of the most consequential architectural decisions in AI deployment, yet it is often made on the basis of technical convenience or cost rather than governance analysis. Each strategy carries a fundamentally different risk profile, and the governance controls must match the specific risks of the chosen strategy.
Open-weight models offer significant advantages: complete customisation control, no vendor dependency, full auditability of the model artefact, and the ability to deploy on any infrastructure. But they carry a unique risk: adversaries have the same access to the model's internals as the deploying organisation. This enables white-box attacks — adversaries can examine the model's weights, identify weaknesses in safety alignment, and develop targeted attacks that are far more effective than black-box attacks against closed models. Research consistently shows that white-box attacks succeed at rates 2-4x higher than equivalent black-box attacks. An organisation deploying an open-weight model must account for this elevated adversarial exposure.
Closed-weight models offer custody security — the weights are held by the provider and cannot be directly analysed by adversaries. But they create supply chain dependency that carries its own risks: the provider can change pricing, deprecate models, modify terms of service, or experience outages that the deploying organisation cannot control. The organisation's investment in fine-tuning, evaluation, and integration is locked into the provider's ecosystem. If the provider makes an adverse change, the organisation faces costly migration or acceptance of the change.
The hybrid strategy — often adopted to "get the best of both worlds" — introduces a third risk category: cross-model interactions. When open-weight and closed-weight models share training data, the data protection provided by the closed-weight custody can be undermined by the open-weight model's exposure, as Scenario C illustrates. Hybrid strategies require governance of the interactions, not just the individual deployments.
The data exposure dimension is increasingly important. Fine-tuning encodes information from the training data into the model weights. For open-weight models, this creates a data leakage pathway: if the organisation fine-tunes an open-weight model on confidential customer data, the resulting weights (which the organisation controls) encode information about that data. If those fine-tuned weights are inadvertently exposed, compromised, or insufficiently protected, the training data may be partially recoverable through extraction attacks. Research has demonstrated successful extraction of training data from model weights, making this a practical rather than theoretical concern.
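The extraction risk described above is commonly tested with membership-inference checks: records seen during fine-tuning tend to show anomalously low loss compared with held-out data. A minimal loss-threshold sketch follows; the percentile cutoff is an illustrative assumption, not a calibrated attack, and real assessments would use stronger statistical tests.

```python
def flag_likely_members(candidate_losses, holdout_losses, percentile=5):
    """Flag candidate records whose model loss is suspiciously low.

    A record whose loss falls below the low tail of the held-out loss
    distribution was plausibly memorised during fine-tuning -- the signal
    exploited by loss-threshold membership-inference attacks.
    Returns the indices of flagged candidates.
    """
    cut_index = max(0, int(len(holdout_losses) * percentile / 100) - 1)
    cutoff = sorted(holdout_losses)[cut_index]
    return [i for i, loss in enumerate(candidate_losses) if loss < cutoff]
```

Run periodically against fine-tuned open weights, a check like this gives the organisation early evidence of whether the weights leak training data before an adversary demonstrates it.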
Weight exposure risk assessment framework. Establish a structured framework for evaluating weight exposure decisions. The framework should assess, for each deployment, the five dimensions enumerated in requirement 4.1: the adversarial risk profile, the supply chain dependency risk, the customisation implications, the compliance implications, and the data exposure risk.
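One way to operationalise the framework is to represent each assessment as a structured record and withhold deployment sign-off while any dimension is blank, which is also what Test 8.1 would verify. The field names below are illustrative assumptions, not mandated by this document.

```python
from dataclasses import dataclass

@dataclass
class WeightExposureAssessment:
    """Risk assessment record covering the five dimensions in requirement 4.1."""
    model_id: str
    strategy: str                    # "open", "closed", or "hybrid"
    adversarial_risk: str            # white-box attack exposure
    supply_chain_risk: str           # vendor dependency, availability, evolution
    customisation_implications: str  # ability to adapt the model
    compliance_implications: str     # data protection, audit rights
    data_exposure_risk: str          # data encoded in weights via (fine-)tuning

    def missing_dimensions(self):
        """Return the names of any assessment dimensions left blank."""
        dims = ("adversarial_risk", "supply_chain_risk",
                "customisation_implications", "compliance_implications",
                "data_exposure_risk")
        return [d for d in dims if not getattr(self, d).strip()]
```

A deployment gate can then be as simple as refusing any model whose assessment record reports a non-empty `missing_dimensions()` list.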
Open-weight deployment controls. When deploying open-weight models: implement enhanced output monitoring (anomaly detection, known-attack-pattern matching), maintain weight integrity verification (per AG-339) to detect tampering, implement compensating controls for elevated adversarial risk (additional output filters, rate limiting on suspicious patterns, agent monitoring), and if fine-tuning on sensitive data, assess and mitigate data extraction risk (differential privacy, access controls on fine-tuned weights, or avoiding open-weight models for sensitive-data fine-tuning).
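The refusal-rate component of that monitoring can be sketched as a rolling-window check; in Scenario A such a check would have flagged the slide from 94.7% towards 31.2% within a few hundred requests rather than after widespread exploitation. The window size and the 0.8 alert threshold are illustrative assumptions to be calibrated against the deployment's own baseline.

```python
from collections import deque

class RefusalRateMonitor:
    """Rolling refusal-rate check over adversarially flagged prompts.

    Alerts when the fraction of refused responses in the window drops below
    the baseline threshold -- a signal that a published jailbreak (as in
    Scenario A) is being exploited at scale.
    """
    def __init__(self, window=200, alert_below=0.8):
        self.events = deque(maxlen=window)
        self.alert_below = alert_below

    def record(self, refused: bool) -> bool:
        """Record one outcome; return True if the alert should fire."""
        self.events.append(refused)
        rate = sum(self.events) / len(self.events)
        # Only alert once the window is full, to avoid cold-start noise.
        return len(self.events) == self.events.maxlen and rate < self.alert_below
```

Such a detector does not remove the white-box vulnerability, but it shortens the gap between publication of an attack and activation of compensating controls.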
Closed-weight contractual framework. When deploying closed-weight models: negotiate contractual protections covering model availability (minimum service period and deprecation notice — recommend minimum 12 months notice for production-critical models), data processing (explicit GDPR-compliant DPA, training-on-inputs policy, data residency), audit rights (the right to audit or have independently audited the provider's data handling and model behaviour), pricing stability (protection against sudden price increases for production dependencies), and exit provisions (data portability, fine-tuning asset portability, transition assistance).
Hybrid strategy governance. When using both open and closed models: map data flows between models to identify shared training data, assess whether data protection measures for closed-weight models are undermined by open-weight model exposures, implement separate data governance tracks if open and closed models handle different data sensitivity levels, and periodically reassess the interaction risks as both deployment strategies evolve.
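The data-flow mapping step can be sketched as a check for datasets fine-tuned into both camps, which is exactly the condition that undermined the closed-weight custody assumption in Scenario C. The record format here is an illustrative assumption.

```python
def shared_data_exposures(deployments):
    """Find datasets fine-tuned into both open- and closed-weight models.

    deployments: iterable of (model_id, strategy, dataset_ids) tuples, where
    strategy is "open" or "closed". A dataset appearing under both strategies
    means the closed-weight provider's custody no longer bounds that data's
    exposure: a compromise of the open-weight copy leaks it anyway
    (the Scenario C failure mode).
    """
    strategies_by_dataset = {}
    for model_id, strategy, datasets in deployments:
        for ds in datasets:
            strategies_by_dataset.setdefault(ds, set()).add(strategy)
    return sorted(ds for ds, strategies in strategies_by_dataset.items()
                  if {"open", "closed"} <= strategies)

exposures = shared_data_exposures([
    ("cust-agent", "closed", ["customer_records"]),
    ("analytics", "open", ["customer_records", "web_logs"]),
])
```

Any dataset returned by this check needs a single, integrated protection decision rather than two per-model ones.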
Recommended patterns:
Anti-patterns to avoid:
Financial Services. PRA and FCA expectations for operational resilience extend to AI model dependencies. Critical closed-weight dependencies should be included in the firm's operational resilience framework, with impact tolerances and exit strategies. Open-weight models fine-tuned on financial data must have custody controls commensurate with the firm's data classification requirements.
Healthcare. Open-weight models fine-tuned on patient data create significant HIPAA/GDPR exposure. The potential for training data extraction from fine-tuned weights means that patient data encoded in weights must be protected as health data. Closed-weight models for clinical applications must have contractual protections that satisfy medical device regulatory requirements for auditability.
Government and Public Sector. Data sovereignty requirements may mandate open-weight deployment on government-controlled infrastructure (avoiding data transfer to third-party providers). Conversely, security classifications may restrict which models can be deployed on government networks based on the provenance and inspection status of the weights.
Basic Implementation — The organisation has made weight exposure decisions for each deployment but without a formal risk assessment framework. Open-weight and closed-weight deployments are governed by the same general controls without strategy-specific measures. Vendor contracts for closed-weight models may lack governance-specific provisions. Hybrid strategy risks are not assessed. This level reflects awareness but not deliberate governance.
Intermediate Implementation — A formal weight exposure risk assessment is conducted for each deployment. Open-weight deployments have enhanced adversarial controls. Closed-weight deployments have contractual protections covering availability, data processing, and exit provisions. Fine-tuned weights on sensitive data receive enhanced custody controls. Hybrid strategy cross-model risks are assessed. The organisation can explain and justify its weight exposure strategy for each deployment.
Advanced Implementation — All intermediate capabilities plus: risk-stratified weight strategy with documented rationale. Vendor exit strategies tested periodically. Differential privacy or equivalent applied to sensitive-data fine-tuning on open-weight models. Automated adversarial monitoring compensates for open-weight exposure. Cross-model data flow mapping updated regularly. The organisation can demonstrate to regulators a deliberate, documented, and continuously managed weight exposure governance programme.
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Weight Exposure Risk Assessment Completeness
Test 8.2: Open-Weight Adversarial Controls
Test 8.3: Closed-Weight Contractual Protections
Test 8.4: Fine-Tuned Weight Data Exposure Assessment
Test 8.5: Hybrid Strategy Cross-Model Assessment
Test 8.6: Vendor Exit Strategy Viability
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| EU AI Act | Article 15 (Accuracy, Robustness and Cybersecurity) | Direct requirement |
| EU AI Act | Article 53 (Transparency for GPAI Models — open-weight specific provisions) | Direct requirement |
| GDPR | Articles 25, 32 (Data Protection by Design, Security of Processing) | Direct requirement |
| DORA | Article 28 (Third-Party ICT Risk) | Direct requirement |
| PRA SS1/23 | Model Risk Management — Third-Party Model Risk | Supports compliance |
| NIST AI RMF | GOVERN 1.1, MAP 3.5, MANAGE 2.4 | Supports compliance |
Article 53 includes specific provisions for open-weight GPAI models, including modified transparency obligations. The distinction between open-weight and closed-weight models is embedded in the regulatory framework itself, making it essential that organisations understand and govern their weight exposure strategy. Open-weight models may benefit from certain exemptions but must still comply with safety and transparency requirements. AG-348 ensures that the organisation's governance framework reflects the specific obligations applicable to its chosen weight strategy.
Article 25 (Data Protection by Design) requires that data protection measures be integrated into processing activities from the design stage. For AI models, the weight exposure decision is a design-stage decision that affects data protection: fine-tuning open-weight models on personal data creates different data protection risks than fine-tuning through a closed-weight provider's API. Article 32 (Security of Processing) requires appropriate technical and organisational measures to ensure a level of security appropriate to the risk. The security measures for weights that encode personal data must reflect the exposure risk of those weights.
Article 28 requires financial entities to manage risk arising from ICT third-party service providers. Closed-weight model providers are ICT third-party providers. DORA's requirements for contractual provisions, exit strategies, and concentration risk management directly apply to closed-weight model dependencies. AG-348's requirements for contractual protections and vendor exit strategies implement DORA's third-party risk management provisions for AI model dependencies.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Organisation-wide — affects strategic AI infrastructure decisions and may impact all deployments using the chosen weight strategy |
Consequence chain: Weight exposure governance failures create strategic-level risks that affect the entire AI programme. For open-weight failures: adversarial exploitation of publicly analysable weights can compromise safety across all deployments using the same base model, as Scenario A illustrates (safety refusal drop from 94.7% to 31.2% through targeted white-box attack). The remediation requires either replacing the base model or implementing compensating controls, both of which affect every deployment. For closed-weight failures: vendor dependency creates existential risk for AI-dependent services, as Scenario B illustrates (£2.8 million in stranded investment when the provider deprecates the model, plus £1.4 million migration cost). The organisation is structurally unable to control its own AI infrastructure. For hybrid strategy failures: cross-model data exposure can compromise the protections assumed for the more secure strategy, as Scenario C illustrates (€6.3 million potential GDPR fine). The common theme is that weight exposure decisions made without governance analysis create risks that are difficult and expensive to remediate because they are embedded in the architecture, not in a single component.
Cross-references: AG-339 (Model Weight Custody Governance) provides the custody controls that are particularly important for open-weight deployments and fine-tuned weights. AG-048 (AI Model Provenance and Integrity) tracks model provenance regardless of weight exposure strategy. AG-340 (Training Corpus Rights Governance) covers the rights implications that differ between open-weight and closed-weight models. AG-150 (Feedback and Learning Poisoning Resistance) addresses adversarial attacks on the training process that may be facilitated by open-weight access. AG-339 through AG-348 form the sibling landscape for Model Provenance, Training & Adaptation.