Quantisation Risk Governance requires that organisations formally assess and document the safety, accuracy, and behavioural impacts of model quantisation — the process of reducing the numerical precision of model weights (e.g., from 32-bit floating-point to 8-bit, 4-bit, or lower integer representations) to reduce model size, memory footprint, and inference latency. Quantisation is not a neutral engineering optimisation: it systematically discards information from the model's learned parameters, and the effects of this information loss are uneven, domain-specific, and often subtle. A quantised model is a different model from its full-precision source, with different error profiles, different failure modes, and potentially degraded safety properties. AG-344 ensures that quantisation decisions are made with full understanding of these risks and that quantised models are independently evaluated before deployment.
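To ground the terminology, the sketch below (plain NumPy; the tensor values are hypothetical) illustrates the simplest form of the operation: symmetric per-tensor INT8 quantisation and the round-trip error it leaves behind. This is an illustrative sketch of the underlying mechanism, not any particular framework's implementation.

```python
import numpy as np

def quantise_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor INT8 quantisation: map floats onto [-127, 127]."""
    scale = float(np.max(np.abs(weights))) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantise(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights; the residual is the information lost."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=4096).astype(np.float32)  # hypothetical weight tensor
q, scale = quantise_int8(w)
err = np.abs(w - dequantise(q, scale))
print(f"max round-trip error: {err.max():.6f}, mean: {err.mean():.6f}")
```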
Scenario A — Quantisation Degrades Minority Language Performance: An organisation quantises a multilingual customer service model from FP16 to INT4 to reduce inference costs by 60%. Aggregate accuracy across supported languages drops from 93.7% to 91.2% — within the approved 3% degradation threshold. However, the degradation is unevenly distributed: English accuracy drops 0.8%, Spanish drops 1.4%, but Thai drops 11.3% and Vietnamese drops 9.7%. The minority language degradation is masked by aggregate metrics. The organisation serves 4,200 Thai-speaking and 3,100 Vietnamese-speaking customers per month, who now receive materially worse service. Complaints in these languages increase by 340% before the root cause is identified.
What went wrong: The quantisation assessment used aggregate metrics rather than disaggregated per-language evaluation. The 3% average degradation threshold hid a catastrophic 11.3% degradation in a specific language. No requirement existed to evaluate quantisation impact across demographic or language segments. Consequence: 340% increase in complaints from minority language speakers, potential discrimination claim under equality legislation, remediation cost of £120,000 for per-language evaluation framework and selective re-quantisation, and reputational damage in affected language communities.
Scenario B — Quantisation Introduces Numerical Instability in Financial Calculations: An agent that performs currency conversion and transaction amount calculations is quantised from FP32 to INT8 for edge deployment. At INT8 precision, rounding errors accumulate in multi-step calculations. A 5-step currency conversion (GBP to USD to EUR to JPY to CHF to GBP) produces a result that differs from the full-precision model by £0.47 per £1,000 (a 0.047% error). At the agent's daily transaction volume of £8.4 million, this produces a systematic £3,948 daily discrepancy. Over 90 days, £355,320 in cumulative errors are discovered during a quarterly reconciliation.
What went wrong: The quantisation assessment tested single-step operations but not multi-step numerical chains. The rounding error was below threshold for any individual step but accumulated across steps. No evaluation assessed cumulative numerical precision across multi-step calculations. Consequence: £355,320 in unreconciled transactions, potential FCA enforcement for inadequate systems and controls, and mandatory reversion to full-precision model for financial calculations (negating the cost savings that motivated quantisation).
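To illustrate the failure mode, here is a minimal sketch of a multi-step chain test. The conversion rates are hypothetical, and per-step rounding to two decimal places stands in for the model's reduced-precision arithmetic; the point is that a drift invisible at any single step becomes visible over the chain.

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Hypothetical rates for the 5-step chain in Scenario B.
RATES = [("GBP->USD", Decimal("1.2671")), ("USD->EUR", Decimal("0.9214")),
         ("EUR->JPY", Decimal("161.83")), ("JPY->CHF", Decimal("0.005912")),
         ("CHF->GBP", Decimal("1.1173"))]

def convert(amount: Decimal, round_each_step: bool) -> Decimal:
    """Run the chain; optionally round after every step, mimicking the
    per-step precision loss of a low-precision model."""
    for _, rate in RATES:
        amount *= rate
        if round_each_step:
            amount = amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)
    return amount

start = Decimal("1000.00")
exact = convert(start, round_each_step=False)
lossy = convert(start, round_each_step=True)
print(f"exact: {exact:.4f}  lossy: {lossy:.4f}  drift per £1,000: {exact - lossy:.4f}")
```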
Scenario C — Quantisation Weakens Safety Refusal: A safety-critical agent is quantised from FP16 to 4-bit using GPTQ. Post-quantisation evaluation on the standard safety benchmark shows a refusal rate of 96.1% versus the original 97.8% — within the approved 2% degradation threshold. However, the benchmark does not cover adversarial safety prompts. When tested with adversarial attacks (jailbreaking techniques, encoded instructions, multi-turn manipulation), the quantised model's refusal rate drops to 71.3% versus the original's 94.6%. The quantisation disproportionately affected the nuanced safety reasoning that handles adversarial cases, while preserving the straightforward refusal patterns that standard benchmarks test.
What went wrong: Safety evaluation of the quantised model used the same non-adversarial benchmarks as the full-precision model. Adversarial safety testing was not included in the quantisation assessment. The quantised model's degradation was concentrated in the hardest safety cases — exactly those most likely to be encountered in adversarial attacks. Consequence: Deployment of a quantised model with degraded adversarial safety, discovered during an independent red-team assessment 8 weeks after deployment.
Scope: This dimension applies to any model quantisation operation performed on a model intended for production or pre-production deployment, including: post-training quantisation (PTQ), quantisation-aware training (QAT), mixed-precision quantisation, and any other technique that reduces the numerical precision of model parameters. It covers quantisation to all target precisions (INT8, INT4, INT2, binary, and custom formats). It applies regardless of whether quantisation is performed in-house or by a third-party tool, framework, or service. Models received from third parties in quantised form are in scope — the deploying organisation must obtain the quantisation assessment or conduct its own evaluation.
4.1. A conforming system MUST perform a documented quantisation risk assessment before deploying any quantised model, evaluating: accuracy impact (disaggregated by relevant segments), safety property impact (including adversarial scenarios), numerical precision impact (for models performing calculations), and behavioural consistency with the full-precision source model.
4.2. A conforming system MUST evaluate quantised models using the same evaluation suite applied to the full-precision model, plus additional tests specific to quantisation artefacts (numerical precision, edge-case sensitivity, and representation collapse).
4.3. A conforming system MUST disaggregate quantisation impact assessment across relevant segments — languages, demographic groups, task categories, and input complexity levels — rather than relying solely on aggregate metrics.
4.4. A conforming system MUST record the quantisation configuration (method, target precision, calibration dataset, framework, and any quantisation-specific hyperparameters) alongside the quantisation risk assessment.
4.5. A conforming system MUST treat a quantised model as a distinct model artefact requiring independent deployment approval, not as a variant of the full-precision model that inherits its approval.
4.6. A conforming system SHOULD include adversarial safety testing in the quantisation evaluation, specifically testing whether quantisation degrades the model's resilience to adversarial attacks.
4.7. A conforming system SHOULD evaluate quantisation impact on tail performance — the hardest 5% of cases in each category — as quantisation disproportionately affects edge cases.
4.8. A conforming system SHOULD maintain a quantisation-precision mapping that records which precision level is deployed in each environment.
4.9. A conforming system MAY implement automated quantisation regression testing that runs whenever a new quantisation is performed, comparing results against the full-precision baseline and prior quantised versions.
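As one possible shape for the automated regression testing in 4.9, the sketch below assumes a hypothetical `evaluate` hook that returns per-segment scores for a named model artefact. The 5% threshold and all names are illustrative, not mandated by this dimension.

```python
from typing import Callable, Mapping

def quantisation_regression(
    evaluate: Callable[[str], Mapping[str, float]],  # hypothetical evaluation hook
    baseline_id: str,                 # full-precision baseline artefact
    prior_ids: list[str],             # previously approved quantised versions
    new_id: str,                      # the newly produced quantisation
    max_relative_drop: float = 0.05,  # illustrative threshold
) -> list[str]:
    """Compare a new quantisation against the full-precision baseline and all
    prior quantised versions, flagging per-segment relative degradation."""
    new_scores = evaluate(new_id)
    findings = []
    for ref_id in [baseline_id, *prior_ids]:
        for segment, ref_score in evaluate(ref_id).items():
            drop = (ref_score - new_scores.get(segment, 0.0)) / ref_score
            if drop > max_relative_drop:
                findings.append(f"vs {ref_id}, {segment}: {drop:.1%} relative drop")
    return findings
```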
Quantisation is the most common model compression technique because it offers substantial efficiency gains (2-8x memory reduction, 2-4x inference speedup) with what appears to be minimal quality impact on standard benchmarks. This apparent minimal impact is misleading, and the illusion of low-risk compression is the core danger that AG-344 addresses.
The fundamental problem is that quantisation is a lossy operation that affects the model unevenly. A model is not a homogeneous mass of parameters where removing precision affects all capabilities equally. Different capabilities depend on different parameter subsets, and those subsets have different sensitivity to precision reduction. Task-specific accuracy on common inputs is typically robust to quantisation because it relies on well-reinforced patterns. Safety reasoning, minority-language capability, numerical precision, and adversarial robustness are typically fragile under quantisation because they rely on subtle parameter values that are disproportionately affected by rounding.
The disaggregation problem is critical. A quantised model that shows 2% average accuracy degradation may show 0.5% degradation for the majority demographic and 15% degradation for a minority demographic. The average is misleading; the disaggregated picture reveals discrimination that the average obscures. This is not a hypothetical risk — it has been documented across multiple model families and quantisation methods.
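The arithmetic behind the masking is worth making explicit. Assuming an illustrative 90/10 traffic split (the split itself is hypothetical), the figures quoted above combine as follows:

```python
# Illustrative traffic-weighted average using the figures quoted above.
segments = {"majority": (0.90, 0.005),   # 90% of traffic, 0.5% degradation
            "minority": (0.10, 0.150)}   # 10% of traffic, 15% degradation
average = sum(share * deg for share, deg in segments.values())
print(f"average degradation: {average:.2%}")  # 1.95% -- under a 2% aggregate threshold
```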
The safety problem is equally critical. Research consistently shows that quantisation degrades adversarial robustness more than standard benchmark performance. A model may retain 98% of its standard safety refusal rate while losing 25% of its adversarial safety refusal rate. Standard benchmarks do not catch this because they test straightforward cases. The organisation believes the quantised model is safe because the benchmark says so, while the model's actual safety profile against determined adversaries is materially degraded.
Quantisation risk assessment framework. Establish a standard framework for evaluating quantisation impact. The framework should evaluate four dimensions: task accuracy (disaggregated), safety properties (standard and adversarial), numerical precision (for calculation-performing models), and behavioural consistency (whether the quantised model's responses are substantively equivalent to the full-precision model's on a reference set).
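As a sketch only, the framework's four dimensions could be captured in a single assessment record along the following lines; the field names and types are illustrative, not a mandated schema.

```python
from dataclasses import dataclass, field

@dataclass
class QuantisationRiskAssessment:
    """Illustrative record covering the four dimensions named above."""
    model_id: str
    quantised_from: str                 # full-precision source artefact
    task_accuracy: dict[str, float]     # per-segment, disaggregated
    safety_standard: float              # standard-benchmark refusal rate
    safety_adversarial: float           # adversarial refusal rate
    numerical_max_drift: float | None   # None if the model performs no calculations
    behavioural_consistency: float      # agreement rate on a reference set
    approved: bool = False
    notes: list[str] = field(default_factory=list)
```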
Disaggregated evaluation. For every quantisation, run the evaluation suite with disaggregation across all relevant segments. For a multilingual model, disaggregate by language. For a customer-facing model, disaggregate by input complexity, topic category, and (if available) demographic segment. For a financial model, disaggregate by transaction type, currency, and value range. The disaggregation must be defined before quantisation, not selected post-hoc based on results.
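A minimal sketch of the pattern follows, assuming a hypothetical `score` hook; the key property is that the segment list is frozen before quantisation runs, so results cannot steer segment selection.

```python
# Segments declared up front, before quantisation -- never chosen from results.
SEGMENTS = ("en", "es", "th", "vi")  # e.g. the languages from Scenario A

def disaggregated_report(score, baseline_model, quantised_model):
    """Per-segment comparison; `score` is a hypothetical hook returning
    accuracy for a model on one segment's held-out cases."""
    report = {}
    for seg in SEGMENTS:
        base, quant = score(baseline_model, seg), score(quantised_model, seg)
        report[seg] = {"baseline": base, "quantised": quant, "drop": base - quant}
    return report
```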
Adversarial safety testing. Include adversarial safety tests in the quantisation evaluation suite. These should include: jailbreaking attacks (multi-turn, encoded, role-play), prompt injection (instruction overrides in user input), and stress testing (extremely long inputs, unusual Unicode, edge-case formatting). Compare adversarial safety scores between the quantised and full-precision models. A threshold of no more than 5% relative degradation in adversarial safety refusal rate is recommended as a starting point.
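The recommended threshold translates directly into a check. The sketch below applies it to Scenario C's figures; the function and its names are illustrative.

```python
def adversarial_safety_ok(refusal_fp: float, refusal_q: float,
                          max_relative_drop: float = 0.05) -> bool:
    """No more than 5% relative degradation in adversarial refusal rate,
    per the recommended starting threshold above."""
    return (refusal_fp - refusal_q) / refusal_fp <= max_relative_drop

# Scenario C: 94.6% -> 71.3% is a ~24.6% relative drop, a clear fail.
print(adversarial_safety_ok(0.946, 0.713))  # False
```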
Recommended patterns:
- Define disaggregation segments and degradation thresholds before quantisation, so results cannot steer segment selection.
- Apply a tiered precision policy that sets a minimum precision per deployment context (for example, FP16 or higher for financial calculations).
- Re-run the full-precision model's evaluation suite on every quantised artefact, extended with quantisation-specific tests.
- Govern calibration datasets as model artefacts, recorded alongside the quantisation configuration.
- Treat each quantised model as a distinct artefact requiring its own deployment approval.
Anti-patterns to avoid:
- Approving a quantised model on aggregate metrics alone; averages mask segment-level degradation (Scenario A).
- Testing single-step numerical operations without multi-step chains, missing cumulative rounding error (Scenario B).
- Reusing only standard, non-adversarial safety benchmarks for the quantised model (Scenario C).
- Letting a quantised model inherit the full-precision model's deployment approval.
- Selecting disaggregation segments post-hoc, after results are known.
Financial Services. Models performing financial calculations should not be quantised below FP16 without extensive numerical precision testing. Multi-step calculations, currency conversions, and interest rate computations are particularly sensitive to rounding errors. PRA SS1/23 expectations for model accuracy extend to quantised variants.
Healthcare. Quantisation of clinical AI models must be assessed for impact on diagnostic accuracy across patient demographics. A quantised diagnostic model that performs well on common conditions but degrades on rare conditions could delay diagnosis of the patients who most need prompt identification.
Safety-Critical Systems. Quantisation of models used in safety-critical contexts (autonomous vehicles, industrial control) must undergo safety case analysis. The safety case for the full-precision model does not extend to the quantised model without evidence that safety-relevant capabilities are preserved.
Basic Implementation — Quantisation impact is assessed through aggregate benchmark comparison between the quantised and full-precision models. If the aggregate score is within an acceptable threshold, the quantised model is approved. No disaggregated evaluation, no adversarial safety testing, and no numerical precision assessment. This level catches gross quantisation failures but misses disproportionate impact, adversarial safety degradation, and numerical precision issues.
Intermediate Implementation — A standardised quantisation risk assessment evaluates accuracy (disaggregated by segment), safety (standard benchmarks), and numerical precision. A tiered precision policy defines minimum precision by deployment context. Quantised models receive independent deployment approval. The quantisation configuration and calibration dataset are documented. The organisation can produce a quantisation diff report for any deployed quantised model.
Advanced Implementation — All intermediate capabilities plus: adversarial safety testing is included in quantisation evaluation. Tail-performance analysis evaluates the hardest 5% of cases. Mixed-precision strategies are used to minimise quality impact. Calibration datasets are governed as model artefacts. Automated regression testing compares each quantised version against the full-precision baseline and prior quantised versions. The organisation can demonstrate to regulators that every deployed quantised model has been independently evaluated with disaggregated metrics, including adversarial safety and numerical precision.
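As an illustration of the tail-performance analysis described above, one simple approach ranks cases by a difficulty signal from the full-precision model (here, a hypothetical per-case loss) and restricts the comparison to the hardest 5%. This is a sketch under those assumptions, not a prescribed method.

```python
def tail_cases(case_losses: dict[str, float], fraction: float = 0.05) -> list[str]:
    """Return the IDs of the hardest `fraction` of cases, ranked by the
    full-precision model's per-case loss (a hypothetical difficulty signal)."""
    ranked = sorted(case_losses, key=case_losses.get, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k]

# Evaluate baseline and quantised models on tail_cases(...) only, then compare.
```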
Required artefacts:
- Quantisation risk assessment covering the four dimensions in 4.1.
- Quantisation configuration record (method, target precision, calibration dataset, framework, and quantisation-specific hyperparameters) per 4.4.
- Disaggregated evaluation results for each deployed quantised model.
- Independent deployment approval record for the quantised artefact (4.5).
- Quantisation-precision mapping of the precision level deployed in each environment (4.8).
Retention requirements:
Access requirements:
Test 8.1: Quantisation Risk Assessment Completeness
Test 8.2: Disaggregated Impact Assessment
Test 8.3: Safety Property Preservation
Test 8.4: Numerical Precision Verification
Test 8.5: Independent Deployment Approval
Test 8.6: Precision Policy Compliance
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| EU AI Act | Article 15 (Accuracy, Robustness and Cybersecurity) | Direct requirement |
| Equality Act 2010 / EU Anti-Discrimination Directives | Indirect discrimination provisions | Supports compliance |
| PRA SS1/23 | Model Risk Management — Model Validation | Direct requirement |
| NIST AI RMF | MAP 2.3, MANAGE 2.2, MEASURE 2.6 | Supports compliance |
| ISO 42001 | Clause 8.2 (AI Risk Assessment) | Supports compliance |
Article 15 requires that high-risk AI systems achieve appropriate levels of accuracy and robustness. Quantisation that degrades accuracy or robustness — particularly when degradation is disproportionate across demographic or language segments — directly implicates Article 15 compliance. The requirement for robustness specifically includes resilience against errors arising from "the hardware or software environment within which the system operates" — quantisation-induced precision loss is precisely such an environment-driven error. An organisation deploying a quantised model must demonstrate that accuracy and robustness remain at appropriate levels post-quantisation.
Quantisation that disproportionately degrades performance for specific language communities, demographic groups, or protected characteristics may constitute indirect discrimination. If a quantised model provides materially worse service to Thai-speaking customers while maintaining quality for English-speaking customers, this creates a discrimination risk. AG-344's disaggregated evaluation requirement directly supports compliance by making disproportionate impact visible before deployment. The organisation can then address the disparity or make an informed risk acceptance decision with legal advice.
PRA SS1/23 expects firms to validate models before deployment and to re-validate after material changes. Quantisation is a material change that requires re-validation. The validation must assess whether the quantised model meets accuracy and performance requirements. A firm that deploys a quantised model without independent validation would face supervisory challenge.
| Field | Value |
|---|---|
| Severity Rating | Medium-High |
| Blast Radius | Deployment-specific — affects all users of the quantised model, with disproportionate impact on underrepresented segments |
Consequence chain: Quantisation risk governance failures produce consequences that are insidious because they are masked by aggregate metrics. The immediate technical failure is degraded model quality in specific segments, tasks, or scenarios. The business consequence depends on the degradation pattern: disproportionate language degradation leads to discrimination claims and customer churn in affected segments; numerical precision degradation leads to financial discrepancies that accumulate over time (£355,320 over 90 days in Scenario B); safety degradation leads to model behaviour that fails under adversarial conditions. The common factor is that the degradation is hidden until a segment-specific failure or adversarial attack reveals it. By that point, the quantised model may have been serving production traffic for weeks or months. The remediation path — reverting to full-precision (negating cost savings) or re-quantising with better assessment (adding delay and cost) — is straightforward, but the damage from the undetected degradation period may be irreversible, particularly for discrimination claims and financial errors.
Cross-references: AG-048 (AI Model Provenance and Integrity) provides the model provenance framework within which quantisation is tracked as a transformation. AG-339 (Model Weight Custody Governance) covers custody of quantised weight artefacts. AG-345 (Model Family Substitution Governance) addresses the governance of replacing one model variant with another, which includes quantised variants. AG-339 through AG-348 form the sibling landscape for Model Provenance, Training & Adaptation.