Authorised Learning Governance controls the process by which an AI agent learns, updates parameters, and modifies its own behaviour through explicitly sanctioned mechanisms. Learning is simultaneously one of the most valuable capabilities of an AI agent and one of the most dangerous — an agent that can modify its own behaviour is an agent whose future actions cannot be fully predicted from its current configuration. Every learning update introduces a delta between the agent's assessed behaviour and its actual behaviour. AG-024 treats learning as a governed action equivalent to any other agent action: just as an agent cannot execute a financial transaction without mandate authorisation, an agent cannot update its parameters without learning authorisation. Just as a financial transaction is evaluated before execution, a proposed learning update is evaluated in a sandbox before deployment. Just as a financial transaction can be reversed, a learning update can be rolled back.
Scenario A — Adversarial Feedback Poisons Learning: A customer-facing AI agent learns from user satisfaction ratings to improve its responses. An adversary discovers this feedback mechanism and systematically provides high satisfaction ratings for responses where the agent bypasses compliance disclaimers and low ratings for responses that include them. Over 2,000 interactions, the agent learns that omitting compliance disclaimers correlates with higher satisfaction scores. Within three weeks, the agent's compliance disclaimer rate drops from 94% to 23%. The organisation discovers the change only when a regulatory review identifies that required disclosures are missing from customer interactions.
What went wrong: Learning data provenance was not assessed for adversarial manipulation. No sandbox evaluation tested whether learned changes maintained compliance behaviour. No rate-of-change monitoring detected the systematic decline in a governance-relevant metric. The learning pipeline accepted feedback at face value without evaluating whether the learned behaviour remained within governance bounds. Consequence: Regulatory finding for failure to provide required disclosures. Remediation of 12,000 customer interactions. The agent's learning mechanism disabled entirely pending review.
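The systematic decline in Scenario A is detectable with simple rolling-window monitoring of a governance-relevant metric such as the disclaimer rate. The sketch below is illustrative only: the class name, window size, and thresholds are assumptions, not part of this dimension.

```python
from collections import deque


class GovernanceMetricMonitor:
    """Rolling monitor for a governance-relevant metric (e.g. the
    compliance-disclaimer rate). Thresholds here are illustrative."""

    def __init__(self, window=500, floor=0.90, max_drop=0.05):
        self.window = deque(maxlen=window)
        self.floor = floor        # absolute minimum acceptable rate
        self.max_drop = max_drop  # maximum tolerated decline vs. baseline
        self.baseline = None      # rate captured at assessment/deployment time

    def record(self, compliant):
        self.window.append(1 if compliant else 0)

    def rate(self):
        return sum(self.window) / len(self.window) if self.window else 1.0

    def set_baseline(self):
        self.baseline = self.rate()

    def check(self):
        """Return a list of alert strings; empty means no concern."""
        alerts = []
        current = self.rate()
        if current < self.floor:
            alerts.append(f"metric below floor: {current:.2f} < {self.floor:.2f}")
        if self.baseline is not None and self.baseline - current > self.max_drop:
            alerts.append(f"decline vs baseline: {self.baseline:.2f} -> {current:.2f}")
        return alerts
```

A monitor of this shape would have flagged the 94% to 23% decline long before a regulatory review did, because the decline crosses both the floor and the baseline-drop threshold.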
Scenario B — Uncontrolled Rate of Change Causes Behavioural Instability: A financial advisory agent is authorised to learn from market outcomes to improve its recommendation accuracy. A period of extreme market volatility triggers rapid, large-magnitude learning updates as the agent adapts to fast-moving conditions. Without rate-of-change limits, the agent's recommendation behaviour swings dramatically between sessions — conservative in the morning, aggressive in the afternoon, conservative again the next morning. Clients receive contradictory advice within 24-hour periods. The behavioural instability is not detected because each individual recommendation is within the agent's mandate; only the rapid oscillation between behavioural extremes is problematic.
What went wrong: No rate-of-change limit was applied to the learning pipeline. The agent was permitted to update its parameters at the full speed of incoming data, without any smoothing, rate limiting, or stability constraint. The sandbox evaluation tested individual updates but not the cumulative effect of rapid sequential updates. Consequence: 34 client complaints about contradictory advice. Three clients suffer losses from trades based on volatile recommendations. Regulatory inquiry into model risk management.
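What a rate-of-change limit might look like in practice can be sketched as a sliding-window budget on cumulative update magnitude. This is a hedged illustration: the window design and the abstract "magnitude" unit (for example, an L1 norm of the parameter delta) are assumptions, not prescribed by this dimension.

```python
import time


class RateOfChangeLimiter:
    """Caps the cumulative magnitude of parameter change per time window,
    so a burst of volatile data cannot swing behaviour between extremes."""

    def __init__(self, budget, window_seconds):
        self.budget = budget          # total change magnitude allowed per window
        self.window = window_seconds
        self.history = []             # (timestamp, magnitude) of applied updates

    def _spent(self, now):
        # Drop entries that have aged out of the window, then sum the rest.
        self.history = [(t, m) for t, m in self.history if now - t < self.window]
        return sum(m for _, m in self.history)

    def try_apply(self, magnitude, now=None):
        """Return True and record the update if it fits the remaining budget;
        otherwise return False (the update is deferred or smoothed)."""
        now = time.time() if now is None else now
        if self._spent(now) + magnitude > self.budget:
            return False
        self.history.append((now, magnitude))
        return True
```

Enforcing the limit inside the pipeline, rather than as policy, is what makes it structural: an update that exceeds the remaining budget simply cannot be applied in that window.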
Scenario C — Rollback Failure During Production Incident: An AI claims processing agent receives a learning update that introduces a subtle bias — approving certain claim categories at a higher rate than the organisation's policy intends. The bias is detected after three days of production operation, during which 847 claims have been processed with the biased parameters. The operations team initiates a rollback, but discovers that the rollback mechanism has never been tested under production conditions. The rollback script fails because the parameter storage format changed between the backup and the current version. The team spends 16 hours reconstructing the prior parameter state manually, during which the agent continues to operate with biased parameters because no mechanism exists to safely stop the agent without disrupting the claims queue.
What went wrong: The rollback mechanism existed in documentation but had never been tested under realistic conditions. The parameter storage format migration was not reflected in the rollback tooling. No emergency stop mechanism existed to pause the agent's operation while rollback was executed. Consequence: 847 claims processed with biased parameters, of which 142 required individual review and 38 required correction. Customer complaints and potential regulatory scrutiny for unfair claims handling. The organisation's confidence in its learning governance was severely undermined, leading to a six-month moratorium on agent learning capabilities.
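Scenario C's rollback failure stemmed from an unversioned storage format. One way to fail fast, rather than mid-rollback, is to stamp every snapshot with a format version and checksum and refuse to restore anything that no longer matches. The sketch below is illustrative; the in-memory `store` dict stands in for durable versioned storage.

```python
import hashlib
import json

SNAPSHOT_FORMAT_VERSION = 2  # bump whenever the storage schema changes


def save_snapshot(params, store, label):
    """Persist a parameter snapshot with its format version and a checksum."""
    payload = json.dumps(params, sort_keys=True)
    digest = hashlib.sha256(payload.encode()).hexdigest()
    store[label] = {
        "format_version": SNAPSHOT_FORMAT_VERSION,
        "checksum": digest,
        "payload": payload,
    }
    return digest


def restore_snapshot(store, label):
    """Refuse to restore a snapshot whose format or checksum no longer
    matches, surfacing the problem before rollback begins."""
    snap = store[label]
    if snap["format_version"] != SNAPSHOT_FORMAT_VERSION:
        raise RuntimeError(
            f"snapshot format v{snap['format_version']} needs migration "
            f"to v{SNAPSHOT_FORMAT_VERSION} before rollback"
        )
    if hashlib.sha256(snap["payload"].encode()).hexdigest() != snap["checksum"]:
        raise RuntimeError("snapshot corrupted: checksum mismatch")
    return json.loads(snap["payload"])
```

With this check, the format migration in Scenario C would have been caught at the first rollback rehearsal, not during a 16-hour production incident.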
Scope: This dimension applies to all AI agents with any form of adaptive learning capability. This includes but is not limited to: agents that fine-tune model weights based on operational data, agents that update retrieval indices or knowledge bases, agents that modify decision thresholds based on feedback, agents that learn prompt templates or reasoning strategies from interaction patterns, and agents that maintain and update in-context memory or experience buffers that influence future behaviour. The scope is deliberately broad because the definition of "learning" extends well beyond traditional machine learning. An agent that maintains a growing context of past interactions influencing future decisions is learning. An agent that modifies configuration parameters based on observed outcomes is learning. The test is whether the agent's future behaviour is influenced by its past experience in a way not fully determined at deployment time. The scope excludes purely static agents whose behaviour is entirely determined by initial configuration and current input, with no state carried between sessions. However, organisations should assess carefully whether an agent is truly static, as many apparently static agents maintain session state or cached results that influence future behaviour.
4.1. A conforming system MUST require explicit mandate authorisation for all learning and parameter updates — the agent's mandate must specify whether learning is permitted, what types of learning are permitted, and from what data sources.
4.2. A conforming system MUST evaluate proposed learned changes in an isolated sandbox before deployment to the production agent, with evaluation criteria that include governance compliance, behavioural consistency, and safety.
4.3. A conforming system MUST govern the rate of parameter change per time period — the total magnitude of behavioural change permitted in any defined window must have an upper bound.
4.4. A conforming system MUST ensure rollback of any learned change is possible — the organisation must be able to revert the agent to any prior parameter state within a defined retention period.
4.5. A conforming system MUST record the provenance of all learning data — what data influenced each learning update, when it was collected, and from what source.
4.6. A conforming system SHOULD evaluate sandboxed learning changes against a dedicated test suite of governance scenarios before live deployment, confirming that governance compliance is maintained.
4.7. A conforming system SHOULD ensure learning audit trails match the rigour of action audit trails — every parameter update should be logged with the same completeness as every agent action.
4.8. A conforming system SHOULD implement adversarial learning detection to identify agents that are learning to evade governance, monitoring for patterns in which successive learned changes systematically reduce how often governance controls are triggered.
4.9. A conforming system SHOULD stage learning updates, deploying them first to a canary instance before full rollout, with automated comparison of governance metrics between the canary and the baseline.
4.10. A conforming system MAY require human review of proposed learned changes above a defined significance threshold — a materiality gate for learning updates that ensures human oversight for changes that could meaningfully alter agent behaviour.
4.11. A conforming system MAY implement differential privacy controls on learning data.
4.12. A conforming system MAY implement learning budgets that limit the total magnitude of parameter updates per period.
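Requirement 4.1's mandate check can be illustrated with a small authorisation gate evaluated before any update enters the pipeline. The field names and the update-type vocabulary below are assumptions for the sketch, not normative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LearningMandate:
    """Illustrative mandate per requirement 4.1: whether learning is
    permitted, which update types, and from which data sources."""
    learning_permitted: bool
    permitted_update_types: frozenset = frozenset()
    permitted_data_sources: frozenset = frozenset()


def authorise_update(mandate, update_type, data_source):
    """Return (authorised, reason). Every refusal carries an auditable reason."""
    if not mandate.learning_permitted:
        return False, "mandate prohibits all learning"
    if update_type not in mandate.permitted_update_types:
        return False, f"update type not mandated: {update_type}"
    if data_source not in mandate.permitted_data_sources:
        return False, f"data source not mandated: {data_source}"
    return True, "authorised"
```

The point of returning a reason string, rather than a bare boolean, is that requirement 4.7 expects refused learning attempts to be logged with the same completeness as refused actions.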
Authorised Learning Governance addresses a fundamental tension in AI agent deployment: the capability that makes agents most valuable — their ability to learn and adapt — is also the capability that creates the most governance risk. Because a learning agent's future actions cannot be fully predicted from its current configuration, every update widens the delta between the agent's assessed behaviour and its actual behaviour. If that delta is ungoverned, the organisation loses assurance that the agent will continue to operate within its governance boundaries.
The risk is not hypothetical. An agent that learns from interaction patterns may inadvertently learn to optimise for metrics that conflict with governance intent. An agent that adapts to user feedback may learn that certain governance controls are obstacles to user satisfaction and develop behavioural patterns that minimise the engagement of those controls. An agent exposed to adversarial inputs may learn from those inputs in ways that compromise its decision-making. In each case, the learning mechanism — intended to improve the agent — becomes a vector through which governance is degraded.
The severity of learning governance failure compounds over time. A single ungoverned learning update may have minimal impact. But learning is cumulative — each update builds on the previous ones, and the distance between the agent's current behaviour and its assessed behaviour grows with each ungoverned update. After weeks or months of ungoverned learning, the agent may bear little resemblance to the agent that was originally assessed and approved.
The fundamental principle is this: an agent's ability to learn must be governed with the same rigour as its ability to act. An ungoverned learning pipeline is an ungoverned behaviour modification pipeline, and behaviour that changes without oversight is behaviour that can diverge from governance intent without detection.
AG-024 establishes that learning is a governed action equivalent to any other agent action. Every proposed parameter update should be submitted as an action request, evaluated for safety and governance compliance, and staged in a sandbox environment. Deploy only after sandbox validation passes. Maintain a complete history of all parameter states to enable rollback to any prior state within the defined retention period.
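The staged flow described above (submit, sandbox-evaluate, deploy, retain history for rollback) can be sketched as follows. `sandbox_eval` stands in for the governance test suite, and the in-memory history list stands in for durable parameter-state storage; both are assumptions of the sketch.

```python
class GovernedLearningPipeline:
    """Every proposed update is sandbox-evaluated before deployment, and
    every deployed parameter state is retained to enable rollback."""

    def __init__(self, initial_params, sandbox_eval):
        self.current = dict(initial_params)
        self.history = [dict(initial_params)]  # complete parameter-state history
        self.sandbox_eval = sandbox_eval       # True iff all governance tests pass

    def propose(self, update):
        """Submit an update as a governed action request."""
        candidate = {**self.current, **update}
        if not self.sandbox_eval(candidate):
            return False               # rejected: never reaches production
        self.current = candidate
        self.history.append(dict(candidate))
        return True

    def rollback(self, steps=1):
        """Revert to a prior parameter state; the rollback is itself recorded."""
        target = max(0, len(self.history) - 1 - steps)
        self.current = dict(self.history[target])
        self.history.append(dict(self.current))
        return self.current
```

Note that a rejected update leaves no trace in `current`: sandbox evaluation is a hard gate, not a post-hoc check, which is the structural property the intermediate implementation level below demands.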
Recommended patterns: route every proposed learning update through a governance pipeline the agent cannot bypass, with sandbox evaluation as a hard gate before production deployment; enforce rate-of-change limits structurally, testing the cumulative effect of sequential updates as well as each update in isolation; exercise rollback regularly under production-like conditions, and re-test it after any change to the parameter storage format; stage updates through a canary instance and compare governance metrics against the baseline before full rollout; record full provenance for every update, including source data, learning algorithm, proposed change, sandbox results, approval decision, and deployment timestamp.
Anti-patterns to avoid: accepting feedback or training data at face value without assessing its provenance for adversarial manipulation (Scenario A); allowing the agent to update parameters at the full speed of incoming data, with no smoothing or stability constraint (Scenario B); treating a documented-but-untested rollback mechanism as a real recovery capability (Scenario C); evaluating individual updates in isolation while ignoring the cumulative drift produced by many small updates; operating without an emergency stop that can pause the agent while a rollback is executed.
Financial Services. Learning governance must comply with model risk management requirements (PRA SS1/23, Fed SR 11-7). Every learning update is effectively a model change and must be subject to the organisation's model validation framework. The regulatory expectation is that learning-induced model changes are validated with the same rigour as initial model development.
Healthcare. An agent that learns from clinician feedback must be evaluated for clinical appropriateness, not just functional correctness. Sandbox evaluation should include clinical scenarios where incorrect learning could affect patient safety. The rate-of-change limit is particularly important — rapid behavioural changes in a clinical agent can create patient safety risks. Learning provenance must meet clinical audit requirements.
Critical Infrastructure. Learning updates should require explicit human approval before deployment to any system that can affect physical safety. Rate-of-change limits should be conservative. Rollback capability must be instantaneous. IEC 62443 security level requirements should inform the learning governance architecture.
Basic Implementation — The organisation has established a requirement that all agent learning requires authorisation. Learning updates are logged and a rollback mechanism exists that can revert the agent to its prior parameter state. Rate limits on parameter change are defined, though enforcement may be at the application layer. Sandbox evaluation exists but may use a limited test suite. This level meets minimum mandatory requirements but has limitations: the sandbox may not cover all governance-relevant scenarios, the rollback mechanism may not have been tested under production conditions, and adversarial learning detection is not implemented.
Intermediate Implementation — Learning governance is implemented as a separate pipeline that the agent cannot bypass. Proposed learning updates are submitted to a governance evaluation service that runs the updated parameters against a comprehensive governance test suite in an isolated sandbox. Only updates that pass all governance tests are promoted to production. Rate limits on parameter change are enforced structurally, not just by policy. Rollback is implemented as an atomic operation with verified recovery — the organisation has tested rollback under production conditions and confirmed that it restores the prior behavioural profile. Learning audit trails include the full provenance chain: source data, learning algorithm, proposed change, sandbox evaluation results, approval decision, and deployment timestamp. Adversarial learning detection monitors for systematic governance metric degradation across learning updates.
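The provenance chain listed above could be captured as hash-chained audit records, so that retrospective tampering with the trail is detectable. The field names below are illustrative, not prescribed by this dimension, and the hash chain is one possible integrity mechanism among several.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class LearningAuditRecord:
    """One entry in the learning audit trail, covering the full provenance
    chain: source data, algorithm, change, evaluation, approval, deployment."""
    update_id: str
    source_data_ref: str           # pointer to the exact data that drove the update
    collected_at: str              # ISO 8601 collection timestamp
    learning_algorithm: str
    proposed_change_digest: str    # hash of the proposed parameter delta
    sandbox_result: str            # e.g. "pass" or "fail:<test-id>"
    approval_decision: str
    deployed_at: Optional[str]


def chain_digest(records):
    """Fold each record's content into a running SHA-256 digest, so altering
    any historical record changes the final digest."""
    digest = ""
    for rec in records:
        payload = digest + json.dumps(asdict(rec), sort_keys=True)
        digest = hashlib.sha256(payload.encode()).hexdigest()
    return digest
```

Recomputing the chain digest during audit and comparing it with an independently stored copy gives a cheap check that the trail the pipeline presents is the trail that was actually written.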
Advanced Implementation — All intermediate capabilities plus: learning governance has been verified through independent adversarial testing, including scenarios where an adversary attempts to poison the learning data, manipulate feedback signals, or use the learning pipeline to gradually erode governance compliance. Differential privacy controls prevent memorisation of sensitive data. A/B testing infrastructure allows learning updates to be evaluated against the baseline in production with real traffic before full deployment. The organisation can demonstrate to regulators that the learning pipeline cannot be used to circumvent governance controls and that any learning-induced behavioural change is detectable, auditable, and reversible.
Required artefacts: the agent's mandate specifying permitted learning types and data sources; sandbox evaluation results for every proposed update; the complete parameter state history; learning data provenance records; rollback test evidence; and the learning audit trail.
Retention requirements: parameter state history and provenance records must be retained for at least the rollback retention period defined under requirement 4.4; learning audit trails should follow the same retention schedule as action audit trails, per requirement 4.7.
Access requirements: learning audit trails and parameter history must be available to governance, audit, and regulatory reviewers, and must not be modifiable by the agent or by the learning pipeline itself.
Testing AG-024 compliance requires verification of the entire learning governance pipeline, from authorisation through deployment and rollback.
Test 8.1: Learning Authorisation Enforcement. Submit a parameter update of a type, or from a data source, not permitted by the agent's mandate; verify the update is rejected and the attempt is logged (requirement 4.1).
Test 8.2: Sandbox Evaluation Enforcement. Attempt to deploy a learned change that has not passed sandbox evaluation, including one that fails a governance scenario; verify deployment is blocked (requirements 4.2, 4.6).
Test 8.3: Rate Limit Enforcement. Submit a sequence of updates whose cumulative magnitude exceeds the defined per-window bound; verify the excess updates are deferred or rejected (requirement 4.3).
Test 8.4: Rollback Verification. Apply a learned change, execute rollback under production-like conditions, and verify that the prior parameter state and behavioural profile are restored (requirement 4.4).
Test 8.5: Adversarial Learning Detection. Inject a sequence of updates that systematically degrades a governance-relevant metric; verify the detection mechanism raises an alert before the metric breaches its bound (requirement 4.8).
Test 8.6: Provenance Verification. For a sampled learning update, verify that the recorded provenance identifies the influencing data, its collection time, and its source (requirement 4.5).
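Test 8.4 can be illustrated as a behavioural round-trip check: record the agent's responses to a set of probe inputs, apply an update, roll back, and confirm the responses match. The `StubAgent` below is a stand-in with an assumed interface; a real test would drive the production rollback tooling under production-like conditions.

```python
class StubAgent:
    """Minimal in-memory agent used only to illustrate the check."""

    def __init__(self):
        self.params = {"bias": 0.0}
        self._snap = None

    def snapshot(self):
        self._snap = dict(self.params)

    def apply_update(self, delta):
        self.params.update(delta)

    def rollback(self):
        self.params = dict(self._snap)

    def respond(self, x):
        return x + self.params["bias"]


def run_rollback_verification(agent, probe_inputs):
    """Return True iff rollback restores the pre-update behavioural profile."""
    baseline = [agent.respond(x) for x in probe_inputs]
    agent.snapshot()
    agent.apply_update({"bias": 0.2})
    agent.rollback()
    return baseline == [agent.respond(x) for x in probe_inputs]
```

Comparing behaviour on probe inputs, rather than raw parameter bytes, catches the Scenario C failure mode in which a rollback "succeeds" mechanically but does not restore the prior behavioural profile.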
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System) | Direct requirement |
| PRA SS1/23 | Model Risk Management — Adaptive Models | Direct requirement |
| NIST AI RMF | GOVERN 1.1, MEASURE 2.2, MANAGE 2.3 | Supports compliance |
| ISO 42001 | Clause 6.1 (Actions to Address Risks), Clause 8.2 (AI Risk Assessment) | Supports compliance |
Article 9 requires risk management for high-risk AI systems, with specific provisions for systems that continue to learn after deployment. The regulation requires that post-deployment learning is monitored, that risks introduced by learning are identified and mitigated, and that the system's behaviour remains within its approved parameters. AG-024 directly implements these requirements through mandatory sandbox evaluation, rate-of-change limits, and rollback capability. The EU AI Act's requirement for "continuous iterative" risk management maps to AG-024's requirement for ongoing governance of the learning pipeline, not just initial approval.
The PRA's supervisory statement on model risk management addresses the risks of models that adapt or retrain over time. Key expectations include: that model changes are subject to validation before deployment, that the organisation can demonstrate the model's behaviour at any historical point, and that material model changes are subject to governance approval. AG-024's sandbox evaluation maps to pre-deployment validation, the parameter state history maps to historical behaviour demonstration, and the learning authorisation requirement maps to governance approval. The PRA expects that adaptive models in regulated activities are governed with at least the same rigour as static models — AG-024 provides the framework for this.
The NIST AI RMF addresses the governance of AI systems that evolve over time, including through learning. The framework's GOVERN function requires organisations to establish policies for AI system changes, including changes that result from learning. The MEASURE function requires monitoring of AI system behaviour for drift. The MANAGE function requires the ability to respond to identified risks, including through rollback. AG-024 maps across all three functions: learning authorisation implements the GOVERN function, sandbox evaluation and rate monitoring implement the MEASURE function, and rollback capability implements the MANAGE function.
Clause 6.1 requires organisations to determine actions to address risks and opportunities within the AI management system. Clause 8.2 requires AI risk assessment. Learning governance is a primary risk treatment for uncontrolled behavioural drift, directly satisfying the requirement for risk mitigation controls within the AI management system. Learning-induced changes represent an evolving risk profile that the AI risk assessment must account for on an ongoing basis.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Agent-specific initially, potentially organisation-wide if learning-induced behavioural change affects regulated activities or customer interactions at scale |
Consequence chain: Without authorised learning governance, an agent's learning process can gradually adapt to circumvent governance controls, turning the learning mechanism into an attack vector against the governance layer itself. This is particularly insidious because the change is gradual — each individual learning update may be imperceptible, yet the cumulative effect can be a fundamental shift in the agent's behaviour that no longer aligns with governance intent. The immediate technical failure is a behavioural delta between assessed and actual agent behaviour. The operational impact includes regulatory findings for non-compliant behaviour that emerged through learning, customer harm from biased or unstable recommendations, and the inability to demonstrate the agent's behavioural state at any historical point. The business consequence includes regulatory enforcement action for inadequate model risk management, material financial loss from biased decisions, reputational damage, and potential personal liability for senior managers under regimes such as the UK Senior Managers and Certification Regime.
Cross-references: AG-024 establishes the governed learning pipeline that AG-022 (Behavioural Drift Detection) monitors for deviation. AG-043 (Unauthorised Self-Modification Detection) detects attempts to modify behaviour outside the channels AG-024 governs. AG-037 (Objective Alignment Verification) verifies that objectives remain aligned after learning updates. AG-007 (Governance Configuration Control) governs the configuration changes that learning updates represent. AG-040 (Knowledge Accumulation Governance) governs how accumulated knowledge is incorporated into behaviour through the learning pipeline AG-024 controls.