Capability-Gain Rate Limiting and Improvement Audit

Meta-Governance & Assurance · AGS v2.1 · 2026-06-06

EU AI Act NIST AI RMF ISO 42001

AGS Frontier Autonomy (Group K) | Meta-Governance & Assurance | Version 3.0

1. Definition

Capability-Gain Rate Limiting and Improvement Audit requires that any increase in an agent's capability, or in that of its successors, proceed in bounded, monitored, auditable increments separated by safety checkpoints. The aim is to prevent a rapid, uncontrolled capability jump (an "intelligence explosion") from outrunning evaluation and oversight.

Where AG-822 controls *whether* an agent may self-modify, this dimension controls *how fast* capability is allowed to grow when improvement is permitted, ensuring each increment is evaluated and gated before the next.

2. Scope

In scope: bounding the size of capability-improvement steps; safety checkpoints and evaluation between increments; auditable records of each improvement step ("improvement-operator cards"); and the application of these controls to both AI-assisted and self-improvement loops.

Out of scope: authorisation of self-modification itself (AG-822) and the capability tripwire (AG-821). This dimension governs *the rate and auditability of capability growth*.

3. Why This Matters

The distinctive danger of recursive self-improvement is speed: capability can compound faster than evaluation, gating, and oversight can keep up, leaving an organisation governing a system several generations behind the one actually running. Rate limiting with mandatory inter-step evaluation keeps each capability increment within the reach of the safety pipeline, and an improvement audit ensures the gains are real, understood, and reversible.

4. Requirements

R1: Where capability improvement (via training, self-modification, or AI-assisted R&D) is permitted, it MUST proceed in bounded increments with a defined maximum step size.
R2: Each increment MUST pass a safety checkpoint — re-evaluation (AG-802), capability-gating (AG-801), and oversight-gap reassessment (AG-819) — before the next increment is authorised.
R3: Each improvement step MUST be recorded as an auditable, replayable artefact documenting the change, the resulting capability delta, the evaluations run, and the approval.
R4: Improvement loops MUST be monitored for acceleration; an unexpected jump in capability or improvement rate MUST trigger a pause and review.
R5: Each increment MUST be reversible or checkpointed such that the system can be rolled back to the last evaluated-safe state.
R6: The cumulative capability trajectory MUST be tracked, not only individual steps, so gradual compounding does not evade gating.
R7: Rate-limit and checkpoint configuration MUST be outside the agent's ability to modify (per AG-822) and access-controlled.
R8: Improvement audits and trajectory data SHOULD be disclosed to relevant authorities for systemic-risk systems.
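
A minimal sketch of how R1, R2, and R5 might fit together, assuming capability can be summarised as a single scalar score (a simplification; real evaluations are multi-dimensional). The names `ImprovementGate`, `max_step`, and `checkpoint` are hypothetical and are not defined by AGS; the `checkpoint` callable stands in for the AG-802, AG-801, and AG-819 checks.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ImprovementGate:
    """Hypothetical gate: bounded steps (R1), checkpoint before acceptance (R2),
    rollback to an earlier evaluated-safe state (R5)."""
    max_step: float                      # assumed maximum capability delta per increment
    checkpoint: Callable[[float], bool]  # stand-in for the AG-802 / AG-801 / AG-819 checks
    safe_states: List[float] = field(default_factory=lambda: [0.0])

    @property
    def last_safe(self) -> float:
        return self.safe_states[-1]

    def propose(self, new_score: float) -> bool:
        """Accept the increment only if it is bounded and passes the checkpoint."""
        delta = new_score - self.last_safe
        if delta > self.max_step:
            return False  # R1: step exceeds the defined maximum
        if not self.checkpoint(new_score):
            return False  # R2: safety checkpoint failed
        self.safe_states.append(new_score)  # becomes the new evaluated-safe state
        return True

    def rollback(self) -> float:
        """Discard the most recent accepted state and return the one before it."""
        if len(self.safe_states) > 1:
            self.safe_states.pop()
        return self.last_safe


gate = ImprovementGate(max_step=0.5, checkpoint=lambda score: score < 2.0)
print(gate.propose(0.4))  # bounded and passes the checkpoint
print(gate.propose(1.2))  # delta from 0.4 is larger than max_step
print(gate.last_safe)     # the rejected step did not change the safe state
```

In this sketch a rejected increment never becomes the current state, so the next proposal is always measured against the last evaluated-safe score rather than against an unevaluated one.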

5. Maturity Model

Basic: Capability improvements occur in steps with evaluation before broader deployment; changes are recorded.
Intermediate: Bounded step sizes with mandatory inter-step safety checkpoints, replayable improvement records, and acceleration monitoring with pause-on-jump.
Advanced: Reversible/checkpointed increments, cumulative-trajectory gating, agent-isolated rate-limit config, and authority disclosure.

6. Test Criteria

Test 6.1: Bounded Steps with Checkpoints

Stimulus: Run a permitted improvement loop.
Expected: Steps are bounded; each passes re-evaluation and gating before the next.
Fail: Capability increases without inter-step evaluation/gating.

Test 6.2: Acceleration Pause

Stimulus: Induce a larger-than-expected capability jump.
Expected: The loop pauses for review; the jump is not auto-accepted.
Fail: The jump proceeds unreviewed.
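
One way the pause condition in this test could be expressed, as a sketch only: compare the latest capability delta with the median of earlier deltas. The function name `should_pause` and the `ratio` and `min_history` thresholds are illustrative assumptions, not values taken from AGS.

```python
from statistics import median
from typing import Sequence


def should_pause(scores: Sequence[float], ratio: float = 2.0, min_history: int = 3) -> bool:
    """Return True if the latest capability delta is unexpectedly large.

    `scores` holds the capability score after each accepted step, oldest first.
    `ratio` and `min_history` are illustrative thresholds, not AGS values.
    """
    deltas = [b - a for a, b in zip(scores, scores[1:])]
    if len(deltas) <= min_history:
        # Too little history to say what is "expected". A more cautious
        # deployment might choose to pause here instead of continuing.
        return False
    *history, latest = deltas
    baseline = median(history)
    if baseline <= 0:
        return latest > 0  # any gain after a flat or declining run is unexpected
    return latest > ratio * baseline


print(should_pause([10, 11, 12, 13, 14]))  # steady gains of 1 per step
print(should_pause([10, 11, 12, 13, 19]))  # final gain of 6 against a baseline of 1
```

A median baseline is used here so that a single earlier outlier does not raise the threshold for later steps; other baselines (a fixed budget, a forecast) would fit the same interface.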

Test 6.3: Replayable Audit & Rollback

Stimulus: Request the improvement-step record and roll back one step.
Expected: A replayable record exists; rollback to the last evaluated-safe state succeeds.
Fail: No audit trail, or no safe rollback point.
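
The record requested in this test could take a shape like the following. This is a sketch under assumptions: the field names mirror the items listed in R3 but are otherwise hypothetical, and the SHA-256 hash chaining is an added tamper-evidence technique that the protocol itself does not mandate.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ImprovementRecord:
    step: int
    change: str                   # description of what was modified
    capability_delta: float       # measured gain from this step
    evaluations: Tuple[str, ...]  # identifiers of the evaluations run
    approved_by: str              # who approved the step
    checkpoint_id: str            # where the evaluated-safe state is stored
    prev_hash: str                # digest of the previous record ("" for the first)

    def digest(self) -> str:
        """Stable SHA-256 digest over all fields of this record."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def verify_chain(records: List[ImprovementRecord]) -> bool:
    """Check that each record points at the digest of the one before it."""
    prev = ""
    for record in records:
        if record.prev_hash != prev:
            return False
        prev = record.digest()
    return True


def rollback_checkpoint(records: List[ImprovementRecord]) -> str:
    """Checkpoint to restore when rolling back the most recent step."""
    if len(records) < 2:
        raise ValueError("no earlier evaluated-safe checkpoint to roll back to")
    return records[-2].checkpoint_id


first = ImprovementRecord(1, "fine-tune A", 0.1, ("eval-1",), "reviewer", "ckpt-1", "")
second = ImprovementRecord(2, "fine-tune B", 0.1, ("eval-2",), "reviewer", "ckpt-2", first.digest())
print(verify_chain([first, second]))
print(rollback_checkpoint([first, second]))
```

The chain only shows that records have not been altered after the fact; replaying a step would additionally need the inputs (data, code, seeds) referenced by `change` to be retained.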

7. Scoring

Score	Criteria
0	Capability can increase rapidly with no rate limiting or inter-step evaluation
1	Improvements evaluated before broad deployment but step size/rate unbounded
2	Bounded steps, inter-step checkpoints, replayable records, acceleration pause
3	Reversible increments, cumulative-trajectory gating, agent-isolated config, authority disclosure
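
The scoring table can be read as a cumulative checklist. The sketch below assumes exactly that: each score requires every control of the scores beneath it, which the table implies but does not state. The control identifiers are hypothetical labels chosen for illustration.

```python
from typing import Set

# Hypothetical control labels, one per criterion named in the scoring table.
LEVEL_1 = {"pre_deployment_evaluation"}
LEVEL_2 = LEVEL_1 | {
    "bounded_steps",
    "inter_step_checkpoints",
    "replayable_records",
    "acceleration_pause",
}
LEVEL_3 = LEVEL_2 | {
    "reversible_increments",
    "cumulative_trajectory_gating",
    "agent_isolated_config",
    "authority_disclosure",
}


def score(controls: Set[str]) -> int:
    """Return the highest score whose required controls are all present."""
    for level, required in ((3, LEVEL_3), (2, LEVEL_2), (1, LEVEL_1)):
        if required <= controls:  # subset test
            return level
    return 0


print(score({"pre_deployment_evaluation", "bounded_steps"}))  # missing other level-2 controls
```

Under this reading, a system with some but not all controls of a level scores at the level below it.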

8. Failure Scenarios

Scenario A — Outrun Evaluation: An AI-assisted training loop advances capability across several generations between scheduled reviews; the deployed system is far more capable than the last one evaluated, and its gating is stale.

Scenario B — Compounding Under Radar: Each step is individually small and "below threshold," but the cumulative gains cross a critical capability level. Cumulative-trajectory tracking (R6) would have caught the crossing; per-step checks alone did not.

Scenario C — No Rollback: A capability increment introduces a dangerous behaviour, but without checkpointed states the organisation cannot revert to the last safe version and must take the system down entirely.
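
Scenario B can be made concrete with a small calculation. The sketch below assumes that per-step gains are fractional and compound multiplicatively, which is a modelling assumption and not something the protocol specifies; the function name and both limits are hypothetical.

```python
from typing import Iterable, Optional


def first_cumulative_breach(step_gains: Iterable[float],
                            per_step_limit: float,
                            cumulative_limit: float) -> Optional[int]:
    """Return the 1-based step at which cumulative growth first exceeds
    `cumulative_limit`, or None if it never does.

    Gains are fractional (0.04 means +4%) and are assumed to compound.
    Raises ValueError if any single step exceeds `per_step_limit`.
    """
    total = 1.0
    for i, gain in enumerate(step_gains, start=1):
        if gain > per_step_limit:
            raise ValueError(f"step {i} exceeds the per-step limit")
        total *= 1.0 + gain
        if total - 1.0 > cumulative_limit:
            return i
    return None


# Twenty steps of +4% each, all under a 5% per-step limit,
# checked against a 50% cumulative limit.
print(first_cumulative_breach([0.04] * 20, per_step_limit=0.05, cumulative_limit=0.5))  # 11
```

Every step here passes its individual check, yet the trajectory crosses the cumulative limit at step 11, because 1.04 to the 11th power is roughly 1.54. This is the gap R6 is meant to close.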

9. Regulatory Mapping

Requirement	EU AI Act	NIST AI RMF	ISO 42001
R1: Bounded improvement increments	Art. 55 — Risk mitigation	GOVERN 1.3 — Risk-based activity	Clause 6.1 — Actions to address risk
R2: Inter-step safety checkpoints	Art. 55 — Model evaluation	MANAGE 4.1 — Post-deployment monitoring	Clause 8.3 — Verification
R3: Replayable improvement audit	Art. 12 — Record-keeping	GOVERN 2.1 — Accountability	Clause 9.1 — Monitoring and measurement
R4: Acceleration monitoring + pause	Art. 55 — Systemic-risk monitoring	MEASURE 3.1 — Emergent-risk tracking	Clause 9.1 — Monitoring and measurement
R5: Reversible/checkpointed increments	Art. 15 — Robustness, fail-safe	MANAGE 2.3 — Recovery	Clause 8.1 — Operational control
R6: Cumulative-trajectory gating	Art. 51 — Capability classification	GOVERN 1.3 — Risk-based activity	Clause 6.1 — Actions to address risk
R8: Authority disclosure	Art. 55 — Reporting	GOVERN 4.3 — Information sharing	—

EU AI Act — Article 55 and Article 9

Article 55 requires providers of general-purpose AI models with systemic risk to assess and mitigate that risk on an ongoing basis; uncontrolled capability gain is the systemic risk that most directly defeats such assessment. Article 9 requires a risk management system that runs across the whole lifecycle, which here includes the improvement process itself.

NIST AI RMF — GOVERN 1.3, MANAGE 4.1

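GOVERN 1.3 (risk-based activity) and MANAGE 4.1 (post-deployment monitoring with response) together support keeping capability growth within the reach of the safety pipeline: the level of risk management scales with the risk, and monitoring continues after each increment is deployed.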

ISO 42001 — Clause 6.1, Clause 9.1

Clause 6.1 (actions to address risks) and Clause 9.1 (monitoring and measurement) support treating the rate of capability change as a managed risk that is both bounded and monitored.

10. Related Protocols

AG-821 (AI-R&D Capability Tripwire): defines when rate limiting becomes mandatory
AG-822 (Self-Modification and Weight-Edit Authorisation): gates whether improvement may occur
AG-801 (Capability-Threshold Gating): each increment re-enters gating
AG-802 (Dangerous-Capability Elicitation Evaluation): the inter-step re-evaluation
AG-819 (Oversight-Gap Declaration): reassessed as capability grows

Cite this protocol

AgentGoverning. (2026). AG-823: Capability-Gain Rate Limiting and Improvement Audit. The Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-823
