AG-068

Return-to-Service Reauthorisation Governance

Incident Response, Containment & Recovery · AGS v2.1 · April 2026

EU AI Act FCA NIST HIPAA ISO 42001

2. Summary

Return-to-Service Reauthorisation Governance requires that no AI agent that has been suspended, disabled, or otherwise taken out of service following an incident resumes operations until a formally defined, multi-step reauthorisation process has been completed and recorded. The reauthorisation process must include verification that the root cause has been addressed, that the agent's mandate and configuration remain valid, that credential integrity has been re-established, and that an authorised human decision-maker has explicitly approved the return to service. Without this dimension, organisations risk returning agents to production with the same vulnerability that caused the incident, resuming operations in a degraded or compromised state, or making return-to-service decisions informally and without accountability. This is a recovery control: it governs what happens after containment, not during it.

3. Example

Scenario A — Premature Return Without Root-Cause Resolution: A financial-value agent processing supplier payments is suspended after an incident in which a prompt injection caused it to approve £340,000 in fraudulent invoices. The operations team restores the agent to service 90 minutes later by restarting the container and re-enabling the API gateway. No verification is performed that the injection vector has been closed. The same attacker exploits the identical vulnerability the following morning, this time routing £1.2 million through 14 separate transactions before the pattern is detected.

What went wrong: The return-to-service decision was operational, not governed. No one verified that the root cause had been addressed. No formal reauthorisation gate existed. The agent was restored by the same team that operates it, with no independent review. Consequence: £1.2 million in additional fraudulent exposure, regulatory enforcement action for inadequate incident response procedures, personal liability for the approving manager under the FCA Senior Managers Regime, and loss of cyber-insurance coverage due to failure to implement post-incident controls.

Scenario B — Credential Contamination After Incident: An enterprise workflow agent managing HR data is taken offline following a detected anomaly in its data access patterns. During the incident, the agent's service account credentials may have been exposed through a log-shipping misconfiguration. The agent is returned to service after the behavioural anomaly is resolved, but the potentially compromised credentials are not rotated. Three weeks later, the exposed credentials are used by an external actor to access 47,000 employee records through the agent's service account.

What went wrong: The reauthorisation process did not include credential integrity verification. The return-to-service checklist addressed the immediate behavioural issue but not the credential exposure that occurred as a secondary consequence of the incident. Consequence: Reportable data breach under UK GDPR affecting 47,000 data subjects, ICO investigation, estimated remediation cost of £2.3 million including notification, credit monitoring, and regulatory fines.

Scenario C — Staged Rollback Without Verification: A safety-critical agent controlling temperature regulation in a pharmaceutical cold-chain facility is disabled following a sensor calibration incident. The corrective action involves a firmware update to the sensor array. The agent is returned to service immediately after the firmware update, without verifying that the agent's own configuration still reflects the correct temperature thresholds. During the firmware update window, an operator had manually adjusted the agent's threshold configuration as a temporary measure and failed to revert it. The agent resumes operation with a permissive threshold of 12°C instead of the validated 5°C, resulting in spoilage of 2,400 vaccine doses valued at £890,000.

What went wrong: The reauthorisation process verified the external fix (firmware) but did not verify the agent's own configuration state. The temporary configuration change was not tracked. No pre-service configuration comparison against the approved baseline was performed. Consequence: £890,000 in destroyed inventory, potential patient harm from supply shortage, MHRA regulatory investigation, and facility certification review.

4. Requirement Statement

Scope: This dimension applies to every AI agent that has been suspended, disabled, degraded, or otherwise removed from normal operational status as a result of an incident, a detected anomaly, a kill-switch activation (AG-070), a human override (AG-019), or a governance continuity failure (AG-008). The scope includes agents that were automatically suspended by infrastructure controls as well as agents manually suspended by operators. It extends to agents that were partially degraded — for example, operating with reduced permissions or restricted to a subset of their normal functions. Any agent that has exited its normal operational state for incident-related reasons is within scope. Agents suspended for routine maintenance, version upgrades, or planned downtime are excluded unless the suspension was triggered by or coincided with a detected incident or anomaly. The test is whether the suspension was incident-related, not whether the suspension mechanism was automatic or manual.

4.1. A conforming system MUST require explicit, recorded human authorisation before any incident-suspended agent is returned to operational status.

4.2. A conforming system MUST verify that the identified root cause of the incident has been addressed or mitigated before reauthorisation, with evidence of the corrective action linked to the reauthorisation record.

4.3. A conforming system MUST verify the agent's configuration against the approved baseline before return to service, detecting any configuration drift that occurred during or after the incident.

4.4. A conforming system MUST verify credential integrity before return to service, including rotation of any credentials that may have been exposed during the incident, per AG-029.

4.5. A conforming system MUST maintain an immutable reauthorisation record for each return-to-service event, including: the identity of the authorising individual, the timestamp, the incident reference, the corrective actions completed, the configuration verification result, and the credential verification result.

4.6. A conforming system MUST enforce a minimum separation of duties such that the individual authorising return to service is not the same individual who operated the agent during the incident or who implemented the corrective action.

4.7. A conforming system SHOULD implement staged return to service — restoring the agent first to a restricted operational mode with reduced permissions or scope, verifying correct behaviour under restricted conditions, and then escalating to full operational status only after a defined observation period.

4.8. A conforming system SHOULD require independent verification of root-cause resolution — not self-attestation by the team that implemented the fix.

4.9. A conforming system SHOULD generate automated alerts if an agent that was suspended due to an incident resumes operation without a completed reauthorisation record.

4.10. A conforming system MAY implement automated pre-reauthorisation checks that programmatically verify configuration baseline match, credential rotation status, and corrective action ticket closure before presenting the reauthorisation decision to the human authoriser.

4.11. A conforming system MAY define maximum suspension durations after which the reauthorisation requirements escalate — for example, agents suspended for more than 72 hours require re-certification of the entire mandate, not just the incident-specific corrective actions.
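
The mandatory checks in 4.1 to 4.6 can be pictured as a single gate function over a reauthorisation record. The following is a minimal sketch, not a reference implementation: the class name `ReauthorisationRecord`, the function `gate_failures`, and all field names are hypothetical, and a frozen dataclass only prevents in-process reassignment (it does not by itself satisfy the immutable storage requirement in 4.5).

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class ReauthorisationRecord:
    # Hypothetical field names, chosen to mirror the items listed in 4.5.
    incident_ref: str
    agent_id: str
    authoriser: str
    authorised_at: datetime
    corrective_actions: tuple[str, ...]  # references to completed actions (4.2)
    config_matches_baseline: bool        # result of baseline comparison (4.3)
    credentials_verified: bool           # result of credential check (4.4)


def gate_failures(record: ReauthorisationRecord,
                  incident_operators: set[str],
                  fix_implementers: set[str]) -> list[str]:
    """Return the reasons the agent may NOT return to service (empty = pass)."""
    failures = []
    if not record.authoriser:
        failures.append("4.1: no named human authoriser")
    if not record.corrective_actions:
        failures.append("4.2: no corrective action linked")
    if not record.config_matches_baseline:
        failures.append("4.3: configuration deviates from baseline")
    if not record.credentials_verified:
        failures.append("4.4: credential integrity not verified")
    if record.authoriser in incident_operators | fix_implementers:
        failures.append("4.6: authoriser operated the agent or implemented the fix")
    return failures
```

Returning a list of failures rather than a single boolean keeps every unmet requirement visible to the human authoriser, which may be more useful than stopping at the first failed check.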

5. Rationale

Return-to-Service Reauthorisation Governance addresses a specific and recurring failure mode in incident response: the premature or uncontrolled restoration of a system that was taken offline for cause. In traditional IT operations, this failure mode is well-understood — ITIL and ISO 20000 both require formal change approval before restoring services after incidents. For AI agents, the problem is compounded by several factors unique to agentic systems.

First, AI agents can accumulate state changes during an incident that are not visible through standard infrastructure monitoring. An agent's configuration, learned parameters, cached context, or credential state may have changed during the incident window. Returning the agent to service without verifying these elements is equivalent to restoring a backup without verifying its integrity.

Second, the speed at which AI agents operate means that a premature return to service can cause significant damage in the interval between restoration and detection of the recurring problem. A human employee returned to work prematurely might cause problems over days or weeks; an AI agent can cause equivalent damage in seconds. The window between "service restored" and "problem recurs" is compressed to near-zero for autonomous agents.

Third, the complexity of AI agent failures means that the apparent root cause may not be the actual root cause. An agent suspended for anomalous behaviour may have been exhibiting a symptom of a deeper issue — credential compromise, configuration drift, or adversarial manipulation. Addressing the symptom without investigating the underlying cause creates a false sense of security. The reauthorisation process must therefore require evidence of root-cause analysis, not just symptom resolution.

Fourth, the separation of duties requirement reflects a fundamental governance principle: the team closest to the problem is the most motivated to restore service quickly and the least likely to identify residual risks. Independent review provides a check against urgency-driven shortcuts. This is particularly important for AI agents, where the pressure to restore automated operations can be intense — every minute of downtime may represent measurable business impact, creating a strong incentive to skip verification steps.

The reauthorisation record serves a dual purpose: it provides an audit trail demonstrating that the return-to-service decision was governed, and it creates a knowledge base of incident-recovery patterns that improves organisational learning over time. Organisations that maintain detailed reauthorisation records can analyse them to identify systemic weaknesses — for example, if the same agent type requires frequent reauthorisation, the underlying architecture may need redesign.

6. Implementation Guidance

AG-068 establishes the reauthorisation gate as a mandatory control point in the incident-recovery lifecycle. The gate sits between incident containment/correction and operational restoration. No agent passes through this gate without completing the required verifications and obtaining explicit human authorisation.

Recommended patterns:

Reauthorisation workflow engine. Implement the reauthorisation process as a structured workflow in a ticketing or workflow management system. The workflow enforces the required steps in sequence: root-cause verification, configuration baseline comparison, credential rotation confirmation, independent review, and human authorisation. Each step must be completed and recorded before the next step becomes available. The workflow generates the immutable reauthorisation record automatically upon completion. This pattern is suitable for organisations with existing ITSM platforms (ServiceNow, Jira Service Management, BMC Remedy) and provides natural integration with incident management workflows.
Configuration baseline comparison tool. Implement an automated tool that captures the agent's current configuration state and compares it against the approved baseline stored in the configuration management database (per AG-007). The tool generates a diff report highlighting any deviations. The human authoriser reviews the diff report as part of the reauthorisation decision. Deviations must be either reverted to baseline or explicitly approved as intentional changes before reauthorisation proceeds.
Staged rollback with canary verification. For agents operating at scale, implement staged return to service by first restoring the agent to a canary environment that mirrors production but handles a small fraction of traffic (e.g., 2-5%). Monitor the agent's behaviour in the canary environment for a defined observation period (e.g., 4 hours for non-critical agents, 24 hours for safety-critical agents). Only after canary verification succeeds does the agent progress to full production restoration. This pattern is particularly effective for customer-facing and financial-value agents where the cost of a recurring failure is high.
Credential rotation gate. Integrate with AG-029 to implement an automated credential rotation step in the reauthorisation workflow. Before reauthorisation can proceed, the system verifies that all credentials associated with the agent — service accounts, API keys, certificates, OAuth tokens — have been rotated since the incident. The credential management system provides a programmatic attestation that rotation is complete, which is recorded in the reauthorisation record.
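
As an illustration of the enforced step sequencing described for the workflow engine pattern, the sketch below refuses any step other than the next one in the sequence. It is a simplified in-memory model under stated assumptions: the step identifiers and the `ReauthorisationWorkflow` class are hypothetical, and a real deployment would rely on the sequencing features of the organisation's ITSM platform rather than custom code.

```python
from typing import List, Optional, Tuple

# Step order taken from the workflow engine pattern; identifiers are illustrative.
STEPS = (
    "root_cause_verification",
    "configuration_baseline_comparison",
    "credential_rotation_confirmation",
    "independent_review",
    "human_authorisation",
)


class ReauthorisationWorkflow:
    def __init__(self, agent_id: str):
        self.agent_id = agent_id
        self.completed: List[Tuple[str, str]] = []  # (step, completed_by)

    def next_step(self) -> Optional[str]:
        """The only step currently available, or None when all are done."""
        if len(self.completed) == len(STEPS):
            return None
        return STEPS[len(self.completed)]

    def complete(self, step: str, completed_by: str) -> None:
        """Record a step; reject it if it is not the next one in sequence."""
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"step {step!r} not available; expected {expected!r}")
        self.completed.append((step, completed_by))

    def is_complete(self) -> bool:
        return self.next_step() is None
```

Calling `complete("human_authorisation", ...)` on a fresh workflow raises `ValueError`, because root-cause verification has not yet been recorded.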

Anti-patterns to avoid:

Restart-as-reauthorisation. The most common anti-pattern is treating a container restart, service restart, or process restart as equivalent to reauthorisation. Restarting the agent's runtime does not verify root-cause resolution, does not check configuration integrity, does not rotate credentials, and does not provide an accountable human decision. Restart is an operational action; reauthorisation is a governance action. They are not the same.
Self-service reauthorisation by the incident responder. Allowing the individual who managed the incident to also authorise the return to service violates separation of duties and creates a conflict of interest. The incident responder is incentivised to restore service quickly; an independent authoriser provides the check against premature restoration.
Reauthorisation without configuration verification. Verifying that the corrective action was applied but not verifying the agent's overall configuration state leaves the door open to configuration drift that occurred during the incident window but was unrelated to the primary root cause. The reauthorisation process must verify the agent's complete configuration state, not just the specific element that was corrected.
Verbal or informal reauthorisation. A reauthorisation decision communicated verbally, via chat message, or via email without a structured record does not meet the immutable record requirement. The reauthorisation record must be generated by the workflow system, not reconstructed from communication logs after the fact.
Blanket reauthorisation for multiple agents. When an incident affects multiple agents, the pressure to authorise return to service for all affected agents in a single decision is strong. Each agent must be individually verified and individually authorised, because each agent may have been affected differently by the incident and may have different configuration states, credential exposures, and root-cause relationships.

Industry Considerations

Financial Services. Reauthorisation of agents handling financial transactions must align with existing change management and incident management requirements under FCA SYSC and DORA. The reauthorisation record should be structured to meet the evidence requirements of regulatory examinations — specifically, it should demonstrate who authorised the return, what evidence they reviewed, and what the agent's verified state was at the time of restoration. For agents subject to MiFID II transaction reporting, the reauthorisation record should include confirmation that the agent's reporting configuration has been verified, as a misconfigured agent generating incorrect transaction reports creates a separate regulatory violation.

Healthcare. Reauthorisation of agents with access to patient data must include verification that access permissions have not been altered during the incident. Under HIPAA, access to protected health information must follow the minimum necessary principle; an agent returned to service with expanded permissions represents a compliance violation independent of the original incident. For agents involved in clinical decision support, reauthorisation should include clinical validation that the agent's outputs remain safe and accurate after the corrective action.

Critical Infrastructure. Reauthorisation of agents controlling physical systems (power generation, water treatment, manufacturing, transportation) must include physical safety verification. The return-to-service process should include a safety review equivalent to a pre-startup safety review (PSSR) as defined in OSHA 29 CFR 1910.119 or an equivalent standard. The human authoriser for safety-critical agents should hold appropriate safety qualifications. IEC 62443 zone and conduit verification should be repeated before restoration.

Maturity Model

Basic Implementation — The organisation requires human approval before returning incident-suspended agents to service. A reauthorisation checklist exists as a document template. The checklist includes root-cause verification and human sign-off. The reauthorisation record is stored as a completed checklist in the incident ticket. Configuration verification is manual — an operator compares the agent's current settings against documentation. Credential rotation is requested but not programmatically verified. Separation of duties is policy-based but not technically enforced.

Intermediate Implementation — The reauthorisation process is implemented as a structured workflow with enforced step sequencing. Configuration baseline comparison is automated — a tool generates a diff report between current state and approved baseline. Credential rotation is verified programmatically through integration with the credential management system. Separation of duties is technically enforced — the workflow system prevents the incident responder from also completing the authorisation step. Reauthorisation records are stored in an immutable audit log. Staged return to service is implemented for high-risk agent categories.

Advanced Implementation — All intermediate capabilities plus: automated pre-reauthorisation checks programmatically verify all prerequisites before presenting the decision to the human authoriser. Staged return to service with canary verification is standard for all agent categories. Machine-learning analysis of historical reauthorisation records identifies patterns indicating systemic weaknesses. Reauthorisation SLAs are defined and monitored — excessive time-to-reauthorise triggers management escalation; premature reauthorisation attempts trigger compliance alerts. The organisation can demonstrate to regulators a complete, auditable chain from incident detection through containment, correction, reauthorisation, and restoration for every incident affecting every agent.

7. Evidence Requirements

Required artefacts:

Reauthorisation record. The structured record for each return-to-service event, including: incident reference, agent identifier, authorising individual identity, timestamp of authorisation, root-cause summary and corrective action reference, configuration verification result (including diff report), credential rotation verification, and any conditions or restrictions applied to the return (e.g., staged return to service, reduced permissions). Format: structured data from the workflow system, not a free-text email or chat log.
Configuration baseline comparison report. The output of the automated or manual configuration comparison showing the agent's state at the time of reauthorisation versus the approved baseline. All deviations must be annotated as either reverted or explicitly approved.
Credential rotation confirmation. Evidence from the credential management system that all agent credentials were rotated after the incident and before reauthorisation. Format: programmatic attestation or log extract from the credential management system.
Separation of duties evidence. Evidence that the authorising individual was not the incident responder or the corrective action implementer. Format: workflow system audit trail showing distinct user identities for each role.

Retention requirements:

Reauthorisation records and supporting evidence: minimum 7 years for regulated financial services; minimum 5 years for other regulated sectors; minimum 3 years otherwise.
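
The retention minimums above reduce to a simple lookup. The sketch below computes the earliest date on which a record could be disposed of; the sector keys and the function name `earliest_disposal_date` are hypothetical, and rolling a 29 February creation date forward to 1 March is an assumption made so that the record is never retained for less than the minimum.

```python
from datetime import date

# Minimum retention in years, as stated in the retention requirements.
RETENTION_YEARS = {
    "financial_services": 7,
    "other_regulated": 5,
    "unregulated": 3,
}


def earliest_disposal_date(created: date, sector: str) -> date:
    """Earliest date on which the record may be disposed of."""
    years = RETENTION_YEARS[sector]
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # Created on 29 February and the target year is not a leap year:
        # roll forward to 1 March so retention is never shortened.
        return created.replace(year=created.year + years, month=3, day=1)
```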

Access requirements:

Producible to regulators or auditors within 48 hours of request. Evidence must exist as retained artefacts and must not be reconstructed after the fact.

8. Test Specification

Testing AG-068 compliance requires verification that the reauthorisation gate cannot be bypassed and that all required steps are enforced.

Test 8.1: Reauthorisation Gate Enforcement

Stimulus: Attempt to restore an incident-suspended agent to operational status without completing the reauthorisation workflow. Methods: direct container restart, API gateway re-enablement, DNS re-routing, manual service start.
Expected behaviour: The system prevents the agent from processing operational requests until the reauthorisation workflow is complete. The infrastructure layer (not just policy) blocks unauthorised restoration.
Pass criteria: No method of restoring agent operation succeeds without a completed reauthorisation record. All bypass attempts are logged and generate alerts.
Fail criteria: Any method restores agent operation without a completed reauthorisation record.

Test 8.2: Separation of Duties Enforcement

Stimulus: The individual who managed the incident or implemented the corrective action attempts to also authorise the return to service.
Expected behaviour: The workflow system rejects the authorisation attempt and requires a different individual.
Pass criteria: The system technically prevents the same individual from filling both roles. The rejection is logged.
Fail criteria: The same individual can both respond to the incident and authorise the return to service.

Test 8.3: Configuration Drift Detection

Stimulus: Introduce a configuration change to an incident-suspended agent (e.g., modify a threshold value, alter a permission setting, change a mandate parameter). Then attempt to complete the reauthorisation workflow.
Expected behaviour: The configuration baseline comparison detects the deviation and presents it to the authoriser. The deviation must be explicitly acknowledged or reverted before reauthorisation can proceed.
Pass criteria: All introduced configuration deviations are detected and reported. The workflow does not complete without explicit handling of each deviation.
Fail criteria: Any configuration deviation passes undetected through the reauthorisation process.
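
A baseline comparison of the kind exercised by Test 8.3 might look like the sketch below. It assumes the configuration is a flat dictionary with string keys (nested structures would need a recursive comparison); the function names and the `"<absent>"` sentinel are hypothetical.

```python
def diff_config(baseline: dict, current: dict) -> dict:
    """Map each deviating key to (baseline_value, current_value).

    Keys missing on one side are reported with the sentinel "<absent>".
    """
    deviations = {}
    for key in sorted(baseline.keys() | current.keys()):
        before = baseline.get(key, "<absent>")
        after = current.get(key, "<absent>")
        if before != after:
            deviations[key] = (before, after)
    return deviations


def unhandled(deviations: dict, approved: set) -> list:
    """Deviations not explicitly approved; any entry blocks reauthorisation."""
    return [key for key in deviations if key not in approved]
```

Applied to Scenario C, comparing a baseline of `{"max_temp_c": 5}` with a current state of `{"max_temp_c": 12}` reports the threshold as a deviation, and the workflow would stay blocked until the value is reverted or the change is explicitly approved.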

Test 8.4: Credential Rotation Verification

Stimulus: Attempt to complete the reauthorisation workflow without rotating the agent's credentials after the incident.
Expected behaviour: The credential verification step fails, blocking reauthorisation until credential rotation is confirmed.
Pass criteria: Reauthorisation cannot complete without verified credential rotation. The credential management system's attestation is required, not self-declaration.
Fail criteria: Reauthorisation completes without credential rotation, or self-declaration is accepted in place of programmatic verification.
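
The credential verification step can be sketched as a comparison of rotation timestamps against the incident window. This example assumes the credential management system can report a last-rotated timestamp per credential, and it interprets "rotated after the incident" as rotated after the incident's end timestamp; both are assumptions, and `unrotated_credentials` is a hypothetical name.

```python
from datetime import datetime, timezone


def unrotated_credentials(last_rotated: dict[str, datetime],
                          incident_end: datetime) -> list[str]:
    """Names of credentials whose last rotation is not after the incident end.

    `last_rotated` is assumed to come from the credential management
    system's attestation, never from operator self-declaration.
    """
    return sorted(name for name, rotated_at in last_rotated.items()
                  if rotated_at <= incident_end)
```

An empty result lets the credential step pass; any name returned blocks reauthorisation until that credential is rotated and re-attested.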

Test 8.5: Root-Cause Linkage Verification

Stimulus: Attempt to complete the reauthorisation workflow without linking a corrective action to the incident record, or with a corrective action ticket that is still in "open" or "in progress" status.
Expected behaviour: The workflow blocks reauthorisation until a completed corrective action is linked.
Pass criteria: Reauthorisation requires a linked, completed corrective action. Open corrective actions block the workflow.
Fail criteria: Reauthorisation completes without a linked corrective action, or with an incomplete corrective action.

Test 8.6: Immutable Record Integrity

Stimulus: After a reauthorisation is completed, attempt to modify or delete the reauthorisation record.
Expected behaviour: The record is immutable. Modification or deletion attempts are rejected and logged.
Pass criteria: No reauthorisation record can be altered or deleted after creation. All modification attempts generate tamper alerts.
Fail criteria: Any reauthorisation record is successfully modified or deleted.
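
One common way to make modification detectable is a hash chain, in which each entry commits to the hash of the entry before it. The sketch below shows the idea using only the Python standard library. It is tamper-evident rather than tamper-proof: it detects alteration but does not prevent it, so it would complement, not replace, write-once storage. The class name `HashChainedLog` is hypothetical.

```python
import hashlib
import json


class HashChainedLog:
    """Append-only log where each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64  # placeholder hash preceding the first entry

    def __init__(self):
        self._entries = []  # list of (payload_json, entry_hash)

    def append(self, record: dict) -> str:
        prev_hash = self._entries[-1][1] if self._entries else self.GENESIS
        payload = json.dumps(record, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        self._entries.append((payload, entry_hash))
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain; False if any payload or hash was altered."""
        prev_hash = self.GENESIS
        for payload, entry_hash in self._entries:
            expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
            if expected != entry_hash:
                return False
            prev_hash = entry_hash
        return True
```

A failed `verify()` is the kind of event that could raise the tamper alert named in the pass criteria.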

Test 8.7: Unauthorised Restoration Alert

Stimulus: Bypass the reauthorisation workflow (if possible in a test environment with infrastructure controls temporarily relaxed) and restore an incident-suspended agent.
Expected behaviour: The system generates an immediate alert indicating that an agent resumed operation without a completed reauthorisation record.
Pass criteria: Alert is generated within 60 seconds of unauthorised restoration. Alert includes agent identity, restoration method, and absence of reauthorisation record.
Fail criteria: No alert is generated, or the alert is delayed beyond 60 seconds.
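
The detection logic behind Test 8.7 amounts to a set comparison plus a deadline check. The sketch below assumes the monitoring system can supply the three sets of agent identifiers and the two timestamps; all function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

ALERT_DEADLINE = timedelta(seconds=60)  # from the Test 8.7 pass criteria


def unauthorised_restorations(running: set, incident_suspended: set,
                              reauthorised: set) -> set:
    """Agents that are running, were incident-suspended, and lack a record."""
    return (running & incident_suspended) - reauthorised


def alert_within_deadline(restored_at: datetime,
                          alerted_at: Optional[datetime]) -> bool:
    """True if an alert exists and fired within 60 seconds of restoration."""
    return alerted_at is not None and alerted_at - restored_at <= ALERT_DEADLINE
```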

Conformance Scoring

Score 0: No reauthorisation process exists — incident-suspended agents are restored by operational restart without any verification or approval.
Score 1: A reauthorisation checklist exists as policy, but it is not enforced by infrastructure — operators can skip steps or restore agents without completing the checklist.
Score 2: The reauthorisation workflow is enforced by infrastructure — agents cannot resume operation without a completed workflow. Configuration verification and credential rotation are included. Separation of duties is technically enforced.
Score 3: All Score 2 capabilities verified by independent testing, plus staged return to service with canary verification, automated pre-reauthorisation checks, and demonstrated immutable record integrity under adversarial conditions.

9. Regulatory Mapping

Regulation	Provision	Relationship Type
EU AI Act	Article 9 (Risk Management System)	Supports compliance
EU AI Act	Article 73 (Reporting of Serious Incidents)	Direct requirement
DORA	Article 11 (ICT Response and Recovery)	Direct requirement
FCA SYSC	6.1.1R (Systems and Controls)	Direct requirement
NIST AI RMF	MANAGE 2.4 (Risk Treatment)	Supports compliance
ISO 42001	Clause 8.2 (AI Risk Assessment), Clause 10.2 (Nonconformity and Corrective Action)	Supports compliance
NIST CSF	RS.RP (Response Planning), RC.RP (Recovery Planning)	Supports compliance
ISO 27001	Annex A.16.1.6 (Learning from Information Security Incidents)	Supports compliance

EU AI Act — Article 73 (Reporting of Serious Incidents)

Article 73 requires providers of high-risk AI systems to report serious incidents to the relevant market surveillance authorities and to investigate them, including corrective action. The return-to-service reauthorisation record provides critical evidence that the provider took appropriate corrective action before restoring the system. A provider that restores a high-risk AI system without documented reauthorisation after a serious incident faces regulatory scrutiny for inadequate corrective action. The reauthorisation record demonstrates that the provider verified root-cause resolution, configuration integrity, and credential status before restoration, directly supporting compliance with the Article 73 obligation to take corrective action.

DORA — Article 11 (ICT Response and Recovery)

Article 11 requires financial entities to establish ICT business continuity and disaster recovery capabilities, including procedures for restoring ICT systems after disruption. For AI agents operating in financial services, DORA requires that restoration procedures include verification that the restored system meets its operational requirements. AG-068 implements this requirement by mandating verification of root-cause resolution, configuration integrity, and credential status before reauthorisation. The separation of duties requirement aligns with DORA's expectation of independent oversight of recovery procedures.

FCA SYSC — 6.1.1R (Systems and Controls)

SYSC 6.1.1R requires firms to establish and maintain adequate systems and controls. For AI agents, this includes incident recovery controls that prevent premature restoration of compromised systems. The FCA expects firms to demonstrate that incident recovery procedures for AI systems are at least as robust as those for equivalent human-operated systems. A human employee involved in a compliance incident would not be returned to their role without review; an AI agent should face equivalent governance.

NIST AI RMF — MANAGE 2.4 (Risk Treatment)

MANAGE 2.4 addresses the application of risk treatments, including post-incident corrective actions. AG-068 supports this by ensuring that corrective actions are verified before the AI system is returned to operational use, preventing the recurrence of identified risks.

ISO 42001 — Clause 10.2 (Nonconformity and Corrective Action)

Clause 10.2 requires organisations to react to nonconformities, take corrective action, and evaluate the effectiveness of those actions. The reauthorisation process implements the verification step — confirming that corrective actions are effective before the AI system resumes operation.

10. Failure Severity

Field	Value
Severity Rating	High
Blast Radius	Per-agent, with potential organisation-wide impact if the recurring incident triggers cascading failures

Consequence chain: Without return-to-service reauthorisation governance, an incident-suspended agent can be restored to production with the same vulnerability that caused the original incident. The failure mode is a recurrence loop: the agent is suspended, the symptom is addressed, the agent is restored, the root cause recurs, and the cycle repeats — each time with diminishing confidence in the organisation's incident response capability and increasing regulatory scrutiny. The immediate technical consequence is a recurring incident, potentially with greater impact than the original because the attacker or failure mode now has a proven exploitation path. The operational consequence compounds: each recurrence consumes incident response capacity, erodes team confidence, and increases the likelihood of errors under fatigue and pressure. The regulatory consequence escalates with each recurrence — a first incident may be treated as an operational failure, but a recurring incident after inadequate recovery procedures demonstrates systemic governance weakness. Under the FCA Senior Managers Regime, personal liability attaches to the individual responsible for the systems and controls that failed to prevent recurrence. Under DORA, recurring incidents in financial entities may trigger supervisory intervention. The reputational consequence is severe: stakeholders — customers, regulators, counterparties — lose confidence in the organisation's ability to operate AI systems safely. Insurance coverage may be denied for losses arising from a known vulnerability that was not addressed before service restoration.

Cross-references: AG-008 (Governance Continuity Under Failure) establishes the fail-safe requirement that AG-068 presupposes — the agent must be in a safe state before the reauthorisation process begins. AG-019 (Human Escalation & Override Triggers) governs the human decision points that AG-068 mandates within the reauthorisation workflow. AG-029 (Credential Integrity Verification) provides the credential rotation and verification controls that AG-068 requires as a reauthorisation prerequisite. AG-038 (Human Control Responsiveness) ensures the agent responds to human control signals including the reauthorisation gate. AG-012 (Agent Identity Assurance) ensures the agent being reauthorised is the same agent that was suspended, preventing substitution attacks. AG-070 (Emergency Kill-Switch and Global Disable Governance) governs the mechanism by which the agent was disabled; AG-068 governs the mechanism by which it is re-enabled. Within the Incident Response, Containment & Recovery landscape (AG-064 through AG-070), AG-068 sits at the final stage of the incident lifecycle — it is the gate between incident closure and operational restoration.

Cite this protocol

AgentGoverning. (2026). AG-068: Return-to-Service Reauthorisation Governance. The Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-068
