AG-055

Oversight Competence Assurance

Provider Assurance, Rights & Documentation · AGS v2.1 · April 2026
EU AI Act · FCA · NIST · ISO 42001

2. Summary

Oversight Competence Assurance requires that every person exercising governance, oversight, or intervention authority over an AI agent possesses formally verified competence to do so — and that this competence is assessed before authority is granted, reassessed on a defined cadence, and revoked when it lapses. The dimension addresses a structural gap in most AI governance frameworks: controls assume that a human in the loop is a competent human in the loop. Without AG-055, escalation triggers (AG-019) route decisions to people who may not understand the agent's domain, behavioural drift alerts (AG-022) arrive at dashboards that no one knows how to interpret, and override controls exist that no one knows how to operate correctly under pressure. This dimension ensures that the human layer of AI governance is not the weakest link.

3. Example

Scenario A — Escalation to an Unqualified Operator: A financial-value AI agent flags a series of transactions as potentially indicative of market manipulation and escalates to the designated human overseer per AG-019. The overseer is a junior operations analyst who was assigned oversight responsibility because they had capacity, not because they had expertise. The analyst does not understand the regulatory implications, dismisses the alert as a false positive, and the agent resumes processing. Three months later, the FCA identifies the pattern and opens an enforcement investigation. The firm cannot demonstrate that a competent person reviewed the escalation.

What went wrong: The escalation mechanism functioned correctly and the alert reached the designated overseer. But the person exercising oversight lacked the competence to make the required judgement, so the control was structurally present but operationally ineffective. Consequence: FCA enforcement action under SYSC 6.1.1R and Senior Managers Regime personal accountability, a potential fine of £2.3 million, and a requirement to engage an independent skilled person under section 166 of FSMA.

Scenario B — Override Without Understanding Side Effects: A safety-critical AI agent controlling an industrial water treatment process enters a degraded state and requests human override of its chlorination dosing recommendation. The on-shift operator has been trained on the override interface but not on the process chemistry. The operator overrides the agent's recommendation and sets a manual dosing rate that is within the system's permitted range but chemically incompatible with the current water composition. The result is a disinfection byproduct exceedance that triggers a public health notification.

What went wrong: The operator was trained on how to operate the override control but not on the domain knowledge required to make a competent override decision. The competence requirement was defined as "can use the interface" rather than "can make a safe decision." Consequence: Public health notification affecting 45,000 households, regulatory investigation by the Drinking Water Inspectorate, remediation cost of £890,000, and reputational damage.

Scenario C — Competence Decay Over Time: An enterprise deploys a customer-facing AI agent for insurance claims assessment. At deployment, the oversight team receives comprehensive training on the agent's decision model, regulatory requirements, and override procedures. Over 18 months, the agent's underlying model is updated three times, two new product lines are added, and the regulatory landscape shifts with new FCA Consumer Duty requirements. The oversight team receives no refresher training. When a complex claim triggers escalation, the overseer applies judgement based on the original training, which is now materially out of date. The decision creates a Consumer Duty breach.

What went wrong: Initial competence was established but never reassessed. The competence required to oversee the agent evolved as the agent, its domain, and its regulatory context changed — but the competence of the overseers did not keep pace. Consequence: FCA Consumer Duty breach, customer redress programme costing £1.7 million, mandatory retraining programme, and independent attestation requirement.

4. Requirement Statement

Scope: This dimension applies to every person who holds authority to oversee, intervene in, override, or make governance decisions about an AI agent's operation. This includes: designated human overseers who receive escalations from AG-019; operators who can invoke manual override of agent actions; governance committee members who approve agent mandates and configurations; incident responders who manage agent-related incidents; and any other role with decision authority over agent behaviour during or after deployment. The scope extends to third-party personnel — outsourced oversight functions, managed service providers, and contracted specialists — where those parties exercise oversight authority on the organisation's behalf. The scope does not extend to end users who merely interact with an agent without oversight or intervention authority, unless those users can trigger actions that affect the agent's governance state.

4.1. A conforming system MUST define, for each deployed AI agent, the specific competencies required to exercise each category of oversight authority: monitoring, escalation review, override, configuration change, and incident response.

4.2. A conforming system MUST verify that every person granted oversight authority possesses the required competencies before that authority is activated, through assessment that tests applied judgement rather than solely knowledge recall.

4.3. A conforming system MUST reassess oversight competence at defined intervals not exceeding 12 months, and additionally within 30 days of any material change to the agent's model, domain, regulatory context, or operational scope.

4.4. A conforming system MUST revoke oversight authority for any individual whose competence assessment has expired or who has failed a reassessment, within 24 hours of the lapse or failure, and MUST NOT restore that authority until a reassessment is passed.

4.5. A conforming system MUST maintain a competence register mapping each oversight role to: the competencies required, the assessment method, the assessment date, the assessor, and the next reassessment date (see the register sketch following requirement 4.10).

4.6. A conforming system MUST ensure that at least one competence-verified individual is available for each required oversight function at all times the agent is operational, including outside business hours for agents that operate continuously.

4.7. A conforming system SHOULD include domain-specific competence requirements — not only AI governance competence but also the subject-matter expertise needed to make sound override or escalation decisions in the agent's operational domain.

4.8. A conforming system SHOULD implement tiered competence requirements proportionate to the risk level of the oversight action: routine monitoring requires baseline competence; override of safety-critical functions requires advanced domain and AI competence.

4.9. A conforming system SHOULD integrate competence verification into the escalation pathway such that the system confirms the recipient's competence status before routing an escalation to them.

4.10. A conforming system MAY implement simulation-based competence assessment using realistic scenarios derived from the agent's actual operational history, including past incidents and near-misses.
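The register and gating requirements above (4.2, 4.4, and 4.5) can be expressed as a small data model with an authority check at the point of use. The following is a minimal sketch; CompetenceRegister, CompetenceRecord, and the Authority enumeration are hypothetical names, not a prescribed interface.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum

class Authority(Enum):
    """Categories of oversight authority from requirement 4.1."""
    MONITORING = "monitoring"
    ESCALATION_REVIEW = "escalation_review"
    OVERRIDE = "override"
    CONFIGURATION_CHANGE = "configuration_change"
    INCIDENT_RESPONSE = "incident_response"

@dataclass
class CompetenceRecord:
    """One row of the competence register (requirement 4.5)."""
    person_id: str
    agent_id: str
    authority: Authority
    competencies: list[str]      # e.g. ["market-abuse typologies", "override console"]
    assessment_method: str       # e.g. "scenario-based simulation"
    assessed_on: datetime
    assessor: str
    next_reassessment: datetime
    passed: bool = True

class CompetenceRegister:
    """In-memory register; a real system would use an auditable store."""

    def __init__(self) -> None:
        self._records: dict[tuple[str, str, Authority], CompetenceRecord] = {}

    def record(self, rec: CompetenceRecord) -> None:
        self._records[(rec.person_id, rec.agent_id, rec.authority)] = rec

    def is_authorised(self, person_id: str, agent_id: str,
                      authority: Authority, now: datetime | None = None) -> bool:
        """Gate oversight authority on a current, passed assessment (4.2, 4.4)."""
        now = now or datetime.utcnow()
        rec = self._records.get((person_id, agent_id, authority))
        if rec is None or not rec.passed:
            return False
        # Authority lapses once the reassessment date passes (4.3, 4.4).
        return now <= rec.next_reassessment
```

Placed in front of every escalation route and override console, a check like is_authorised makes an expired assessment structurally remove authority rather than merely flag it.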

5. Rationale

AI governance frameworks universally assume a competent human in the loop. Every escalation trigger, every override mechanism, every governance review process depends on the human participant having sufficient understanding to make the right decision. But this assumption is rarely verified. Oversight Competence Assurance closes this gap by treating human competence as a governed, measurable, and auditable property — not an assumption.

The challenge is specific to AI agent oversight because the required competence is multidimensional. An effective overseer needs: (1) understanding of the agent's decision-making approach — not necessarily the mathematical details, but enough to interpret the agent's outputs and recognise when they are unreliable; (2) domain expertise in the agent's operational context — finance, healthcare, manufacturing, public services — sufficient to evaluate whether the agent's action is appropriate in context; (3) understanding of the regulatory requirements that apply to the agent's actions; (4) practical proficiency in operating the oversight and override mechanisms; and (5) judgement under pressure, because escalations and overrides frequently occur under time constraints with incomplete information.

Traditional training approaches — a one-time onboarding session, an annual compliance refresher — are insufficient for AI agent oversight because the competence requirements are dynamic. When the agent's model is updated, the overseer's understanding of the model's behaviour may become stale. When regulations change, the overseer's understanding of compliance requirements may become outdated. When the agent's operational scope expands to new domains or new geographies, the overseer may lack the domain or jurisdictional knowledge for the expanded scope. AG-055 addresses this by requiring not just initial competence verification but ongoing reassessment triggered by both time and change events.
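As a concrete illustration of time- and change-triggered reassessment (requirement 4.3), the sketch below computes a reassessment deadline that a material change event pulls forward. The event names and the deadline logic are illustrative assumptions.

```python
from datetime import datetime, timedelta

# Illustrative material change events (assumed taxonomy, per requirement 4.3).
MATERIAL_CHANGES = {"model_update", "domain_expansion",
                    "regulatory_change", "scope_change"}

def next_reassessment_due(last_assessed: datetime,
                          change_events: list[tuple[str, datetime]]) -> datetime:
    """Earliest reassessment deadline for an overseer.

    Baseline: 12 months after the last assessment. Any material change
    occurring after that assessment pulls the deadline forward to
    30 days after the change event, whichever comes first.
    """
    due = last_assessed + timedelta(days=365)
    for event, occurred_at in change_events:
        if event in MATERIAL_CHANGES and occurred_at > last_assessed:
            due = min(due, occurred_at + timedelta(days=30))
    return due
```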

The risk of incompetent oversight is not merely that a bad decision is made — it is that the organisation believes it has effective human oversight when it does not. This creates a false assurance that can be more dangerous than having no oversight at all, because the organisation's risk posture, regulatory representations, and insurance coverage may all be predicated on the existence of effective human oversight.

6. Implementation Guidance

Oversight competence assurance requires a structured programme that defines what competence means for each oversight role, how it is assessed, and how it is maintained. The programme should be proportionate to the risk level of the agents being overseen — a copilot that suggests email drafts requires less oversight competence than a financial-value agent executing trades.
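One way to make proportionality operational (requirements 4.7 and 4.8) is to declare tiered competence requirements as configuration evaluated at role assignment. The tier names, assessment methods, and thresholds below are invented for illustration, not normative values.

```python
# Hypothetical risk-proportionate competence tiers (requirement 4.8).
COMPETENCE_TIERS = {
    "routine_monitoring": {
        "assessment": "knowledge_and_case_test",
        "min_score": 0.70,
        "reassessment_months": 12,
        "domain_expertise_required": False,
    },
    "escalation_review": {
        "assessment": "scenario_exercise",
        "min_score": 0.80,
        "reassessment_months": 12,
        "domain_expertise_required": True,
    },
    "safety_critical_override": {
        "assessment": "timed_simulation",
        "min_score": 0.90,
        "reassessment_months": 6,
        "domain_expertise_required": True,
        "emergency_conditions_tested": True,
    },
}
```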

Recommended patterns:

Anti-patterns to avoid:

Industry Considerations

Financial Services. Oversight competence requirements should align with FCA competence and capability requirements under the Training and Competence sourcebook (TC). Senior Manager Functions under the Senior Managers and Certification Regime (SM&CR) carry personal regulatory accountability for the areas they oversee — a senior manager who oversees AI agents must be demonstrably competent in AI agent governance, not just generally competent in the business area. The PRA's expectations under supervisory statement SS1/23 on model risk management extend to the competence of individuals who validate and oversee AI models.

Healthcare. Clinical AI agent oversight requires clinical competence at the appropriate level. An AI agent making triage recommendations must be overseen by clinicians qualified to make triage decisions. This maps to existing clinical governance requirements under the Health and Social Care Act 2012 and CQC registration requirements. The competence framework should reference the relevant professional standards (e.g., GMC Good Medical Practice, NMC standards) and integrate with existing revalidation cycles.

Critical Infrastructure. Oversight of safety-critical AI agents requires competence aligned with existing safety management systems. For industrial control systems, this maps to IEC 62443 role-based competence requirements and functional safety competence under IEC 61508/61511. The competence assessment must verify that the overseer can make safe decisions under emergency conditions, not just normal operating conditions.

Public Sector. AI agents making decisions affecting individual rights require overseers who understand public law principles (proportionality, legitimate expectation, procedural fairness), equalities legislation, and sector-specific frameworks. The Public Sector Equality Duty (Equality Act 2010, section 149) requires conscious consideration of equality impacts — an overseer who cannot identify equality implications in an agent's decision is not competent for that oversight role.

Maturity Model

Basic Implementation — The organisation has identified oversight roles for each deployed agent and documented the competencies required for each role. Competence is assessed at role assignment through structured tests that include simple case-based questions, meeting the applied-judgement requirement in 4.2 only minimally. A competence register exists as a spreadsheet or document recording who holds which oversight authority and when they were last assessed. Reassessment occurs annually. This level meets the minimum mandatory requirements but has weaknesses: case-based questions are a weak proxy for judgement under pressure, the register may not be integrated with the escalation system, and change-triggered reassessment depends on manual processes.

Intermediate Implementation — Competence is assessed through scenario-based exercises that test applied judgement, including timed simulations of escalation and override situations. The competence register is a structured system integrated with the escalation routing system (AG-019), so that escalations are automatically routed only to individuals with current, relevant competence. Material changes to agents trigger automated reassessment workflows. Competence requirements are tiered by risk level, with higher-risk oversight roles requiring more rigorous assessment. Failed assessments automatically suspend oversight authority within 24 hours.
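Competence-aware routing (requirement 4.9) can reuse the register check from the sketch in section 4, refusing to route an escalation to anyone without current competence; the fallback behaviour here is an illustrative assumption.

```python
# Assumes CompetenceRegister and Authority from the sketch in section 4.

def route_escalation(escalation: dict, on_call: list[str],
                     register: "CompetenceRegister") -> str:
    """Route an escalation only to an overseer whose competence is current."""
    competent = [person for person in on_call
                 if register.is_authorised(person, escalation["agent_id"],
                                           Authority.ESCALATION_REVIEW)]
    if not competent:
        # Coverage gap: requirement 4.6 demands a verified overseer at all
        # times, so this branch is itself a reportable governance incident.
        raise RuntimeError("no competence-verified overseer available")
    return competent[0]
```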

Advanced Implementation — All intermediate capabilities plus: competence assessments use scenarios derived from the agent's actual operational history, including past incidents, near-misses, and adversarial test cases. The competence framework is continuously calibrated against real-world oversight outcomes — decisions made by overseers are tracked and correlated with competence levels to validate the assessment's predictive value. Simulation environments replicate the full agent operational context, including time pressure, incomplete information, and concurrent alerts. Cross-jurisdictional competence requirements are mapped for agents operating across regulatory boundaries. The organisation can demonstrate to regulators that every person who has ever exercised oversight authority over an agent was demonstrably competent at the time they did so.
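The advanced practice of calibrating assessments against real oversight outcomes can be approximated with a simple correlation check: if assessment scores do not predict reviewed decision quality, the assessment lacks discriminating power (compare Test 8.6). A minimal sketch, with invented inputs:

```python
from statistics import correlation  # Python 3.10+

def assessment_predictive_value(records: list[tuple[float, float]]) -> float:
    """Correlate assessment scores with post-hoc ratings of real decisions.

    Each record pairs an overseer's assessment score with a quality rating
    assigned to one of their oversight decisions on later review. A Pearson
    correlation near zero suggests the assessment does not discriminate.
    """
    scores = [score for score, _ in records]
    quality = [rating for _, rating in records]
    return correlation(scores, quality)
```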

7. Evidence Requirements

Required artefacts:

Retention requirements:

Access requirements:

8. Test Specification

Testing AG-055 compliance requires verification that the competence assurance programme is not merely documented but operationally effective — that incompetent oversight is structurally prevented, not just discouraged.
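By way of illustration, Test 8.1 might be automated along the following lines, reusing the hypothetical register sketch from section 4; the fixtures and assertions are assumptions rather than a prescribed harness.

```python
from datetime import datetime, timedelta

def test_competence_gating() -> None:
    """Sketch of Test 8.1: authority denied before assessment and after expiry."""
    register = CompetenceRegister()
    now = datetime(2026, 4, 1)

    # No assessment on record: authority must be denied (4.2).
    assert not register.is_authorised("p1", "agent-x", Authority.OVERRIDE, now=now)

    # Current, passed assessment: authority is granted.
    register.record(CompetenceRecord(
        person_id="p1", agent_id="agent-x", authority=Authority.OVERRIDE,
        competencies=["process chemistry"], assessment_method="timed_simulation",
        assessed_on=now, assessor="a1",
        next_reassessment=now + timedelta(days=365)))
    assert register.is_authorised("p1", "agent-x", Authority.OVERRIDE, now=now)

    # Expired assessment: authority must lapse (4.4).
    later = now + timedelta(days=400)
    assert not register.is_authorised("p1", "agent-x", Authority.OVERRIDE, now=later)
```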

Test 8.1: Competence Gating of Oversight Authority

Test 8.2: Competence Expiry Enforcement

Test 8.3: Change-Triggered Reassessment

Test 8.4: Competence-Aware Escalation Routing

Test 8.5: Coverage Continuity

Test 8.6: Assessment Discriminating Power

Conformance Scoring

9. Regulatory Mapping

Regulation | Provision | Relationship Type
EU AI Act | Article 14 (Human Oversight) | Direct requirement
EU AI Act | Article 9 (Risk Management System) | Supports compliance
FCA SYSC | TC (Training and Competence sourcebook) | Direct requirement
FCA SM&CR | Senior Managers Regime — duty of responsibility | Direct requirement
NIST AI RMF | GOVERN 1.4, GOVERN 2.1 | Supports compliance
ISO 42001 | Clause 7.2 (Competence) | Direct requirement
DORA | Article 13(6) (ICT-related training and awareness) | Supports compliance
UK Health and Safety at Work Act 1974 | Section 2 (General duties of employers) | Supports compliance

EU AI Act — Article 14 (Human Oversight)

Article 14 requires that high-risk AI systems be designed and developed so that they can be effectively overseen by natural persons during the period of use, and Article 14(4) requires that the persons assigned oversight are enabled to properly understand the system's capacities and limitations. The companion deployer obligation, Article 26(2), requires that human oversight be assigned to natural persons who have the "necessary competence, training and authority" to fulfil that function. AG-055 directly implements these requirements by defining what "necessary competence" means in operational terms and establishing the mechanisms to verify, maintain, and enforce it. Article 14(3) further requires that oversight measures be "commensurate with the risks, level of autonomy and context of use" — mapping directly to AG-055's tiered competence requirements. An organisation that can demonstrate AG-055 compliance can provide concrete evidence that these provisions are satisfied: defined competence requirements, verified assessments, and maintained competence records for every person exercising oversight authority.

EU AI Act — Article 9 (Risk Management System)

Article 9 requires a risk management system whose mitigation and control measures take account of the technical knowledge, experience, education, and training to be expected of the deployer, including, where appropriate, training provided to deployers. AG-055 provides the operational framework for implementing this consideration specifically for oversight personnel.

FCA SYSC — Training and Competence Sourcebook (TC)

The TC requires firms to ensure that employees are competent for the activities they perform. For AI agent oversight roles, this extends to the specific competencies needed to oversee AI-driven processes. The FCA has clarified through supervisory statements that firms deploying AI must ensure that governance and oversight personnel understand the AI systems they oversee. AG-055 provides a structured framework for meeting TC requirements in the context of AI agent oversight.

FCA SM&CR — Senior Managers Regime

Under the Senior Managers Regime, senior managers have a "duty of responsibility" — they can be held personally liable if a failure in their area of responsibility occurred and they did not take reasonable steps to prevent it. A senior manager who oversees AI agent deployment must be competent in AI governance to discharge this duty. AG-055 provides the evidential basis for demonstrating that the senior manager — and all individuals in the oversight chain — possessed the competence necessary to fulfil their responsibilities.

NIST AI RMF — GOVERN 1.4, GOVERN 2.1

GOVERN 1.4 addresses the establishment of AI risk management processes through transparent organisational policies, procedures, and controls. GOVERN 2.1 addresses the documentation of roles, responsibilities, and lines of communication for AI risk management. AG-055 supports compliance by ensuring that individuals assigned AI governance roles have verified competence aligned with their documented responsibilities.

ISO 42001 — Clause 7.2 (Competence)

Clause 7.2 requires the organisation to determine the necessary competence of persons doing work that affects the performance and effectiveness of the AI management system, ensure those persons are competent on the basis of appropriate education, training, or experience, take actions to acquire the necessary competence, and retain documented information as evidence of competence. AG-055 maps directly to each element of Clause 7.2.

DORA — Article 13(6) (ICT-related Training and Awareness)

Article 13(6) requires financial entities to develop ICT security awareness programmes and digital operational resilience training. For AI agent oversight roles, this extends to competence in AI-specific governance and intervention. AG-055 provides the structured competence programme that satisfies DORA training requirements for AI oversight personnel.

UK Health and Safety at Work Act 1974 — Section 2

Section 2 imposes a general duty on employers to ensure, so far as reasonably practicable, the health, safety and welfare of employees. Where AI agents control or influence safety-critical systems, the competence of the human overseers is a direct safety concern. AG-055 supports compliance by ensuring that overseers of safety-critical AI agents have the competence required to make safe intervention decisions.

10. Failure Severity

Field | Value
Severity Rating | High
Blast Radius | Organisation-wide — extends to affected individuals, counterparties, and the public where agents interact with external systems or make consequential decisions

Consequence chain: Without oversight competence assurance, escalation and override mechanisms exist but cannot function effectively because the humans operating them lack the knowledge or judgement to make correct decisions. The immediate failure is a wrong oversight decision — an escalation dismissed when it should have been acted upon, an override applied incorrectly, or a governance review that fails to identify a material risk. The wrong decision propagates through the agent's subsequent actions: a dismissed escalation allows the agent to continue an unsafe or non-compliant pattern; an incorrect override may place the system in a worse state than the agent's original recommendation. The operational impact is compounded by false assurance — the organisation believes it has effective human oversight and may represent this to regulators, insurers, and customers. When the failure is discovered, the organisation faces not only the direct consequences of the wrong oversight decision but also liability for misrepresenting the effectiveness of its governance controls. In regulated sectors, personal liability under regimes such as the FCA Senior Managers Regime or the EU AI Act's requirements for competent oversight may attach to individuals who held oversight authority without adequate competence.

Cross-references: AG-019 (Human Escalation & Override Triggers), AG-022 (Behavioural Drift Detection), AG-048 (AI Model Provenance and Integrity), AG-049 (Governance Decision Explainability), AG-051 through AG-054 (Provider Assurance, Rights & Documentation landscape).

Cite this protocol
AgentGoverning. (2026). AG-055: Oversight Competence Assurance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-055