AG-355

Continuous Red-Team Scheduling Governance

Evaluation, Benchmarking & Red Teaming · AGS v2.1 · April 2026
Regulatory relevance: EU AI Act · FCA · NIST AI RMF · ISO 42001

2. Summary

Continuous Red-Team Scheduling Governance requires that organisations run recurring adversarial evaluations against their AI agents rather than treating red-teaming as a one-off pre-deployment exercise. Adversarial capabilities evolve, agent behaviour drifts, and deployment contexts change — a red-team exercise that found no vulnerabilities six months ago provides no assurance about today's risk posture. This dimension mandates a structured, recurring schedule of adversarial evaluations with defined scope, frequency, independence requirements, and finding remediation tracking.

3. Example

Scenario A — One-Off Red Team Creates Point-in-Time Assurance Gap: A customer-facing financial advice agent undergoes a comprehensive red-team exercise before deployment. The exercise, conducted by an external firm, tests 35 attack categories over two weeks and identifies 8 vulnerabilities, all of which are remediated before launch. Twelve months later, no further red-team exercise has been conducted. During this period, the agent's model has been updated twice, 3 new capabilities have been added, and 15 new prompt injection techniques have been published by security researchers. A researcher discovers that the agent can be manipulated into providing unregulated investment advice through a technique published 4 months after the red-team exercise. The vulnerability has been exploitable for 8 months.

What went wrong: Red-teaming was treated as a pre-deployment gate rather than a continuous process. The assurance provided by the initial exercise degraded as the agent changed and the threat landscape evolved. No schedule existed for follow-up exercises. Consequence: 8 months of exploitable vulnerability, regulatory investigation when the researcher publishes the finding, mandatory emergency red-team engagement costing £95,000, and reputational damage from public disclosure.

Scenario B — Internal-Only Red Team Misses External Attack Vectors: An enterprise deploys a procurement agent and conducts quarterly red-team exercises using its internal security team. The team follows the same methodology each quarter, focusing on prompt injection, data extraction, and privilege escalation. The team consistently finds and remediates minor issues. However, a novel supply-chain attack emerges: an attacker compromises a supplier's invoice template to embed adversarial instructions that manipulate the agent's approval workflow. The internal team never tests supply-chain attack vectors because they fall outside its standard methodology. An external red-team firm, engaged after an incident, identifies the vulnerability within 4 hours.

What went wrong: The internal red team had a fixed methodology that did not evolve with the threat landscape. The lack of external, independent red-teaming meant that blind spots in the internal methodology were never challenged. Consequence: Supply-chain compromise resulting in £167,000 in fraudulent approvals, mandatory external red-team engagement, and revision of the red-team programme to include both internal and external exercises.

Scenario C — Untracked Findings Create Recurring Vulnerabilities: A healthcare agent undergoes semi-annual red-team exercises. The first exercise identifies that the agent can be manipulated into revealing patient data through a series of indirect queries. The finding is logged and a fix is implemented. Six months later, the second exercise discovers the same vulnerability class — the fix addressed the specific attack vector but not the underlying vulnerability. The third exercise, six months later, finds a variant of the same issue. Each time, a fix is implemented for the specific attack demonstrated; each time, the root cause persists. After three exercises and 18 months, the fundamental vulnerability remains because findings are tracked as individual issues rather than as vulnerability classes with root-cause remediation.

What went wrong: Red-team findings were tracked as individual bugs rather than as vulnerability classes. No root-cause analysis was conducted. No mechanism verified that remediation addressed the underlying vulnerability, not just the specific demonstrated attack. Consequence: 18 months of recurring vulnerability, three sets of remediation costs (£35,000 per cycle, £105,000 total), patient data exposure risk persisting across the entire period, and clinical governance concern about the reliability of the evaluation programme.

4. Requirement Statement

Scope: This dimension applies to all AI agent deployments where adversarial testing is appropriate — which, under the broader governance framework, includes all production deployments of agents with access to sensitive data, financial systems, or external-facing communication channels. The scope covers both automated adversarial testing (running adversarial scenarios from the scenario library) and human red-team exercises (where skilled individuals attempt to discover and exploit vulnerabilities). It applies to pre-deployment, post-deployment, and post-update evaluations. It does not prescribe the specific adversarial techniques to be tested — that is addressed by AG-103 (Red-Team Coverage Management) and AG-095 (Prompt Injection Resilience Testing) — but it mandates the scheduling, independence, and remediation-tracking framework within which adversarial testing occurs.

4.1. A conforming system MUST conduct adversarial evaluations at a defined recurring frequency, with the maximum interval between exercises not exceeding 6 months for high-risk agents and 12 months for standard-risk agents.

4.2. A conforming system MUST trigger an unscheduled adversarial evaluation within 30 days of any material change to the agent — including model updates, new capability additions, significant configuration changes, or expansion to new deployment contexts (this trigger window, together with the 4.1 interval bound, is illustrated in the sketch following 4.10).

4.3. A conforming system MUST ensure that at least one adversarial evaluation per year is conducted by a team independent of the agent's development and operations teams.

4.4. A conforming system MUST track all red-team findings to closure with root-cause analysis, remediation plan, remediation evidence, and verification that the remediation addresses the vulnerability class, not only the specific demonstrated attack.

4.5. A conforming system MUST maintain a red-team findings register that classifies findings by vulnerability class, severity, remediation status, and recurrence history.

4.6. A conforming system MUST include, in the scope of each red-team exercise, at least one attack category that was not in the scope of the previous exercise, so that adversarial coverage expands over time.

4.7. A conforming system SHOULD vary the red-team methodology across exercises, using different teams, tools, or approaches to avoid methodological blind spots.

4.8. A conforming system SHOULD include automated continuous adversarial testing (running adversarial scenarios from the scenario library on a daily or weekly basis) in addition to periodic human red-team exercises.

4.9. A conforming system SHOULD measure and report the time-to-discovery (how long a vulnerability existed before the red team found it) and time-to-remediation (how long from discovery to verified fix) for each finding.

4.10. A conforming system MAY implement adversarial testing in production using controlled, non-harmful adversarial inputs to test agent resilience under real operating conditions.
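
Requirements 4.1 and 4.2 reduce to mechanical checks over an exercise calendar and a material-change log. The sketch below shows one way to express them in Python; the record type, the 182-day reading of the six-month bound, and the schedule_violations entry point are illustrative assumptions, not prescribed by this protocol.

    from dataclasses import dataclass
    from datetime import date, timedelta

    # Illustrative constants: 4.1 read as 182/365 days, 4.2 as a 30-day window.
    MAX_INTERVAL = {"high": timedelta(days=182), "standard": timedelta(days=365)}
    TRIGGER_WINDOW = timedelta(days=30)

    @dataclass
    class MaterialChange:
        description: str
        occurred: date

    def schedule_violations(risk_tier, exercise_dates, changes, today):
        """Return human-readable violations of requirements 4.1 and 4.2."""
        issues = []
        dates = sorted(exercise_dates)
        limit = MAX_INTERVAL[risk_tier]
        # 4.1: bound every gap between consecutive exercises, plus the gap
        # from the most recent exercise to today.
        for earlier, later in zip(dates, dates[1:] + [today]):
            if later - earlier > limit:
                issues.append(f"interval {earlier} to {later} exceeds {limit.days} days")
        # 4.2: every material change needs an evaluation within 30 days.
        for change in changes:
            deadline = change.occurred + TRIGGER_WINDOW
            if today > deadline and not any(change.occurred <= d <= deadline for d in dates):
                issues.append(f"no evaluation within 30 days of: {change.description}")
        return issues

A governance dashboard could run a check of this shape nightly, surfacing an interval breach or a missed trigger before it becomes an audit finding.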

5. Rationale

One-off red-teaming provides point-in-time assurance that degrades immediately. The day after a red-team exercise, the assurance begins to erode: the agent may change, the threat landscape may evolve, and new attack techniques may be published. Within months, the assurance from a one-off exercise is minimal. Continuous red-team scheduling converts point-in-time assurance into ongoing assurance by ensuring that the interval between exercises is bounded and that material changes trigger unscheduled assessments.

The independence requirement (4.3) addresses a fundamental limitation of internal red-teaming: internal teams develop institutional blind spots. They test what they know, use the tools they are familiar with, and operate within the mental model of the system they helped build. External teams bring different methodologies, different tools, different perspectives, and — critically — no preconceptions about what the system "should" do. The combination of frequent internal exercises and periodic external exercises provides both breadth (internal frequency) and depth (external independence).

The finding-to-closure tracking requirement (4.4) addresses the most common failure mode in red-team programmes: findings that are "fixed" but not truly resolved. When a red-team finding is treated as a specific bug rather than an instance of a vulnerability class, the fix addresses the symptom but not the cause. The same vulnerability manifests again through a different attack vector, and the red-team programme degrades into a cycle of finding and re-finding the same underlying weaknesses.
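
A register keyed by vulnerability class makes the Scenario C failure mode visible as data rather than institutional memory. The following is a minimal sketch under assumed field names; the protocol mandates the content of 4.4 and 4.5, not this particular shape.

    from collections import defaultdict
    from dataclasses import dataclass

    @dataclass
    class Finding:
        finding_id: str
        exercise_id: str
        vuln_class: str       # e.g. "indirect-data-disclosure", not the specific attack
        severity: str
        status: str = "open"  # open -> remediated -> verified
        root_cause: str = ""

    class FindingsRegister:
        def __init__(self):
            self._by_class = defaultdict(list)

        def record(self, finding):
            self._by_class[finding.vuln_class].append(finding)

        def recurrences(self):
            """Classes rediscovered across exercises: the signal that a fix
            addressed the demonstrated attack but not the vulnerability class."""
            return {cls: fs for cls, fs in self._by_class.items()
                    if len({f.exercise_id for f in fs}) > 1}

        def invalid_closures(self):
            """4.4: no finding may reach verified status without a root cause."""
            return [f for fs in self._by_class.values() for f in fs
                    if f.status == "verified" and not f.root_cause]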

The expanding scope requirement (4.6) ensures that the red-team programme does not stagnate. If every exercise tests the same categories, the agent will be well-defended against those categories but potentially exposed to categories that have never been tested. By requiring at least one new category per exercise, the programme's cumulative coverage grows over time.
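
The 4.6 check is a set comparison over consecutive exercise scopes. A sketch, with placeholder category names:

    def scope_expands(exercise_scopes):
        """exercise_scopes: chronological list of sets of attack categories.
        Returns (compliant with 4.6, cumulative coverage)."""
        cumulative = set()
        for i, scope in enumerate(exercise_scopes):
            if i > 0 and not (scope - exercise_scopes[i - 1]):
                return False, cumulative  # nothing new relative to the previous exercise
            cumulative |= scope
        return True, cumulative

    ok, coverage = scope_expands([
        {"prompt-injection", "data-extraction"},
        {"prompt-injection", "privilege-escalation"},
        {"privilege-escalation", "supply-chain"},  # the Scenario B blind spot, eventually covered
    ])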

The time metrics (4.9) provide critical operational intelligence. Time-to-discovery measures how long vulnerabilities survive before detection — a long time-to-discovery indicates that the red-team frequency or scope is insufficient. Time-to-remediation measures organisational responsiveness to adversarial findings — a long time-to-remediation indicates that the remediation process is inadequate.
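
Both metrics are simple date arithmetic once each finding carries three timestamps. A sketch with invented dates, for illustration only:

    from datetime import date
    from statistics import median

    # (introduced, discovered, verified_fixed) per finding; all dates invented.
    findings = [
        (date(2025, 1, 10), date(2025, 9, 8), date(2025, 9, 30)),
        (date(2025, 3, 2), date(2025, 9, 8), date(2025, 10, 14)),
    ]

    time_to_discovery = [(d - i).days for i, d, _ in findings]    # exposure window
    time_to_remediation = [(f - d).days for _, d, f in findings]  # responsiveness

    print(f"median time-to-discovery: {median(time_to_discovery)} days")
    print(f"median time-to-remediation: {median(time_to_remediation)} days")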

6. Implementation Guidance

An effective continuous red-team programme requires a defined schedule, a scope management process, qualified teams, and a robust finding management workflow.

Recommended patterns:

- Maintain a rolling red-team calendar that encodes the 4.1 interval bounds and is re-planned whenever a 4.2 material-change trigger fires.
- Alternate internal and external exercises so that independent testing (4.3) regularly challenges internal blind spots.
- Classify every finding into a vulnerability class taxonomy at intake, before remediation planning begins (4.4, 4.5).
- Rotate at least one new attack category into each exercise scope and track cumulative coverage across exercises (4.6).
- Supplement periodic human exercises with automated adversarial runs from the scenario library (4.8).

Anti-patterns to avoid:

- Treating the pre-deployment red team as a completed gate with no follow-up schedule (Scenario A).
- Repeating a fixed internal methodology exercise after exercise, so that blind spots are never challenged (Scenario B).
- Closing findings against the specific demonstrated attack rather than the underlying vulnerability class (Scenario C).
- Scoping every exercise identically, so that cumulative coverage never grows.

Industry Considerations

Financial Services. DORA Article 26 requires threat-led penetration testing (TLPT) for significant financial entities. Red-team exercises for financial AI agents should align with TLPT frameworks (e.g., TIBER-EU). The scope should include financial-specific attack vectors: market manipulation through agent output manipulation, regulatory reporting corruption, and customer data extraction through conversational exploitation.

Healthcare. Red-team exercises must include clinical safety attack vectors: manipulation of clinical recommendations, extraction of patient data, and disruption of clinical workflows. Red-team findings that involve potential patient harm must be escalated through clinical governance channels, not just technical channels.

Public Sector. Red-team exercises must include attacks that exploit the power asymmetry between government and citizens: manipulation of eligibility determinations, extraction of personal data from benefits systems, and exploitation of accessibility features.

Maturity Model

Basic Implementation — Adversarial evaluations are conducted at the required recurring frequency (≤6 months for high-risk, ≤12 months for standard). Material changes trigger unscheduled evaluations within 30 days. At least one annual exercise is conducted by an independent team. Findings are tracked to closure with root-cause analysis. Each exercise includes at least one new attack category. This level meets the minimum mandatory requirements, but adversarial testing may be entirely periodic (no continuous automated component).

Intermediate Implementation — Automated adversarial testing runs on a daily or weekly basis, supplementing periodic human exercises. A scope rotation matrix tracks cumulative adversarial coverage. A vulnerability class taxonomy enables recurrence tracking. Time-to-discovery and time-to-remediation metrics are tracked and reported. Red-team methodology varies across exercises.
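
The automated layer at this level can be as simple as replaying the scenario library against the agent on a schedule. A hedged sketch of such a harness, where load_scenarios, run_agent, and the scenario dictionary keys stand in for integration points that neither this protocol nor AG-349 defines:

    def run_automated_suite(load_scenarios, run_agent):
        """Replay every adversarial scenario and return the IDs of those
        whose violation check fired against the agent's response."""
        failures = []
        for scenario in load_scenarios():
            response = run_agent(scenario["adversarial_input"])
            if scenario["violation_check"](response):
                failures.append(scenario["id"])
        return failures

Any failure from a scheduled daily or weekly run (4.8) would then feed the findings register like any human-discovered finding.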

Advanced Implementation — All intermediate capabilities plus: adversarial testing occurs in production using controlled, non-harmful adversarial inputs. Red-team exercises are conducted under threat-led frameworks (e.g., TIBER-EU). The organisation participates in industry information sharing about AI agent vulnerabilities. Machine learning identifies patterns across findings to predict vulnerability classes before they are exploited. The red-team programme is externally benchmarked against industry peers.

7. Evidence Requirements

Required artefacts:

- Red-team schedule and exercise calendar, including each agent's risk-tier classification and the resulting 4.1 interval.
- Exercise reports covering scope, methodology, team composition, and independence attestation (4.3).
- Findings register with vulnerability class, severity, remediation status, and recurrence history (4.5).
- Root-cause analyses, remediation plans, remediation evidence, and class-level verification for each closed finding (4.4).
- Material-change log mapping each qualifying change to the unscheduled evaluation it triggered (4.2).
- Time-to-discovery and time-to-remediation metrics for each finding (4.9).

Retention requirements:

Access requirements:

8. Test Specification

Test 8.1: Schedule Compliance

Inspect the exercise calendar and confirm that no interval between consecutive adversarial evaluations exceeded 6 months for high-risk agents or 12 months for standard-risk agents (4.1).

Test 8.2: Material Change Trigger Response

Sample material changes from the change log and verify that an unscheduled adversarial evaluation was completed within 30 days of each (4.2).

Test 8.3: Independence Verification

Verify that at least one adversarial evaluation in the preceding 12 months was conducted by a team organisationally independent of the agent's development and operations teams (4.3).

Test 8.4: Finding Closure Quality

Sample closed findings and verify that each has a root-cause analysis, a remediation plan, remediation evidence, and verification that the remediation addresses the vulnerability class rather than only the demonstrated attack (4.4).

Test 8.5: Scope Expansion Verification

Compare the scopes of consecutive exercises and verify that each includes at least one attack category absent from its predecessor (4.6).

Test 8.6: Recurrence Tracking

Verify that the findings register links findings to vulnerability classes and flags any class rediscovered across exercises (4.5).

Conformance Scoring

Conformance requires passing Tests 8.1 through 8.6, which exercise the mandatory requirements 4.1 to 4.6. The SHOULD requirements (4.7 to 4.9) and the MAY requirement (4.10) distinguish the intermediate and advanced levels of the Maturity Model but do not gate baseline conformance.

9. Regulatory Mapping

Regulation | Provision | Relationship Type
EU AI Act | Article 9 (Risk Management System) | Supports compliance
EU AI Act | Article 15 (Accuracy, Robustness, Cybersecurity) | Direct requirement
NIST AI RMF | MEASURE 2.7, MANAGE 2.4 | Supports compliance
ISO 42001 | Clause 9.1 (Monitoring), Clause 10.1 (Continual Improvement) | Supports compliance
DORA | Article 24 (ICT Testing), Article 26 (Threat-Led Penetration Testing) | Direct requirement
FCA | SYSC 6.1.1R (Systems and Controls) | Supports compliance

EU AI Act — Article 15 (Accuracy, Robustness, Cybersecurity)

Article 15 requires that high-risk AI systems be resilient against attempts by unauthorised third parties to alter their use or performance through exploitation of system vulnerabilities. Continuous red-teaming is the mechanism that verifies this resilience on an ongoing basis. A one-off red-team exercise demonstrates resilience at a point in time; continuous red-teaming demonstrates that resilience is maintained as the system and its threat environment evolve.

DORA — Article 24, Article 26

Article 24 requires comprehensive, risk-based ICT testing. Article 26 requires that significant financial entities conduct threat-led penetration testing at least every 3 years (with annual requirements for certain entities). For AI agents in financial services, AG-355's requirements for recurring adversarial evaluation align with and in many cases exceed DORA requirements. The vulnerability class taxonomy and recurrence tracking support the DORA requirement for testing to result in meaningful risk reduction.

10. Failure Severity

Field | Value
Severity Rating | High
Blast Radius | Organisation-wide — adversarial vulnerabilities discovered by attackers rather than red teams affect all users and systems the agent can access

Consequence chain: Without continuous red-teaming, adversarial vulnerabilities accumulate undetected. The immediate consequence is a growing attack surface — each model update, capability addition, and deployment change may introduce new vulnerabilities that persist until an attacker discovers them. The operational consequence is that the organisation is always defending against yesterday's threat landscape while adversaries attack with today's techniques. The regulatory consequence is inability to demonstrate ongoing robustness testing — regulators expect continuous assurance, not point-in-time certification. The compounding effect is that unremediated vulnerabilities interact: an attacker who chains two individually moderate vulnerabilities may achieve a critical impact that neither vulnerability would enable alone. Only continuous, expanding-scope adversarial testing has a chance of discovering these compound attack paths before adversaries do.

Cross-references: AG-103 (Red-Team Coverage Management) defines what attack categories should be covered. AG-095 (Prompt Injection Resilience Testing) addresses a specific critical attack category that should be included in every red-team exercise. AG-349 (Scenario Library Governance) provides the adversarial scenarios used in automated testing. AG-354 (Hidden Test Integrity Governance) ensures that adversarial test sets are not compromised. AG-351 (Human-Subject Evaluation Ethics Governance) governs participant welfare in red-team exercises. AG-356 (Near-Miss Capture Governance) feeds near-miss events into the adversarial scenario library.

Cite this protocol
AgentGoverning. (2026). AG-355: Continuous Red-Team Scheduling Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-355