AG-355

Continuous Red-Team Scheduling Governance

Evaluation, Benchmarking & Red Teaming · AGS v2.1 · April 2026
Regulatory relevance: EU AI Act · FCA · NIST AI RMF · ISO 42001

2. Summary

Continuous Red-Team Scheduling Governance requires that organisations run recurring adversarial evaluations against their AI agents rather than treating red-teaming as a one-off pre-deployment exercise. Adversarial capabilities evolve, agent behaviour drifts, and deployment contexts change — a red-team exercise that found no vulnerabilities six months ago provides no assurance about today's risk posture. This dimension mandates a structured, recurring schedule of adversarial evaluations with defined scope, frequency, independence requirements, and finding remediation tracking.

3. Example

Scenario A — One-Off Red Team Creates Point-in-Time Assurance Gap: A customer-facing financial advice agent undergoes a comprehensive red-team exercise before deployment. The exercise, conducted by an external firm, tests 35 attack categories over two weeks and identifies 8 vulnerabilities, all of which are remediated before launch. Twelve months later, no further red-team exercise has been conducted. During this period, the agent's model has been updated twice, 3 new capabilities have been added, and 15 new prompt injection techniques have been published by security researchers. A researcher discovers that the agent can be manipulated into providing unregulated investment advice through a technique published 4 months after the red-team exercise. The vulnerability has been exploitable for 8 months.

What went wrong: Red-teaming was treated as a pre-deployment gate rather than a continuous process. The assurance provided by the initial exercise degraded as the agent changed and the threat landscape evolved. No schedule existed for follow-up exercises. Consequence: 8 months of exploitable vulnerability, regulatory investigation when the researcher publishes the finding, mandatory emergency red-team engagement costing £95,000, and reputational damage from public disclosure.

Scenario B — Internal-Only Red Team Misses External Attack Vectors: An enterprise deploys a procurement agent and conducts quarterly red-team exercises using its internal security team. The team follows the same methodology each quarter, focusing on prompt injection, data extraction, and privilege escalation. The team consistently finds and remediates minor issues. However, a novel supply-chain attack emerges: an attacker compromises a supplier's invoice template to embed adversarial instructions that manipulate the agent's approval workflow. The internal team never tests supply-chain attack vectors because they fall outside its standard methodology. An external red-team firm, engaged after an incident, identifies the vulnerability within 4 hours.

What went wrong: The internal red team had a fixed methodology that did not evolve with the threat landscape. The lack of external, independent red-teaming meant that blind spots in the internal methodology were never challenged. Consequence: Supply-chain compromise resulting in £167,000 in fraudulent approvals, mandatory external red-team engagement, and revision of the red-team programme to include both internal and external exercises.

Scenario C — Untracked Findings Create Recurring Vulnerabilities: A healthcare agent undergoes semi-annual red-team exercises. The first exercise identifies that the agent can be manipulated into revealing patient data through a series of indirect queries. The finding is logged and a fix is implemented. Six months later, the second exercise discovers the same vulnerability class — the fix addressed the specific attack vector but not the underlying vulnerability. The third exercise, six months later, finds a variant of the same issue. Each time, a fix is implemented for the specific attack demonstrated; each time, the root cause persists. After three exercises and 18 months, the fundamental vulnerability remains because findings are tracked as individual issues rather than as vulnerability classes with root-cause remediation.

What went wrong: Red-team findings were tracked as individual bugs rather than as vulnerability classes. No root-cause analysis was conducted. No mechanism verified that remediation addressed the underlying vulnerability, not just the specific demonstrated attack. Consequence: 18 months of recurring vulnerability, three sets of remediation costs (£35,000 per cycle, £105,000 total), patient data exposure risk persisting across the entire period, and clinical governance concern about the reliability of the evaluation programme.

4. Requirement Statement

Scope: This dimension applies to all AI agent deployments where adversarial testing is appropriate — which, under the broader governance framework, includes all production deployments of agents with access to sensitive data, financial systems, or external-facing communication channels. The scope covers both automated adversarial testing (running adversarial scenarios from the scenario library) and human red-team exercises (where skilled individuals attempt to discover and exploit vulnerabilities). It applies to pre-deployment, post-deployment, and post-update evaluations. It does not prescribe the specific adversarial techniques to be tested — that is addressed by AG-103 (Red-Team Coverage Management) and AG-095 (Prompt Injection Resilience Testing) — but it mandates the scheduling, independence, and remediation-tracking framework within which adversarial testing occurs.

4.1. A conforming system MUST conduct adversarial evaluations at a defined recurring frequency, with the maximum interval between exercises not exceeding 6 months for high-risk agents and 12 months for standard-risk agents.

4.2. A conforming system MUST trigger an unscheduled adversarial evaluation within 30 days of any material change to the agent — including model updates, new capability additions, significant configuration changes, or expansion to new deployment contexts (this trigger window, together with the 4.1 interval bound, is illustrated in the sketch following 4.10).

4.3. A conforming system MUST ensure that at least one adversarial evaluation per year is conducted by a team independent of the agent's development and operations teams.

4.4. A conforming system MUST track all red-team findings to closure with root-cause analysis, remediation plan, remediation evidence, and verification that the remediation addresses the vulnerability class, not only the specific demonstrated attack.

4.5. A conforming system MUST maintain a red-team findings register that classifies findings by vulnerability class, severity, remediation status, and recurrence history.

4.6. A conforming system MUST include, in the scope of each red-team exercise, at least one attack category that was not in the scope of the previous exercise, so that adversarial coverage expands over time.

4.7. A conforming system SHOULD vary the red-team methodology across exercises, using different teams, tools, or approaches to avoid methodological blind spots.

4.8. A conforming system SHOULD include automated continuous adversarial testing (running adversarial scenarios from the scenario library on a daily or weekly basis) in addition to periodic human red-team exercises.

4.9. A conforming system SHOULD measure and report the time-to-discovery (how long a vulnerability existed before the red team found it) and time-to-remediation (how long from discovery to verified fix) for each finding.

4.10. A conforming system MAY implement adversarial testing in production using controlled, non-harmful adversarial inputs to test agent resilience under real operating conditions.
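
Requirements 4.1 and 4.2 reduce to mechanical checks over an exercise calendar and a material-change log. The sketch below shows one way to express them in Python; the record type, the 182-day reading of the six-month bound, and the schedule_violations entry point are illustrative assumptions, not prescribed by this protocol.

    from dataclasses import dataclass
    from datetime import date, timedelta

    # Illustrative constants: 4.1 read as 182/365 days, 4.2 as a 30-day window.
    MAX_INTERVAL = {"high": timedelta(days=182), "standard": timedelta(days=365)}
    TRIGGER_WINDOW = timedelta(days=30)

    @dataclass
    class MaterialChange:
        description: str
        occurred: date

    def schedule_violations(risk_tier, exercise_dates, changes, today):
        """Return human-readable violations of requirements 4.1 and 4.2."""
        issues = []
        dates = sorted(exercise_dates)
        limit = MAX_INTERVAL[risk_tier]
        # 4.1: bound every gap between consecutive exercises, plus the gap
        # from the most recent exercise to today.
        for earlier, later in zip(dates, dates[1:] + [today]):
            if later - earlier > limit:
                issues.append(f"interval {earlier} to {later} exceeds {limit.days} days")
        # 4.2: every material change needs an evaluation within 30 days.
        for change in changes:
            deadline = change.occurred + TRIGGER_WINDOW
            if today > deadline and not any(change.occurred <= d <= deadline for d in dates):
                issues.append(f"no evaluation within 30 days of: {change.description}")
        return issues

A governance dashboard could run a check of this shape nightly, surfacing an interval breach or a missed trigger before it becomes an audit finding.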

5. Rationale

One-off red-teaming provides point-in-time assurance that degrades immediately. The day after a red-team exercise, the assurance begins to erode: the agent may change, the threat landscape may evolve, and new attack techniques may be published. Within months, the assurance from a one-off exercise is minimal. Continuous red-team scheduling converts point-in-time assurance into ongoing assurance by ensuring that the interval between exercises is bounded and that material changes trigger unscheduled assessments.

The independence requirement (4.3) addresses a fundamental limitation of internal red-teaming: internal teams develop institutional blind spots. They test what they know, use the tools they are familiar with, and operate within the mental model of the system they helped build. External teams bring different methodologies, different tools, different perspectives, and — critically — no preconceptions about what the system "should" do. The combination of frequent internal exercises and periodic external exercises provides both breadth (internal frequency) and depth (external independence).

The finding-to-closure tracking requirement (4.4) addresses the most common failure mode in red-team programmes: findings that are "fixed" but not truly resolved. When a red-team finding is treated as a specific bug rather than an instance of a vulnerability class, the fix addresses the symptom but not the cause. The same vulnerability manifests again through a different attack vector, and the red-team programme degrades into a cycle of finding and re-finding the same underlying weaknesses.
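
A register keyed by vulnerability class makes the Scenario C failure mode visible as data rather than institutional memory. The following is a minimal sketch under assumed field names; the protocol mandates the content of 4.4 and 4.5, not this particular shape.

    from collections import defaultdict
    from dataclasses import dataclass

    @dataclass
    class Finding:
        finding_id: str
        exercise_id: str
        vuln_class: str       # e.g. "indirect-data-disclosure", not the specific attack
        severity: str
        status: str = "open"  # open -> remediated -> verified
        root_cause: str = ""

    class FindingsRegister:
        def __init__(self):
            self._by_class = defaultdict(list)

        def record(self, finding):
            self._by_class[finding.vuln_class].append(finding)

        def recurrences(self):
            """Classes rediscovered across exercises: the signal that a fix
            addressed the demonstrated attack but not the vulnerability class."""
            return {cls: fs for cls, fs in self._by_class.items()
                    if len({f.exercise_id for f in fs}) > 1}

        def invalid_closures(self):
            """4.4: no finding may reach verified status without a root cause."""
            return [f for fs in self._by_class.values() for f in fs
                    if f.status == "verified" and not f.root_cause]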

The expanding scope requirement (4.6) ensures that the red-team programme does not stagnate. If every exercise tests the same categories, the agent will be well-defended against those categories but potentially exposed to categories that have never been tested. By requiring at least one new category per exercise, the programme's cumulative coverage grows over time.
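
The 4.6 check is a set comparison over consecutive exercise scopes. A sketch, with placeholder category names:

    def scope_expands(exercise_scopes):
        """exercise_scopes: chronological list of sets of attack categories.
        Returns (compliant with 4.6, cumulative coverage)."""
        cumulative = set()
        for i, scope in enumerate(exercise_scopes):
            if i > 0 and not (scope - exercise_scopes[i - 1]):
                return False, cumulative  # nothing new relative to the previous exercise
            cumulative |= scope
        return True, cumulative

    ok, coverage = scope_expands([
        {"prompt-injection", "data-extraction"},
        {"prompt-injection", "privilege-escalation"},
        {"privilege-escalation", "supply-chain"},  # the Scenario B blind spot, eventually covered
    ])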

The time metrics (4.9) provide critical operational intelligence. Time-to-discovery measures how long vulnerabilities survive before detection — a long time-to-discovery indicates that the red-team frequency or scope is insufficient. Time-to-remediation measures organisational responsiveness to adversarial findings — a long time-to-remediation indicates that the remediation process is inadequate.
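
Both metrics are simple date arithmetic once each finding carries three timestamps. A sketch with invented dates, for illustration only:

    from datetime import date
    from statistics import median

    # (introduced, discovered, verified_fixed) per finding; all dates invented.
    findings = [
        (date(2025, 1, 10), date(2025, 9, 8), date(2025, 9, 30)),
        (date(2025, 3, 2), date(2025, 9, 8), date(2025, 10, 14)),
    ]

    time_to_discovery = [(d - i).days for i, d, _ in findings]    # exposure window
    time_to_remediation = [(f - d).days for _, d, f in findings]  # responsiveness

    print(f"median time-to-discovery: {median(time_to_discovery)} days")
    print(f"median time-to-remediation: {median(time_to_remediation)} days")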

6. Implementation Guidance

An effective continuous red-team programme requires a defined schedule, a scope management process, qualified teams, and a robust finding management workflow.

Recommended patterns:

- Maintain a rolling red-team calendar that encodes the 4.1 interval bounds and is re-planned whenever a 4.2 material-change trigger fires.
- Alternate internal and external exercises so that independent testing (4.3) regularly challenges internal blind spots.
- Classify every finding into a vulnerability class taxonomy at intake, before remediation planning begins (4.4, 4.5).
- Rotate at least one new attack category into each exercise scope and track cumulative coverage across exercises (4.6).
- Supplement periodic human exercises with automated adversarial runs from the scenario library (4.8).

Anti-patterns to avoid:

- Treating the pre-deployment red team as a completed gate with no follow-up schedule (Scenario A).
- Repeating a fixed internal methodology exercise after exercise, so that blind spots are never challenged (Scenario B).
- Closing findings against the specific demonstrated attack rather than the underlying vulnerability class (Scenario C).
- Scoping every exercise identically, so that cumulative coverage never grows.

Industry Considerations

Financial Services. DORA Article 26 requires threat-led penetration testing (TLPT) for significant financial entities. Red-team exercises for financial AI agents should align with TLPT frameworks (e.g., TIBER-EU). The scope should include financial-specific attack vectors: market manipulation through agent output manipulation, regulatory reporting corruption, and customer data extraction through conversational exploitation.

Healthcare. Red-team exercises must include clinical safety attack vectors: manipulation of clinical recommendations, extraction of patient data, and disruption of clinical workflows. Red-team findings that involve potential patient harm must be escalated through clinical governance channels, not just technical channels.

Public Sector. Red-team exercises must include attacks that exploit the power asymmetry between government and citizens: manipulation of eligibility determinations, extraction of personal data from benefits systems, and exploitation of accessibility features.

Maturity Model

Basic Implementation — Adversarial evaluations are conducted at the required recurring frequency (≤6 months for high-risk, ≤12 months for standard). Material changes trigger unscheduled evaluations within 30 days. At least one annual exercise is conducted by an independent team. Findings are tracked to closure with root-cause analysis. Each exercise includes at least one new attack category. This level meets the minimum mandatory requirements, but adversarial testing may be entirely periodic (no continuous automated component).

Intermediate Implementation — Automated adversarial testing runs on a daily or weekly basis, supplementing periodic human exercises. A scope rotation matrix tracks cumulative adversarial coverage. A vulnerability class taxonomy enables recurrence tracking. Time-to-discovery and time-to-remediation metrics are tracked and reported. Red-team methodology varies across exercises.
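
The automated layer at this level can be as simple as replaying the scenario library against the agent on a schedule. A hedged sketch of such a harness, where load_scenarios, run_agent, and the scenario dictionary keys stand in for integration points that neither this protocol nor AG-349 defines:

    def run_automated_suite(load_scenarios, run_agent):
        """Replay every adversarial scenario and return the IDs of those
        whose violation check fired against the agent's response."""
        failures = []
        for scenario in load_scenarios():
            response = run_agent(scenario["adversarial_input"])
            if scenario["violation_check"](response):
                failures.append(scenario["id"])
        return failures

Any failure from a scheduled daily or weekly run (4.8) would then feed the findings register like any human-discovered finding.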

Advanced Implementation — All intermediate capabilities plus: adversarial testing occurs in production using controlled, non-harmful adversarial inputs. Red-team exercises are conducted under threat-led frameworks (e.g., TIBER-EU). The organisation participates in industry information sharing about AI agent vulnerabilities. Machine learning identifies patterns across findings to predict vulnerability classes before they are exploited. The red-team programme is externally benchmarked against industry peers.

7. Evidence Requirements

Required artefacts:

- Red-team schedule and exercise calendar, including each agent's risk-tier classification and the resulting 4.1 interval.
- Exercise reports covering scope, methodology, team composition, and independence attestation (4.3).
- Findings register with vulnerability class, severity, remediation status, and recurrence history (4.5).
- Root-cause analyses, remediation plans, remediation evidence, and class-level verification for each closed finding (4.4).
- Material-change log mapping each qualifying change to the unscheduled evaluation it triggered (4.2).
- Time-to-discovery and time-to-remediation metrics for each finding (4.9).

Retention requirements:

Access requirements:

8. Test Specification

Test 8.1: Schedule Compliance

Inspect the exercise calendar and confirm that no interval between consecutive adversarial evaluations exceeded 6 months for high-risk agents or 12 months for standard-risk agents (4.1).

Test 8.2: Material Change Trigger Response

Sample material changes from the change log and verify that an unscheduled adversarial evaluation was completed within 30 days of each (4.2).

Test 8.3: Independence Verification

Verify that at least one adversarial evaluation in the preceding 12 months was conducted by a team organisationally independent of the agent's development and operations teams (4.3).

Test 8.4: Finding Closure Quality

Sample closed findings and verify that each has a root-cause analysis, a remediation plan, remediation evidence, and verification that the remediation addresses the vulnerability class rather than only the demonstrated attack (4.4).

Test 8.5: Scope Expansion Verification

Compare the scopes of consecutive exercises and verify that each includes at least one attack category absent from its predecessor (4.6).

Test 8.6: Recurrence Tracking

Verify that the findings register links findings to vulnerability classes and flags any class rediscovered across exercises (4.5).

Conformance Scoring

Conformance requires passing Tests 8.1 through 8.6, which exercise the mandatory requirements 4.1 to 4.6. The SHOULD requirements (4.7 to 4.9) and the MAY requirement (4.10) distinguish the intermediate and advanced levels of the Maturity Model but do not gate baseline conformance.

9. Regulatory Mapping

Regulation | Provision | Relationship Type
EU AI Act | Article 9 (Risk Management System) | Supports compliance
EU AI Act | Article 15 (Accuracy, Robustness, Cybersecurity) | Direct requirement
NIST AI RMF | MEASURE 2.7, MANAGE 2.4 | Supports compliance
ISO 42001 | Clause 9.1 (Monitoring), Clause 10.1 (Continual Improvement) | Supports compliance
DORA | Article 24 (ICT Testing), Article 26 (Threat-Led Penetration Testing) | Direct requirement
FCA | SYSC 6.1.1R (Systems and Controls) | Supports compliance

EU AI Act — Article 15 (Accuracy, Robustness, Cybersecurity)

Article 15 requires that high-risk AI systems be resilient against attempts by unauthorised third parties to alter their use or performance through exploitation of system vulnerabilities. Continuous red-teaming is the mechanism that verifies this resilience on an ongoing basis. A one-off red-team exercise demonstrates resilience at a point in time; continuous red-teaming demonstrates that resilience is maintained as the system and its threat environment evolve.

DORA — Article 24, Article 26

Article 24 requires comprehensive, risk-based ICT testing. Article 26 requires that significant financial entities conduct threat-led penetration testing at least every 3 years (with annual requirements for certain entities). For AI agents in financial services, AG-355's requirements for recurring adversarial evaluation align with and in many cases exceed DORA requirements. The vulnerability class taxonomy and recurrence tracking support the DORA requirement for testing to result in meaningful risk reduction.

10. Failure Severity

Field | Value
Severity Rating | High
Blast Radius | Organisation-wide — adversarial vulnerabilities discovered by attackers rather than red teams affect all users and systems the agent can access

Consequence chain: Without continuous red-teaming, adversarial vulnerabilities accumulate undetected. The immediate consequence is a growing attack surface — each model update, capability addition, and deployment change may introduce new vulnerabilities that persist until an attacker discovers them. The operational consequence is that the organisation is always defending against yesterday's threat landscape while adversaries attack with today's techniques. The regulatory consequence is inability to demonstrate ongoing robustness testing — regulators expect continuous assurance, not point-in-time certification. The compounding effect is that unremediated vulnerabilities interact: an attacker who chains two individually moderate vulnerabilities may achieve a critical impact that neither vulnerability would enable alone. Only continuous, expanding-scope adversarial testing has a chance of discovering these compound attack paths before adversaries do.

Cross-references: AG-103 (Red-Team Coverage Management) defines what attack categories should be covered. AG-095 (Prompt Injection Resilience Testing) addresses a specific critical attack category that should be included in every red-team exercise. AG-349 (Scenario Library Governance) provides the adversarial scenarios used in automated testing. AG-354 (Hidden Test Integrity Governance) ensures that adversarial test sets are not compromised. AG-351 (Human-Subject Evaluation Ethics Governance) governs participant welfare in red-team exercises. AG-356 (Near-Miss Capture Governance) feeds near-miss events into the adversarial scenario library.

Cite this protocol
AgentGoverning. (2026). AG-355: Continuous Red-Team Scheduling Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-355