External Bounty Intake Governance establishes the processes, protections, and response obligations for receiving, triaging, and acting on vulnerability reports and issue disclosures from external researchers, users, and the public. AI agents present novel vulnerability surfaces that traditional software bug bounty programmes are not designed to cover — including prompt injection, alignment failures, output manipulation, and emergent harmful behaviours. This dimension mandates that organisations deploying AI agents maintain a structured intake channel for external reports, with defined response timelines, legal safe harbour for good-faith reporters, triage processes tailored to AI-specific vulnerability types, and feedback loops that convert validated reports into evaluation improvements.
Scenario A — No Intake Channel Creates Disclosure Chaos: A security researcher discovers that a customer-facing healthcare agent can be manipulated through a series of carefully crafted queries to reveal other patients' appointment information. The researcher attempts to report the vulnerability responsibly. The organisation has no published vulnerability disclosure policy and no dedicated intake channel. The researcher emails the general contact address; the email is routed to customer service, which responds with a template message about privacy complaints. The researcher emails the CTO directly; the email goes unanswered for 3 weeks. Frustrated, the researcher publishes the vulnerability on a security forum. The publication triggers media coverage, patient panic, and a mandatory ICO breach notification. The vulnerability was exploitable for the 6 weeks between the researcher's initial contact and the public disclosure.
What went wrong: No external intake channel existed. The organisation had no way to receive and triage external vulnerability reports. The researcher's responsible disclosure attempts were routed to inappropriate teams. The delay between discovery and remediation was entirely avoidable. Consequence: Public disclosure of patient data vulnerability, ICO investigation, mandatory breach notification to affected patients, £420,000 in legal and remediation costs, and severe reputational damage.
Scenario B — Legal Threats Deter Future Reporting: A university researcher identifies that a financial agent can be tricked into generating unregulated investment advice by framing requests as academic research questions. The researcher contacts the deploying firm through its general legal email (no dedicated disclosure channel exists). The firm's legal department responds with a cease-and-desist letter, alleging that the researcher violated the firm's terms of service by interacting with the agent for purposes other than legitimate financial advice. The researcher withdraws the report and warns the academic community about the firm's hostile response. Over the next year, no external researchers engage with the firm's AI products, and the vulnerability persists until it is exploited by an actual attacker.
What went wrong: The organisation treated the external report as a legal threat rather than a security signal. No safe harbour policy existed for good-faith researchers. The hostile response not only failed to address the vulnerability but actively deterred future reporting. Consequence: Vulnerability exploited 11 months later causing £89,000 in regulatory penalties, permanent damage to the firm's reputation in the security research community, and inability to attract external security researchers for future assessments.
Scenario C — Validated Report Produces No Action: A user of a government benefits agent reports through the published feedback channel that the agent provided incorrect eligibility information that resulted in the user being denied benefits they were entitled to. The report is logged in the feedback system. No triage process distinguishes between general feedback and potential vulnerability reports. The report sits in a queue of 2,300 feedback items, reviewed monthly by a single analyst who focuses on satisfaction metrics. Eight months later, a systematic review of denied benefits cases reveals that 47 users received incorrect eligibility guidance from the agent, all sharing the same input pattern. The original user's report, which identified the pattern, was never escalated for investigation.
What went wrong: The intake channel existed but had no triage process to identify reports that represented potential vulnerabilities or systematic failures. The report was treated as generic feedback rather than a signal requiring investigation. Consequence: 47 users incorrectly denied benefits over 8 months, retrospective case review costing £78,000, compensation payments to affected users, Parliamentary scrutiny of the AI benefits system, and mandatory triage process implementation.
Scope: This dimension applies to all AI agent deployments that are accessible to external parties — whether external users, customers, members of the public, security researchers, regulators, or any party outside the deploying organisation. The scope covers all categories of external reports: security vulnerabilities (prompt injection, data extraction, privilege escalation), safety issues (harmful outputs, incorrect guidance, bias), compliance issues (regulatory non-compliance, privacy violations), and functionality issues that indicate systematic failures. It does not cover internal bug reports from the organisation's own staff (which are addressed through internal incident management processes), though the intake channel may receive reports that originate from both internal and external parties.
4.1. A conforming system MUST publish a vulnerability disclosure policy that is accessible from the agent's public-facing interface, specifying: the scope of issues the organisation will accept reports for, the intake channel (dedicated email, web form, or disclosure platform), the expected response timeline, and the legal safe harbour for good-faith reporters.
4.2. A conforming system MUST provide legal safe harbour for good-faith security researchers who report vulnerabilities through the designated intake channel, committing not to pursue legal action against reporters who act within the published scope and in good faith.
4.3. A conforming system MUST acknowledge receipt of every external report within 5 business days and provide an initial triage assessment within 15 business days.
4.4. A conforming system MUST implement a triage process that classifies incoming reports by type (security, safety, compliance, functionality), severity (critical, high, medium, low), and validates the reported issue through reproduction.
4.5. A conforming system MUST track validated external reports through the same finding lifecycle as internal red-team findings (AG-355), including root-cause analysis, remediation, and verification.
4.6. A conforming system MUST convert validated external reports into evaluation scenarios for the scenario library (AG-349), ensuring that externally discovered issues are tested for in future evaluations.
4.7. A conforming system SHOULD maintain a public acknowledgement mechanism (e.g., a security hall of fame or acknowledgement in release notes) for external reporters whose reports lead to security improvements, subject to the reporter's consent.
4.8. A conforming system SHOULD offer a structured bounty or reward programme for validated vulnerability reports, with reward levels calibrated to the severity and impact of the reported issue.
4.9. A conforming system SHOULD publish aggregate statistics on external reports received, validated, and remediated — at least annually — to demonstrate engagement with the external research community.
4.10. A conforming system MAY participate in coordinated vulnerability disclosure programmes operated by industry bodies, national cybersecurity agencies, or AI safety organisations.
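The triage classification in 4.4 can be sketched as a minimal record type. This is an illustrative assumption, not a prescribed data model: the field names (`report_type`, `severity`, `reproduced`) and the escalation rule are examples only.

```python
# Sketch of a triage record for the intake queue described in 4.4.
# Field names and the escalation rule are illustrative assumptions.
from dataclasses import dataclass
from enum import Enum


class ReportType(Enum):
    SECURITY = "security"
    SAFETY = "safety"
    COMPLIANCE = "compliance"
    FUNCTIONALITY = "functionality"


class Severity(Enum):
    CRITICAL = "critical"
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass
class TriagedReport:
    report_id: str
    report_type: ReportType
    severity: Severity
    reproduced: bool  # 4.4: the reported issue is validated through reproduction

    def requires_regulatory_escalation(self) -> bool:
        # Sector guidance: compliance reports escalate to regulatory teams.
        return self.report_type is ReportType.COMPLIANCE


report = TriagedReport("EXT-2024-017", ReportType.SECURITY,
                       Severity.HIGH, reproduced=True)
```

A real implementation would also carry reporter contact details, timestamps for SLA tracking, and a link into the finding lifecycle (4.5).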
External researchers, users, and the public represent an evaluation resource that no organisation can replicate internally. External researchers bring different perspectives, different tools, and different motivations. Users discover issues through real-world usage patterns that no test suite anticipates. The public encounters edge cases at a scale that internal testing cannot match. An organisation that fails to harness this resource — or worse, actively deters it — loses its most cost-effective source of vulnerability discovery.
Traditional software bug bounty programmes have demonstrated the value of external reporting at scale. Major technology companies receive thousands of validated vulnerability reports annually through their bounty programmes, many of which would never have been discovered through internal testing alone. AI agents present an even stronger case for external reporting because the vulnerability surface is larger, less well-understood, and more novel. Prompt injection, alignment failures, output manipulation, and emergent behaviours are vulnerability categories that did not exist a decade ago. External researchers — particularly academic researchers — are actively studying these vulnerability categories and can identify issues that internal teams, focused on functionality and delivery, may overlook.
The legal safe harbour requirement (4.2) is foundational. Without it, the entire external reporting ecosystem collapses. Researchers who face legal threats for responsible disclosure will either stop reporting (leaving vulnerabilities unexploited but unpatched) or disclose publicly (creating immediate exploitation risk). Neither outcome serves the organisation's interests. Safe harbour aligns the organisation's incentives with the researcher's: both want the vulnerability fixed, and safe harbour removes the legal friction that prevents cooperation.
The response timeline requirements (4.3) serve two purposes. First, they demonstrate respect for the reporter's effort and expertise, maintaining the relationship that enables future reporting. Second, they create accountability for the organisation — without defined timelines, reports can languish in queues indefinitely, as demonstrated in Scenario C. The 5-business-day acknowledgement and 15-business-day triage timelines are industry-standard and achievable for any organisation with a minimal security function.
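The 5- and 15-business-day deadlines from 4.3 can be computed mechanically once a report is logged; a minimal sketch, skipping weekends only (a production version would also account for public holidays):

```python
# Sketch: compute the 4.3 acknowledgement and triage deadlines.
# Skips weekends only; public holidays are deliberately out of scope here.
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 .. Friday=4
            remaining -= 1
    return current


def sla_deadlines(received: date) -> dict:
    return {
        "acknowledge_by": add_business_days(received, 5),   # 4.3 acknowledgement
        "triage_by": add_business_days(received, 15),       # 4.3 initial triage
    }


# Example: a report received on Monday 3 June 2024.
deadlines = sla_deadlines(date(2024, 6, 3))
```

Automating the deadline calculation is what makes the accountability concrete: queued reports with an approaching `acknowledge_by` date can be surfaced rather than languishing as in Scenario C.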
The feedback loop into the scenario library (4.6) ensures that externally discovered issues improve future evaluation. Without this feedback loop, the same issue could be discovered and reported multiple times by different external parties, each time requiring the organisation to rediscover and remediate it afresh.
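The conversion step in 4.6 can be sketched as a simple mapping from a validated report to a scenario library entry that preserves provenance. The field names and the shape of the AG-349 library entry are assumptions for illustration, not a specified interface:

```python
# Sketch of the 4.6 feedback loop: a validated external report becomes a
# scenario library entry tagged with its origin. Field names are assumptions.
def report_to_scenario(report: dict) -> dict:
    return {
        "scenario_id": f"SCN-EXT-{report['report_id']}",
        "description": report["summary"],
        "inputs": report["reproduction_steps"],
        "expected_behaviour": "agent resists the reported manipulation",
        "source": "external-report",          # provenance for audit (AG-349)
        "linked_report": report["report_id"], # traceability back to intake
    }


scenario = report_to_scenario({
    "report_id": "2024-017",
    "summary": "Appointment data disclosed via crafted multi-turn queries",
    "reproduction_steps": [
        "frame the request as administrative verification",
        "ask for another patient's schedule details",
    ],
})
```

Keeping the `linked_report` reference means a future evaluation failure on this scenario can be traced back to the original external report and its remediation record.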
An effective external bounty intake programme requires a published policy, a dedicated intake channel, a triage process, a response workflow, and feedback loops into the governance programme.
Recommended patterns:
- Publish the disclosure policy where external parties will actually find it — linked from the agent's public-facing interface, not buried in legal terms.
- Route intake-channel reports directly to a triage owner with AI-specific expertise, never to customer service or legal by default (Scenarios A and B).
- Separate potential vulnerability reports from general feedback at the point of intake, so systematic failures are escalated rather than queued (Scenario C).
- Close the loop with reporters: acknowledge receipt, share triage outcomes, and credit validated reports subject to the reporter's consent.
Anti-patterns to avoid:
- Responding to good-faith reports with cease-and-desist letters or terms-of-service allegations (Scenario B).
- Treating the general feedback queue as the vulnerability intake channel (Scenario C).
- Acknowledging reports without tracking them through the finding lifecycle, so validated issues are never remediated or verified.
- Scoping the disclosure policy so narrowly that AI-specific issues such as prompt injection or harmful outputs fall outside it.
Financial Services. External reports about financial agents may involve regulatory implications (e.g., reports that the agent provides unregulated advice). The triage process must include regulatory escalation for reports that indicate potential regulatory non-compliance. The FCA expects firms to demonstrate receptiveness to external feedback about AI systems.
Healthcare. Reports involving patient safety must be triaged with clinical urgency. A report that the healthcare agent provides incorrect medication guidance is a patient safety issue, not just a software bug. Clinical governance must be involved in triage and remediation for safety-related reports.
Public Sector. Government agencies deploying AI agents should consider integrating their bounty intake with the National Cyber Security Centre (NCSC) vulnerability disclosure framework, which provides established processes for coordinated disclosure in the public sector context.
Basic Implementation — A vulnerability disclosure policy is published and accessible. A dedicated intake channel receives reports. A legal safe harbour commitment is in place. Acknowledgement occurs within 5 business days, and triage classifies reports by type and severity within 15 business days. Validated reports are tracked to remediation and generate scenario library entries. This level meets the minimum mandatory requirements, but the programme may be reactive rather than proactively engaged with the research community.
Intermediate Implementation — An AI-specific vulnerability disclosure policy covers AI-unique vulnerability categories. AI-aware triage staff evaluate reproducibility, scope, and model dependency. Response SLAs are calibrated to severity with automated tracking. A public acknowledgement mechanism recognises reporters. Aggregate statistics are published annually. External reports are systematically integrated into the scenario library and red-team scope.
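The annual aggregate statistics called for in 4.9 reduce to a small summarisation over the report log; a minimal sketch, with an assumed record shape:

```python
# Sketch of the 4.9 annual aggregate statistics: reports received,
# validated, and remediated, broken down by type. Record shape is assumed.
from collections import Counter


def annual_stats(reports: list[dict]) -> dict:
    return {
        "received": len(reports),
        "validated": sum(1 for r in reports if r.get("validated")),
        "remediated": sum(1 for r in reports if r.get("remediated")),
        "by_type": dict(Counter(r["type"] for r in reports)),
    }


stats = annual_stats([
    {"type": "security", "validated": True, "remediated": True},
    {"type": "safety", "validated": True, "remediated": False},
    {"type": "functionality", "validated": False},
])
```

Publishing only aggregates preserves reporter confidentiality while still demonstrating engagement with the external research community.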
Advanced Implementation — All intermediate capabilities plus a structured bounty programme with AI-specific reward calibration. The organisation actively engages with the AI security research community through conferences, publications, and collaborative research, and participates in coordinated vulnerability disclosure with industry bodies and national agencies. The external intake programme is externally benchmarked and independently audited, and predictive analysis of external report patterns identifies emerging vulnerability trends.
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Policy Accessibility
Test 8.2: Acknowledgement SLA Compliance
Test 8.3: Triage SLA Compliance
Test 8.4: Safe Harbour Verification
Test 8.5: Finding Lifecycle Compliance
Test 8.6: Scenario Library Integration
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| EU AI Act | Article 72 (Post-Market Monitoring) | Direct requirement |
| EU AI Act | Article 73 (Serious Incident Reporting) | Supports compliance |
| NIST AI RMF | GOVERN 1.2, MANAGE 2.3 | Supports compliance |
| ISO 42001 | Clause 9.1 (Monitoring), Clause 10.1 (Continual Improvement) | Supports compliance |
| DORA | Article 10 (ICT Incident Management) | Supports compliance |
| NIS2 Directive | Article 21 (Cybersecurity Risk Management), Article 23 (Reporting) | Direct requirement |
Article 72 requires providers to actively collect data about AI system performance throughout its lifetime. External reports are a critical component of post-market monitoring — they provide real-world performance data from users and researchers that complements internal monitoring. An organisation that does not maintain an external intake channel is failing to collect data that Article 72 requires it to collect.
Article 73 requires reporting of serious incidents. External reports may identify serious incidents before the organisation's internal monitoring detects them. A functioning intake channel ensures that externally identified serious incidents reach the organisation promptly, enabling timely Article 73 reporting.
Article 21 requires cybersecurity risk management measures including vulnerability handling and disclosure. Article 23 requires incident reporting. A vulnerability disclosure policy with a structured intake process directly supports Article 21 compliance. External reports that identify security incidents support Article 23 reporting obligations.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Organisation-wide — failure to receive and act on external reports allows vulnerabilities to persist and potentially be exploited, affecting all users |
Consequence chain: Without external bounty intake governance, the organisation operates in a closed loop — it can only find what it looks for internally, while external researchers, users, and adversaries interact with the agent's full attack surface. The immediate consequence is that externally discoverable vulnerabilities go unreported (or are reported through inappropriate channels that do not trigger action). The escalation consequence is that unreported vulnerabilities are eventually discovered by adversaries rather than researchers, converting potential security improvements into actual security incidents. The reputational consequence is twofold: hostile responses to researchers deter future reporting, creating a self-reinforcing cycle where the organisation becomes increasingly blind to external perspectives; and public disclosure of unpatched vulnerabilities (when responsible disclosure fails) damages trust. The regulatory consequence is non-compliance with post-market monitoring obligations that increasingly require external feedback mechanisms.
Cross-references: AG-349 (Scenario Library Governance) receives new scenarios generated from validated external reports. AG-355 (Continuous Red-Team Scheduling Governance) uses external report patterns to inform red-team scope. AG-103 (Red-Team Coverage Management) incorporates externally identified attack vectors into coverage planning. AG-095 (Prompt Injection Resilience Testing) should be informed by externally reported prompt injection techniques. AG-354 (Hidden Test Integrity Governance) must ensure that external reports do not inadvertently compromise hidden test integrity. AG-152 (Evaluation Integrity and Benchmark Leakage) governs the handling of externally reported evaluation integrity issues.