Adverse Event Reporting Integration Governance requires that AI agents operating in healthcare and life sciences contexts detect, classify, and route relevant failures, unexpected outcomes, and patient safety signals into the formal adverse event reporting channels mandated by applicable regulatory frameworks. When an AI agent contributes to, fails to prevent, or observes an adverse event — including medication errors, diagnostic failures, treatment complications, or device malfunctions — the event must be captured with sufficient detail and transmitted to the appropriate pharmacovigilance, medical device vigilance, or institutional safety reporting system within the timeframes required by regulation. This dimension ensures that AI agent failures are not siloed in technical incident logs but are integrated into the patient safety infrastructure that healthcare systems depend upon for harm prevention and learning.
Scenario A — Medication Interaction Alert Suppressed, Adverse Event Not Reported: A clinical decision-support agent evaluates a prescription for warfarin in a 67-year-old patient already taking amiodarone. The agent's drug interaction module correctly identifies the warfarin-amiodarone interaction (a well-documented interaction that potentiates anticoagulation, increasing bleeding risk by approximately 3-fold). However, a software defect in the alert presentation layer causes the interaction alert to be displayed for only 0.3 seconds before being replaced by the next screen in the workflow — effectively suppressing it. The prescribing physician does not see the alert and proceeds with the standard warfarin dose. Nine days later, the patient presents to the emergency department with an INR of 8.7 (therapeutic range: 2.0-3.0) and a subdural haematoma. Emergency craniotomy is performed; the patient survives with residual neurological deficits. Total care cost: £186,000. The software defect is identified during the root-cause analysis 6 weeks later, but no adverse event report is filed with the national medicines regulatory authority because the institution's reporting process does not include AI-mediated alert failures as a reportable category. The defect persists for an additional 4 months, affecting an estimated 340 other warfarin-amiodarone interaction alerts across the health system.
What went wrong: The adverse event — a patient harmed by an AI system's failure to effectively communicate a drug interaction alert — was not recognised as a reportable event. The institution's adverse event reporting framework predated AI-assisted prescribing and did not include AI alert failures as a reporting trigger. The event was logged as a "software bug" in the IT ticketing system but never reached the pharmacovigilance team. The 4-month delay in addressing the defect after root-cause analysis compounded the harm, as no regulatory urgency was attached to a non-reported event. Consequence: patient harm (subdural haematoma with residual deficits), £186,000 in direct care costs, estimated 340 additional patients exposed to suppressed alerts, £1.2 million settlement, regulatory investigation for failure to report a medical device adverse event.
Scenario B — Diagnostic Agent Misses Malignancy, Institutional Reporting Delay: A radiology AI agent analysing chest CT scans fails to flag a 1.4 cm spiculated pulmonary nodule in the left upper lobe of a 61-year-old patient with a 30-pack-year smoking history. The nodule meets all criteria for immediate follow-up under established lung cancer screening guidelines. The radiologist, reviewing 47 studies in a shift, relies on the AI's negative flag and does not independently identify the nodule. Eight months later, a follow-up scan reveals a 3.8 cm mass with mediastinal lymph node involvement — stage IIIA non-small cell lung cancer. The patient undergoes chemoradiation; prognosis is significantly worse than if the cancer had been detected at 1.4 cm (5-year survival approximately 36% versus approximately 68% for stage IA2). The AI's failure is identified during a retrospective audit 2 months after the follow-up scan. The hospital's quality department classifies the event as a "near miss" rather than an adverse event because the AI is categorised as an "advisory tool" and the radiologist bears clinical responsibility. No adverse event report is filed with the medical device regulatory authority. The AI vendor is notified informally by email but receives no formal adverse event report.
What went wrong: The classification of the event as a "near miss" rather than an adverse event was incorrect — the patient experienced delayed diagnosis and stage progression, which constitutes actual harm, not a near miss. The institutional framework did not have clear criteria for classifying AI-contributed diagnostic failures as adverse events. The informal vendor notification did not trigger the vendor's own regulatory reporting obligations. The regulatory authority received no report, preventing detection of a potential systematic failure across other deployment sites. Consequence: delayed cancer diagnosis with materially worse prognosis, stage progression from potentially curable to advanced disease, estimated £280,000 in additional treatment costs, £520,000 medico-legal claim, no regulatory visibility into a systematic diagnostic failure.
Scenario C — Cross-Border Agent Fails to Route to Correct National Authority: A telemedicine platform deploys a clinical triage agent serving patients across five EU member states. The agent incorrectly triages a 42-year-old patient in Germany presenting with acute chest pain and left arm numbness as "non-urgent musculoskeletal complaint," recommending ibuprofen and a follow-up appointment within 7 days. The patient suffers a myocardial infarction 4 hours later and is hospitalised for 11 days, including 3 days in the cardiac ICU. Total care cost: EUR 62,000. The platform's adverse event reporting system files a report with the national competent authority in Ireland (the platform's country of establishment) but not with BfArM (the German Federal Institute for Drugs and Medical Devices), which is the authority responsible for device vigilance in the jurisdiction where the patient received care. BfArM does not learn of the event for 14 months, when it is discovered during a routine cross-border information exchange. During those 14 months, the same triage deficiency affects an estimated 23 other acute coronary presentations across the platform's German patient population.
What went wrong: The adverse event reporting system was configured for single-jurisdiction reporting to the platform's country of establishment. EU MDR Article 87 requires reporting to the competent authority of the member state where the incident occurred, not merely the member state of establishment. The platform had no routing logic to determine the correct national authority based on the patient's jurisdiction. The 14-month reporting gap prevented the German authority from issuing a field safety corrective action and exposed an estimated 23 additional patients to the same triage failure. Consequence: patient harm (myocardial infarction with delayed treatment), EUR 62,000 in care costs, regulatory enforcement action for failure to report under EU MDR, EUR 180,000 in fines and remediation costs, 23 additional patients exposed to the defective triage algorithm.
Scope: This dimension applies to any AI agent deployed in a healthcare or life sciences context where the agent's outputs, actions, or failures could contribute to, cause, fail to prevent, or constitute evidence of an adverse event affecting patient safety. The scope includes clinical decision-support agents, diagnostic agents, triage agents, medication management agents, clinical trial management agents, medical device software agents, and patient-facing health information agents. The scope extends to the full adverse event lifecycle: detection (recognising that a reportable event has occurred), classification (determining the event's severity, causality, and reporting obligations), routing (transmitting the report to the correct regulatory authorities and institutional safety systems), and closure (confirming receipt, tracking follow-up actions, and integrating learnings per AG-423). Agents that operate exclusively in non-clinical healthcare contexts (scheduling, billing, facilities management) are excluded, provided their outputs do not influence clinical decisions or patient safety outcomes.
4.1. A conforming system MUST define and maintain a classification framework that maps AI agent failure modes to adverse event categories recognised by applicable regulatory frameworks, including but not limited to: medical device adverse events (EU MDR, FDA MDR), drug-related adverse events (EU GVP, FDA FAERS), and institutional patient safety events (WHO International Classification for Patient Safety).
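A minimal sketch of the 4.1 classification framework, in Python. The failure-mode names and the mapping itself are illustrative assumptions — a real mapping depends on the agent's device classification, intended purpose, and the markets in which it is deployed:

```python
from enum import Enum

class ReportingFramework(Enum):
    """Regulatory channels named in requirement 4.1 (not exhaustive)."""
    EU_MDR = "EU MDR device vigilance"
    FDA_MDR = "FDA medical device reporting"
    EU_GVP = "EU GVP pharmacovigilance"
    FDA_FAERS = "FDA FAERS drug adverse events"
    WHO_ICPS = "WHO patient safety classification"

# Hypothetical mapping from AI agent failure modes to the frameworks that
# must receive a report. Real mappings are a regulatory-affairs decision.
FAILURE_MODE_MAP = {
    "alert_suppression":    {ReportingFramework.EU_MDR, ReportingFramework.FDA_MDR, ReportingFramework.WHO_ICPS},
    "missed_finding":       {ReportingFramework.EU_MDR, ReportingFramework.FDA_MDR, ReportingFramework.WHO_ICPS},
    "incorrect_triage":     {ReportingFramework.EU_MDR, ReportingFramework.WHO_ICPS},
    "drug_reaction_signal": {ReportingFramework.EU_GVP, ReportingFramework.FDA_FAERS},
}

def frameworks_for(failure_mode: str) -> set:
    """Return the reporting frameworks triggered by a failure mode.
    Unknown modes default to institutional patient safety review
    rather than silently mapping to nothing."""
    return FAILURE_MODE_MAP.get(failure_mode, {ReportingFramework.WHO_ICPS})
```

The fail-safe default matters: an unrecognised failure mode should land in the institutional safety queue for human classification, never drop out of the reporting pipeline.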
4.2. A conforming system MUST implement automated detection mechanisms that identify when an AI agent's output, action, or omission has contributed to or may have contributed to a patient safety event, using defined trigger criteria that include: clinical outcome deviations from expected results, agent error or malfunction logs correlated with patient interactions, clinician override patterns indicating agent failure (per AG-525), and post-hoc identification of incorrect recommendations or missed findings.
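The 4.2 trigger criteria can be expressed as a simple predicate over the agent's operational log. The record schema below is a hypothetical minimal one — real systems would derive these flags from richer telemetry:

```python
from dataclasses import dataclass

@dataclass
class AgentEvent:
    """Minimal operational-log record (illustrative schema)."""
    patient_linked: bool              # event correlated with a patient interaction
    outcome_deviation: bool = False   # clinical outcome deviated from expected result
    error_logged: bool = False        # agent error/malfunction during the interaction
    clinician_override: bool = False  # override pattern flag (per AG-525)
    retro_flagged: bool = False       # post-hoc incorrect recommendation or missed finding

def detection_triggers(ev: AgentEvent) -> list:
    """Apply the requirement 4.2 trigger criteria. Any hit routes the
    event to adverse-event classification review."""
    triggers = []
    if not ev.patient_linked:
        return triggers  # no patient interaction -> not a patient safety signal
    if ev.outcome_deviation:
        triggers.append("outcome_deviation")
    if ev.error_logged:
        triggers.append("error_log_correlation")
    if ev.clinician_override:
        triggers.append("override_pattern")
    if ev.retro_flagged:
        triggers.append("retrospective_finding")
    return triggers
```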
4.3. A conforming system MUST route adverse event reports to the correct regulatory authorities based on the jurisdiction where the patient received care, the type of event (device vigilance, pharmacovigilance, general patient safety), and the applicable reporting obligations, within the timeframes mandated by each authority (EU MDR: serious incidents within 15 days, deaths or serious deteriorations in health within 10 days, serious public health threats within 2 days; FDA: 30-day and 5-day reporting depending on severity; national authorities as applicable).
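A sketch of the jurisdiction-aware routing that requirement 4.3 mandates — the logic Scenario C's platform lacked. The authority registry and deadline table are deliberately simplified assumptions; real obligations depend on event subtype, device class, and local guidance:

```python
from datetime import datetime, timedelta

# Simplified filing deadlines (calendar days) for the frameworks cited
# in requirement 4.3 — illustrative, not a substitute for legal review.
DEADLINES_DAYS = {
    ("EU_MDR", "serious_incident"): 15,
    ("EU_MDR", "death_or_serious_deterioration"): 10,
    ("EU_MDR", "serious_public_health_threat"): 2,
    ("FDA", "standard"): 30,
    ("FDA", "remedial_action_required"): 5,
}

# Partial, hypothetical registry of national competent authorities.
AUTHORITIES = {"DE": "BfArM", "IE": "HPRA", "FR": "ANSM"}

def route_report(care_jurisdiction: str, event_type: str, detected_at: datetime):
    """Select the competent authority from the jurisdiction where the
    patient received care (EU MDR Art. 87) — not the country of
    establishment — and compute the filing deadline."""
    authority = AUTHORITIES.get(care_jurisdiction)
    if authority is None:
        # Fail loudly: an unroutable report must surface for manual handling.
        raise ValueError(f"no competent authority on file for {care_jurisdiction}")
    deadline = detected_at + timedelta(days=DEADLINES_DAYS[("EU_MDR", event_type)])
    return authority, deadline
```

Raising on an unknown jurisdiction, rather than defaulting to the country of establishment, is the design choice that would have prevented Scenario C's 14-month reporting gap.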
4.4. A conforming system MUST include in every adverse event report: the specific AI agent involved (version, configuration, model identifiers), the clinical context (de-identified patient demographics, clinical scenario, care setting), the agent's output or action that contributed to the event, the evidence provenance chain for the agent's output (per AG-523), the severity classification (per AG-419), and the corrective actions taken or planned.
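The 4.4 minimum data set lends itself to a structured record with a completeness gate before submission. Field names below are illustrative stand-ins for the elements the requirement lists:

```python
from dataclasses import dataclass, asdict

@dataclass
class AdverseEventReport:
    """Sketch of the requirement 4.4 minimum data set."""
    agent_id: str             # agent version, configuration, model identifiers
    clinical_context: str     # de-identified demographics, scenario, care setting
    contributing_output: str  # the agent output or action that contributed
    provenance_chain: str     # evidence provenance reference (per AG-523)
    severity: str             # severity classification (per AG-419)
    corrective_actions: str   # corrective actions taken or planned

def missing_fields(report: AdverseEventReport) -> list:
    """Return the names of empty mandatory elements; a non-empty
    result blocks submission until the report is completed."""
    return [name for name, value in asdict(report).items() if not value]
```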
4.5. A conforming system MUST transmit adverse event data to the AI vendor or manufacturer when the agent is a third-party product, enabling the vendor to fulfil their own regulatory reporting obligations, within 72 hours of event detection.
4.6. A conforming system MUST maintain a bidirectional integration between the AI agent's operational logs and the institution's formal adverse event reporting system, ensuring that AI-related events are captured in the same patient safety infrastructure used for non-AI events — not siloed in separate technical incident management systems.
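The bidirectional link in requirement 4.6 can be sketched as a sync step that writes the event into the safety system and writes the resulting case ID back into the agent log, so neither system holds the event in isolation. The in-memory structures below stand in for real system clients, and the case-numbering scheme is hypothetical:

```python
def sync_event(operational_event: dict, safety_system: list, agent_log: dict) -> str:
    """Forward an AI-related event into the institutional patient safety
    system and back-reference the resulting case ID in the agent's
    operational log (requirement 4.6). Returns the safety case ID."""
    case_id = f"PSR-{len(safety_system) + 1:05d}"  # hypothetical numbering
    safety_system.append({**operational_event, "case_id": case_id, "source": "ai_agent"})
    agent_log[operational_event["event_id"]] = case_id  # back-reference
    return case_id
```

The back-reference is what keeps the event out of the Scenario A failure mode, where a "software bug" ticket never reached the pharmacovigilance team: either system can be audited against the other for unreported events.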
4.7. A conforming system SHOULD implement automated severity classification that triages adverse events according to the severity matrix defined in AG-419, routing critical events to expedited reporting pathways and lower-severity events to standard reporting timelines.
4.8. A conforming system SHOULD implement trend detection across reported adverse events, identifying patterns that may indicate systematic agent failures — such as recurring misclassification of specific clinical presentations, consistent failure to detect certain pathology types, or persistent alert suppression in specific workflow contexts.
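A deliberately simple stand-in for the 4.8 trend detection: flag any failure-mode category whose frequency in the current window exceeds a multiple of its historical baseline. The threshold parameters are illustrative assumptions; production systems would apply formal trend statistics:

```python
from collections import Counter

def flag_trends(events, baseline_counts, factor=2.0, min_count=3):
    """Flag failure-mode categories whose count in the current window
    exceeds `factor` times the per-window historical baseline, subject
    to a minimum absolute count to suppress small-number noise."""
    current = Counter(e["failure_mode"] for e in events)
    flagged = []
    for mode, n in current.items():
        baseline = baseline_counts.get(mode, 0)
        if n >= min_count and n > factor * max(baseline, 1):
            flagged.append(mode)
    return flagged
```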
4.9. A conforming system SHOULD provide feedback loops from adverse event investigations back to the agent's operational parameters, evidence corpus, and decision logic, ensuring that identified failure modes inform agent improvement (per AG-423).
4.10. A conforming system MAY implement predictive adverse event signals — monitoring agent behaviour patterns (confidence score trends, override rates, error log frequencies) that have been empirically correlated with increased adverse event risk, triggering proactive investigation before a reportable event occurs.
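One way to realise the 4.10 predictive signal is a smoothed monitor over a behavioural metric such as the per-shift clinician override rate. The smoothing constant and threshold below are illustrative, not empirically derived:

```python
def ewma_alert(override_rates, alpha=0.2, threshold=0.30):
    """Exponentially weighted moving average over a stream of per-shift
    override rates. Crossing `threshold` triggers proactive investigation
    before a reportable event occurs. Returns (alerted, final_ewma)."""
    ewma = override_rates[0]
    for rate in override_rates[1:]:
        ewma = alpha * rate + (1 - alpha) * ewma  # standard EWMA update
        if ewma > threshold:
            return True, round(ewma, 3)
    return False, round(ewma, 3)
```

Smoothing matters here: a single noisy shift should not trigger an investigation, but a sustained drift in override behaviour — of the kind AG-525 captures — should.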
Adverse event reporting is the cornerstone of post-market surveillance in healthcare. It is the mechanism through which regulators, manufacturers, and healthcare institutions learn about product and process failures after deployment. Without adverse event reporting, safety defects persist undetected, harm accumulates, and the feedback loops that drive safety improvement are broken. This principle, established over decades of medical device and pharmaceutical regulation, applies with equal force — and arguably greater urgency — to AI systems in clinical settings.
AI agents in healthcare create a novel category of adverse event reporting challenge. Traditional adverse event reporting frameworks were designed for physical medical devices (implants, infusion pumps, surgical instruments) and pharmaceuticals (drug reactions, formulation defects). These frameworks assume a tangible product with a clear causal chain: the device malfunctioned, the drug caused a reaction. AI agents introduce a different failure mode: a recommendation was wrong, an alert was suppressed, a finding was missed, a triage was incorrect. These failures are informational rather than physical, but their clinical consequences are identical — patients are harmed.
The challenge is compounded by three factors specific to AI systems. First, causality is diffuse. When a clinical decision-support agent provides an incorrect recommendation and a clinician acts on it, the causal chain includes the agent's algorithm, its training data, its evidence corpus, the clinician's decision to follow the recommendation, and the institutional context that influenced the clinician's reliance on the agent. This diffuse causality makes it tempting to classify AI-related events as clinician errors rather than device failures, obscuring the AI's contribution and preventing systematic correction.
Second, detection is delayed. An incorrect recommendation may not manifest as patient harm for days, weeks, or months — Scenario B's missed lung nodule was not identified for 8 months. During this delay, the same defect may affect many other patients. Traditional adverse event detection — which relies on proximate temporal association between product use and harm — is poorly suited to AI failures with long latency periods.
Third, jurisdiction is complex. AI agents increasingly operate across borders, particularly in telemedicine and clinical trial contexts. Adverse event reporting obligations vary by jurisdiction, and the responsible authority is typically the authority where the patient received care, not where the software company is incorporated. An agent serving patients in multiple EU member states must be capable of routing reports to each member state's competent authority — a requirement that Scenario C illustrates is not trivially met.
The regulatory imperative is clear. The EU MDR (Article 87) requires manufacturers to report serious incidents involving their medical devices to the competent authority of the member state where the incident occurred. The FDA's Medical Device Reporting regulation (21 CFR Part 803) requires manufacturers, importers, and device user facilities to report deaths, serious injuries, and malfunctions. EU Good Pharmacovigilance Practices (GVP) Module VI requires marketing authorisation holders to report drug-related adverse events. HIPAA's Breach Notification Rule requires reporting of breaches involving protected health information. Each of these frameworks creates reporting obligations that apply when AI agents are the device, the component, or the contributing factor in a patient safety event.
Beyond regulatory compliance, adverse event reporting serves the learning function that AG-423 (Incident Learning Closure Governance) depends upon. Without accurate, timely, and complete adverse event reports, the incident learning cycle cannot operate. Failures are not analysed, root causes are not identified, and corrective actions are not implemented. The same failure recurs — as in Scenario A, where a suppressed alert affected 340 patients over 4 months because the initial event was never formally reported.
Adverse Event Reporting Integration Governance requires bridging two historically separate systems: the AI agent's operational infrastructure (logs, monitoring, error tracking) and the institution's patient safety reporting infrastructure (incident reporting systems, pharmacovigilance databases, regulatory submission portals). The core implementation challenge is ensuring that events originating in the AI system reach the patient safety system with sufficient clinical context, within regulatory timeframes, and routed to the correct authorities.
Recommended patterns:
Anti-patterns to avoid:
Hospitals and Health Systems. Institutions with existing patient safety reporting systems (e.g., based on the WHO Minimal Information Model for Patient Safety Incident Reporting) must extend these systems to include AI-specific event categories. The patient safety officer's role must explicitly include oversight of AI-related adverse events. Training for clinical staff must include recognition of AI-contributed adverse events as distinct from purely human errors.
Medical Device Manufacturers. Organisations that develop and market AI-based medical devices (clinical decision-support, diagnostic imaging analysis, clinical triage) bear primary regulatory reporting responsibility under EU MDR and FDA MDR. Their post-market surveillance plans must include specific provisions for AI failure modes, including informational failures (incorrect recommendations) that may not be captured by traditional device malfunction categories.
Pharmaceutical Companies. Organisations deploying AI agents in pharmacovigilance, clinical trial management, or drug safety assessment must ensure that AI agent failures in these domains are captured in the organisation's safety database and reported through established pharmacovigilance channels. An AI agent that fails to detect a safety signal in post-market surveillance data is itself a reportable pharmacovigilance process failure.
Telemedicine Platforms. Cross-border telemedicine operators must implement jurisdiction-aware routing from day one, not as a retrofit after a regulatory enforcement action. The routing engine must be tested against realistic cross-border scenarios covering all served jurisdictions.
Basic Implementation — The organisation has defined a classification framework mapping AI agent failure modes to adverse event categories. AI-related adverse events are captured in the institution's patient safety reporting system (not only in IT systems). Reports are routed to the correct regulatory authorities based on event type and patient jurisdiction. Vendor notification occurs within 72 hours. Reports include the minimum required data elements. This level meets mandatory regulatory reporting obligations.
Intermediate Implementation — All basic capabilities plus: automated detection monitors agent operational data for adverse event signals (error logs, override patterns, confidence anomalies). Structured adverse event reports are generated automatically from available data sources and reviewed before submission. Trend detection identifies patterns across multiple events. The jurisdiction-aware routing engine supports simultaneous multi-authority reporting. Feedback loops from adverse event investigations inform agent improvement per AG-423.
Advanced Implementation — All intermediate capabilities plus: predictive adverse event signals enable proactive investigation before reportable events occur. Cross-institutional signal detection (where data sharing agreements permit) identifies systematic defects across deployment sites. Real-time dashboards show adverse event metrics by agent, clinical domain, and jurisdiction. Independent audit validates adverse event detection completeness — comparing agent operational logs against reported events to identify under-reporting. Full integration with AG-523 provenance chains ensures that every adverse event report includes the complete evidential basis for the agent's contributing output.
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Adverse Event Detection from Agent Failure
Test 8.2: Adverse Event Classification Accuracy
Test 8.3: Regulatory Routing Correctness
Test 8.4: Reporting Timeline Compliance
Test 8.5: Vendor Notification Completeness and Timeliness
Test 8.6: Bidirectional Integration Verification
Test 8.7: Trend Detection Across Multiple Events
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| EU AI Act | Article 72 (Post-Market Monitoring) | Direct requirement |
| EU MDR | Article 87 (Reporting of Serious Incidents) | Direct requirement |
| EU MDR | Article 88 (Trend Reporting) | Direct requirement |
| HIPAA | Breach Notification Rule §164.400-414 | Supports compliance |
| FDA 21 CFR Part 11 | §11.10 (Controls for Closed Systems) | Supports compliance |
| FDA 21 CFR Part 803 | Medical Device Reporting | Direct requirement |
| NIST AI RMF | MANAGE 4.1, GOVERN 1.5 | Supports compliance |
| ISO 42001 | Clause 10.1 (Nonconformity and Corrective Action) | Supports compliance |
Article 72 requires providers of high-risk AI systems to establish and document a post-market monitoring system that is proportionate to the nature and risks of the AI system. The post-market monitoring system must actively and systematically collect, document, and analyse relevant data provided by users or collected through other sources on the performance of AI systems throughout their lifetime. For clinical AI agents, adverse event reporting is the most critical component of post-market monitoring — it is the mechanism through which performance failures with patient safety consequences are detected, documented, and analysed. AG-524 implements the specific governance controls needed to ensure that post-market monitoring captures AI-contributed adverse events with sufficient detail, timeliness, and routing to satisfy Article 72's requirements. Without adverse event reporting integration, a provider's post-market monitoring system is blind to its most consequential failures.
Article 87 requires manufacturers to report serious incidents involving their medical devices to the competent authority of the member state where the incident occurred. For AI-based medical devices (clinical decision-support software, diagnostic imaging analysis, triage systems), "serious incidents" include any malfunction or deterioration in the characteristics or performance of a device that leads to, or might have led to, death or serious deterioration of health. AG-524's classification framework (Requirement 4.1) maps AI failure modes to the EU MDR's serious incident categories. The jurisdiction-aware routing engine (Requirement 4.3) ensures reports are filed with the correct member state authority. The reporting timeline requirements (Requirement 4.3) align with EU MDR's 2-day, 10-day, and 15-day reporting windows.
Article 88 requires manufacturers to report statistically significant increases in the frequency or severity of non-serious incidents or expected undesirable side effects. AG-524's trend detection requirement (Requirement 4.8) directly implements this obligation for AI-based medical devices, enabling identification of systematic patterns that may not be individually serious but collectively indicate a safety concern.
When an AI agent's failure leads to unauthorised disclosure of protected health information — for example, a de-identification failure in an adverse event report, or an AI agent transmitting patient data to an incorrect routing destination — the HIPAA Breach Notification Rule requires notification to affected individuals, the Secretary of HHS, and in some cases the media. AG-524's reporting framework must itself comply with HIPAA's privacy requirements, ensuring that adverse event reports are de-identified or transmitted through HIPAA-compliant channels, and that reporting failures do not themselves constitute breaches.
Adverse event reports for AI-based medical devices regulated by the FDA are electronic records subject to 21 CFR Part 11. The reports must be attributable (identifying the reporter, the agent, and the reviewing officer), legible (in a format readable by the receiving authority), contemporaneous (generated at or near the time of event detection), original (the authoritative record, not a reconstruction), and accurate (verified against available data). AG-524's requirements for structured report generation (Requirement 4.4) and provenance chain inclusion (referencing AG-523) support Part 11 compliance.
Part 803 establishes the mandatory reporting requirements for medical device manufacturers, importers, and device user facilities. Manufacturers must report deaths and serious injuries within 30 calendar days (or 5 work days for events requiring remedial action to prevent an unreasonable risk of substantial harm). AG-524's routing and timeline requirements (Requirement 4.3) are designed to satisfy Part 803's reporting obligations. The vendor notification requirement (Requirement 4.5) ensures that device user facilities (hospitals, clinics) transmit adverse event information to the manufacturer, enabling the manufacturer to fulfil their own Part 803 obligations.
MANAGE 4.1 addresses the documentation and communication of AI system incidents and errors. AG-524 implements this through structured adverse event reporting that documents the AI system's contribution to clinical incidents. GOVERN 1.5 addresses feedback mechanisms for AI governance. The adverse event reporting integration provides the primary feedback mechanism through which AI failures in clinical settings are communicated to governance decision-makers, enabling corrective action and governance refinement.
ISO 42001 requires organisations to react to nonconformities, evaluate the need for action, implement corrective actions, and review effectiveness. Adverse events involving AI agents are nonconformities requiring corrective action. AG-524 provides the detection and reporting infrastructure that identifies these nonconformities and initiates the corrective action process. Without adverse event reporting integration, nonconformities in clinical AI systems may go undetected, preventing the corrective action process from operating.
| Field | Value |
|---|---|
| Severity Rating | Critical |
| Blast Radius | Multi-patient, potentially population-level — unreported adverse events allow defective AI agents to continue operating, exposing all future patients to the same failure mode until the defect is independently discovered |
Consequence chain: An adverse event reporting integration failure begins with a patient safety event involving an AI agent that is not recognised, not classified, not reported, or not routed to the correct authority. The immediate consequence is that the regulatory authority responsible for overseeing the agent's safety has no visibility into the failure. Without regulatory visibility, no field safety corrective action is issued, no safety alert is distributed to other deployment sites, and no urgency is attached to correcting the underlying defect. The AI agent continues to operate with the same defect, exposing additional patients to the same failure mode. In Scenario A, 340 additional patients were exposed to suppressed drug interaction alerts over 4 months. In Scenario C, an estimated 23 patients were exposed to defective triage over 14 months. The cumulative harm from unreported events far exceeds the harm from the initial event. The regulatory consequence includes enforcement actions for failure to report (penalties for EU MDR vigilance violations are set by member states and can be substantial; FDA failure-to-report violations carry civil monetary penalties), institutional accreditation risk (failure to maintain an adequate adverse event reporting programme threatens hospital accreditation), and liability exposure (failure to report a known defect that subsequently harms additional patients creates an aggravated liability position). The systemic consequence is erosion of the patient safety reporting infrastructure — if AI-related events are systematically excluded from adverse event reporting, the healthcare system's ability to learn from AI failures and improve AI safety is fundamentally compromised.
Cross-references: AG-424 (Notification Routing Governance) provides the general notification routing framework that AG-524 specialises for adverse event reporting. AG-419 (Adverse Event Severity Matrix Governance) defines the severity classification system used to triage adverse events. AG-519 (Clinical Indication Scope Governance) defines the clinical scope constraints whose violation may constitute an adverse event. AG-522 (Medication Interaction Actuation Governance) governs the drug interaction alerting whose failure is a common adverse event trigger. AG-523 (Clinical Evidence Provenance Governance) provides the provenance chains included in adverse event reports. AG-525 (Physician Override Usability Governance) captures clinician override data that serves as an adverse event detection signal. AG-416 (Evidentiary Chain-of-Custody Governance) ensures that evidence supporting adverse event reports maintains custody integrity. AG-423 (Incident Learning Closure Governance) governs the learning process that adverse event reports feed into.