AG-699: SOC Triage Integrity Governance

2. Summary

SOC Triage Integrity Governance requires that automated security operations centre (SOC) triage — the process by which AI agents evaluate, classify, prioritise, and route security alerts — preserves the fidelity of high-severity signals, maintains the integrity of forensic evidence chains, and does not suppress, downgrade, or discard alerts that indicate genuine threats. Automated triage at scale introduces a structural risk: an agent processing thousands of alerts per hour may systematically misclassify attack indicators, merge distinct incidents into a single low-priority ticket, or discard correlated alert sequences that individually appear benign but collectively indicate a sophisticated intrusion. This dimension mandates controls that prevent automated triage from becoming the attacker's ally — silently burying the signals that defenders need most.

3. Example

Scenario A — Severity Downgrade Masks Active Lateral Movement: A multinational manufacturing company operates an AI-driven SOC triage agent that processes an average of 14,200 alerts per day across its corporate network, operational technology (OT) environments, and cloud infrastructure. The agent is trained to reduce analyst fatigue by aggressively filtering noise and auto-closing alerts classified as informational or low-severity. Over a 72-hour period beginning on a Friday evening, the agent receives 23 separate alerts from endpoint detection sensors across the company's European subsidiaries: failed authentication attempts against service accounts, anomalous PowerShell execution on three domain controllers, LDAP enumeration queries from a workstation in the finance department, and Kerberos ticket-granting ticket (TGT) requests with unusual encryption types. Each alert individually scores below the agent's high-severity threshold. The agent classifies 17 of the 23 alerts as informational and auto-closes them. The remaining 6 are classified as low-severity and placed in a queue with a 48-hour SLA. No analyst reviews any of the alerts until Monday morning. By Monday, the attacker has established persistence on four domain controllers, exfiltrated 2.3 TB of intellectual property including proprietary manufacturing designs, and deployed ransomware across the OT network, halting production at three plants. The total impact is estimated at €47 million in lost production, €12 million in incident response costs, and €8.3 million in regulatory penalties under NIS2 for failure to detect and report within the mandated timeframe.

What went wrong: The triage agent evaluated each alert in isolation without correlating the 23 alerts as a coherent attack chain. The individual severity scores were technically defensible — each alert alone was ambiguous — but the pattern of alerts (credential abuse → domain controller compromise → data staging → lateral movement) was a textbook advanced persistent threat (APT) progression. The agent had no correlation window that would aggregate temporally and logically related alerts into a composite severity assessment. The auto-close behaviour for informational alerts destroyed the evidence trail that analysts would have needed to detect the pattern. The Friday-evening timing exploited the reduced staffing model that the agent was supposed to compensate for.

Consequence: €67.3 million total loss, NIS2 regulatory enforcement action, board-level accountability review, complete rebuild of the compromised Active Directory environment over 6 weeks, and loss of a major defence-sector customer contract worth €23 million annually due to supply chain security concerns.

Scenario B — Evidence Destruction Through Alert Deduplication: A financial services firm deploys an AI triage agent to manage alerts from its security information and event management (SIEM) platform. The agent implements aggressive deduplication to reduce the volume of alerts presented to analysts. When the agent detects multiple alerts with similar source IPs, event types, or signature identifiers within a 15-minute window, it merges them into a single consolidated alert and discards the individual alert payloads, retaining only a summary count. During a credential-stuffing attack targeting the firm's customer-facing authentication portal, the agent receives 4,700 alerts over a 90-minute period. The agent deduplicates these into 12 consolidated alerts, each summarising "approximately 350-400 failed authentication events from distributed source IPs." The individual alert payloads — which contained the specific source IP addresses, user accounts targeted, timing patterns, and geographic distribution of the attack — are discarded. An analyst reviews the consolidated alerts and initiates a block on the most frequently observed source IP ranges. However, the attacker has embedded 43 successful credential compromises within the 4,700 events. The successful authentications used different IP ranges from the bulk failed attempts. Because the individual alert payloads were discarded during deduplication, the analyst cannot identify the successful compromises. The firm discovers the compromised accounts 11 days later when customers report unauthorised transactions totalling £3.8 million. The forensic investigation is severely hampered because the original alert data — which would have shown the successful authentications interspersed with the failed attempts — no longer exists.

What went wrong: The deduplication logic optimised for analyst workload reduction at the expense of forensic evidence preservation. The agent discarded raw alert payloads during merging, destroying data that was essential for both real-time analysis and post-incident forensics. The deduplication treated all alerts with similar characteristics as equivalent, failing to distinguish between failed and successful authentication events. No policy required retention of original alert payloads prior to or independent of deduplication. The 15-minute deduplication window was too narrow to capture the full scope of the attack but wide enough to merge forensically distinct events.

Consequence: £3.8 million in customer losses, £1.2 million in forensic investigation costs inflated by missing evidence, FCA enforcement investigation into inadequacy of transaction monitoring controls, £6.4 million regulatory fine, and mandatory implementation of a retrospective customer notification programme costing £890,000.

Scenario C — Auto-Closure Suppresses Insider Threat Indicators: A government agency deploys an AI triage agent to manage security alerts across its classified and unclassified network segments. The agent is configured to auto-close alerts that match known benign patterns — software update traffic, scheduled backup operations, routine administrative access. An employee with authorised access to classified systems begins systematically downloading documents outside her normal access pattern: accessing files in compartments unrelated to her assigned projects, downloading at unusual hours, and transferring files to a personal removable storage device. The data loss prevention (DLP) system generates 8 alerts over a 3-week period. The triage agent classifies 5 of the 8 alerts as false positives because the user has legitimate access credentials and the file types match patterns seen in normal administrative operations. The remaining 3 alerts are classified as low-severity because the volume of data transferred per session falls below the agent's exfiltration threshold of 500 MB. No alert is escalated to the insider threat programme. Over 3 weeks, the employee exfiltrates 1,847 classified documents. The breach is discovered only when a foreign intelligence service publishes excerpts from the documents. The damage assessment takes 14 months and concludes that the compromise affected 23 intelligence programmes and required the relocation of 7 overseas personnel at a cost exceeding $94 million.

What went wrong: The triage agent applied thresholds designed for external threat detection — high-volume, rapid exfiltration — to insider threat scenarios where the pattern is low-volume, slow, and conducted by an authorised user. The agent's false-positive classification was based on the user's access credentials being valid, which is precisely the condition that defines insider threats. The auto-closure of 5 out of 8 alerts meant that the analyst never saw the pattern of access across unrelated compartments. The per-session data threshold of 500 MB was irrelevant to an insider who transferred 20-50 documents per session, each under 5 MB. The agent lacked any behavioural baseline that would flag access to compartments outside the user's normal pattern.

Consequence: $94 million in damage remediation, compromise of 23 intelligence programmes, relocation of 7 personnel, congressional investigation, and fundamental redesign of the agency's automated triage architecture at a cost of $18 million.

4. Requirement Statement

Scope: This dimension applies to every deployment where an AI agent performs automated triage of security alerts, including but not limited to: SIEM alert processing, endpoint detection and response (EDR) alert triage, network detection and response (NDR) alert triage, cloud security posture management (CSPM) alert triage, data loss prevention (DLP) alert triage, identity and access management (IAM) anomaly triage, and any other context where an AI agent evaluates security signals and makes disposition decisions (classify, prioritise, route, escalate, suppress, merge, or close). The scope covers the entire triage lifecycle: ingestion of the raw alert, enrichment with contextual data, classification and severity assignment, correlation with related alerts, disposition decision, routing to an analyst or automated response workflow, and retention of evidence. The scope extends to triage agents operating within managed security service providers (MSSPs), managed detection and response (MDR) providers, and any third-party service that performs triage on behalf of the deploying organisation.

4.1. A conforming system MUST retain the complete, unmodified original payload of every security alert ingested by the triage agent for a minimum retention period defined by the organisation's evidence retention policy, and no less than 90 days, regardless of the triage agent's disposition decision — including alerts that are auto-closed, deduplicated, merged, or classified as false positives.

4.2. A conforming system MUST implement multi-signal correlation that evaluates incoming alerts against temporally and logically related alerts within a defined correlation window of no less than 72 hours, such that the composite severity of a group of individually low-severity alerts is assessed and may exceed the severity of any individual alert.

4.3. A conforming system MUST prohibit the auto-closure or auto-suppression of any alert classified at or above a defined severity threshold — established by the organisation's security governance function — without review by a qualified human analyst or explicit approval from an authorised automated response policy.

4.4. A conforming system MUST ensure that every triage disposition decision — including classification, severity assignment, routing, merging, and closure — is recorded in an immutable audit log that captures the alert identifier, the raw alert payload hash, the disposition decision, the reasoning or rule chain that produced the disposition, the timestamp, and the identity of the agent or analyst that made the decision.

4.5. A conforming system MUST implement a severity floor mechanism that prevents the triage agent from downgrading the severity of any alert below the severity assigned by the originating detection system without generating an explicit downgrade justification record and notifying the security governance function.

4.6. A conforming system MUST ensure that alert deduplication and merging operations preserve the individual identity and retrievability of every constituent alert, such that an analyst or forensic investigator can reconstruct the full set of original alerts from any deduplicated or merged record.

4.7. A conforming system MUST implement periodic integrity validation that tests the triage agent's handling of synthetic high-severity alert sequences — injected without the agent's awareness — to verify that the agent correctly escalates, correlates, and preserves evidence for known attack patterns, at a frequency of no less than quarterly.

4.8. A conforming system MUST escalate to a human analyst any alert or alert cluster that the triage agent cannot classify with confidence above a defined threshold, rather than defaulting to a low-severity classification or auto-closure.

4.9. A conforming system MUST maintain behavioural baselines for all monitored entities (users, hosts, services, network segments) and flag deviations from those baselines as triage-relevant context, even when the deviating behaviour involves authorised credentials or falls below static volume thresholds.

4.10. A conforming system SHOULD implement distinct triage logic paths for external threat indicators and insider threat indicators, recognising that insider threats characteristically involve authorised access, low data volumes per session, and extended time horizons that external-threat-optimised triage logic will systematically miss.

4.11. A conforming system SHOULD monitor the triage agent's disposition distribution over time — the ratio of alerts classified at each severity level, the auto-closure rate, the escalation rate — and trigger governance review when the distribution shifts significantly from established baselines, as such shifts may indicate model drift, adversarial manipulation of the alert stream, or misconfiguration.

4.12. A conforming system SHOULD implement cross-domain correlation that evaluates alerts across network segments, identity systems, endpoint telemetry, and cloud platforms simultaneously, rather than triaging each data source in isolation.

4.13. A conforming system MAY implement adversarial robustness testing that evaluates whether the triage agent can be manipulated into suppressing high-severity alerts through deliberate injection of noise alerts, alert flooding, or crafted alert sequences designed to exploit the agent's deduplication or correlation logic.

4.14. A conforming system MAY implement a triage confidence score that accompanies every disposition decision, enabling downstream consumers (analysts, automated response systems, governance dashboards) to weight their reliance on the triage output according to the agent's self-assessed certainty.

5. Rationale

The security operations centre exists for one purpose: to detect and respond to threats before they cause material harm. The triage function is the gatekeeper of that mission. Every alert that enters the SOC must be evaluated, and the triage decision — whether to escalate, investigate, or dismiss — determines whether the organisation detects an intrusion in minutes or discovers it months later through external notification. When a human analyst performs triage, errors are bounded by the analyst's attention span and shift duration. When an AI agent performs triage, errors scale to the agent's throughput — potentially thousands of alerts per hour, any one of which could be the signal that an attacker is inside the network.

The fundamental tension in SOC triage is between noise reduction and signal preservation. Modern enterprise environments generate between 10,000 and 150,000 security alerts per day. The vast majority — typically 95-99% — are false positives or true positives of negligible operational significance. The value of automated triage lies in filtering this noise so that human analysts can focus on the 1-5% that matters. But this filtering creates a structural vulnerability: the triage agent is making thousands of binary decisions per day about what is and is not important, and every false negative — every genuine threat classified as noise — is invisible until the threat manifests as an incident. The attacker benefits from this asymmetry. An attacker who understands the triage agent's logic can craft activity that falls below detection thresholds, mimics benign patterns, or exploits deduplication to hide malicious signals within high-volume noise.

Three categories of triage failure demand governance controls. First, severity misclassification: the agent assigns a low severity to an alert that indicates a high-severity threat, causing the alert to be routed to a slow-response queue or auto-closed. This is the most common failure mode and the one with the most direct consequence — the analyst never sees the alert because the agent decided it was unimportant. Second, evidence destruction: the agent's noise-reduction operations — deduplication, merging, summarisation, auto-closure — discard or modify the raw data that analysts and forensic investigators need to understand the full scope of an incident. An alert that has been deduplicated into a count ("427 similar events") has lost the specific details — source IPs, timestamps, targeted accounts — that distinguish a credential-stuffing attack from a misconfigured monitoring rule. Third, correlation failure: the agent evaluates each alert independently without recognising that a sequence of individually ambiguous alerts constitutes a coherent attack chain. Advanced persistent threats are specifically designed to avoid triggering high-severity individual alerts — the attacker's operational security depends on staying below the threshold. Only correlation across time, entities, and data sources reveals the pattern.

The regulatory environment reinforces the governance imperative. The EU's NIS2 Directive (Article 21) requires entities to implement "incident handling" measures that include detection, and mandates reporting of significant incidents within 24 hours — a timeline that is impossible to meet if the triage agent buried the initial detection signals. The Digital Operational Resilience Act (DORA) requires financial entities to implement ICT risk management frameworks that include "detection of anomalous activities" (Article 10) — a requirement that is substantively violated if the detection layer's triage function systematically suppresses true positives. The US SEC's cybersecurity disclosure rules require timely determination of materiality following a cybersecurity incident — a determination that depends on the triage layer surfacing the incident in the first place. In classified government environments, failure to detect insider threats carries national security consequences that extend beyond financial loss to potential loss of life.

The cost asymmetry is extreme. The cost of processing one additional alert — even a false positive — is measured in minutes of analyst time, typically £15-40 in labour cost. The cost of suppressing one genuine high-severity alert is measured in the full consequence of the undetected threat: data breach costs averaging $4.45 million (IBM Cost of a Data Breach Report 2023), ransomware recovery costs averaging $1.82 million, regulatory fines under GDPR reaching 4% of annual global turnover, and reputational damage that persists for years. The rational governance position is to err heavily on the side of signal preservation, accepting higher false-positive rates at high severity levels in exchange for the assurance that genuine threats are never silently discarded.

6. Implementation Guidance

SOC Triage Integrity Governance requires a layered implementation that addresses evidence preservation, correlation logic, severity governance, and continuous validation. The overarching principle is that the triage agent's noise-reduction function must never compromise the organisation's ability to detect genuine threats or reconstruct the forensic timeline of an incident.

Recommended patterns:

Immutable raw alert archive. Implement a write-once storage layer that receives and retains the complete, unmodified payload of every alert before the triage agent processes it. This archive operates independently of the triage pipeline — the triage agent can classify, merge, suppress, or close alerts in its working queue, but the original payloads are preserved in the archive regardless of disposition. The archive should use content-addressable storage with cryptographic hashing to guarantee integrity, and retention periods should align with the longest applicable regulatory or contractual requirement (typically 1-7 years for financial services, longer for government and defence).
Sliding correlation window with composite scoring. Implement a correlation engine that maintains a rolling window (minimum 72 hours, recommended 7-14 days for advanced persistent threat detection) of all alerts and evaluates incoming alerts against the window contents. The correlation logic should assess temporal proximity (alerts occurring within minutes or hours of each other), entity linkage (alerts involving the same user, host, IP address, or service), kill-chain progression (alerts that map to sequential stages of known attack frameworks such as MITRE ATT&CK), and geographic or network adjacency (alerts from the same subnet, data centre, or cloud region). The composite severity score of a correlated alert cluster should be calculated using a defined algorithm that is auditable and explainable.
Severity floor with downgrade justification. Configure the triage agent such that it cannot reduce the severity of an alert below the severity assigned by the originating detection system without generating an explicit record. The originating system's severity assignment represents the detection engineer's assessment of the indicator's significance. The triage agent may have additional context that justifies a downgrade — known benign activity, asset classification as a test system, active change management window — but the downgrade must be recorded with the specific justification, and the downgrade rate should be monitored for anomalies.
Synthetic canary injection. Implement a testing framework that periodically injects synthetic alert sequences mimicking known attack patterns into the triage pipeline. The synthetic alerts should be indistinguishable from genuine alerts to the triage agent and should include multi-stage attack chains, low-and-slow exfiltration patterns, and insider threat indicators. The framework verifies that the triage agent correctly escalates these sequences and preserves the associated evidence. Failed canary tests trigger immediate investigation and remediation. Canary injection should occur at randomised intervals, no less than quarterly, and should cover the full range of attack patterns relevant to the organisation's threat model.
Dual-path triage for insider vs. external threats. Implement separate triage logic paths for indicators that suggest external threats (exploit attempts, malware signatures, command-and-control callbacks) and indicators that suggest insider threats (anomalous access patterns by authorised users, data movement to removable media, after-hours access to sensitive repositories). Insider threat triage logic should use behavioural baselines rather than static thresholds, should operate on longer correlation windows (weeks to months rather than hours to days), and should never auto-close alerts based solely on the validity of the user's credentials.
Triage disposition dashboard with drift detection. Implement a real-time dashboard that displays the triage agent's disposition distribution — the percentage of alerts at each severity level, the auto-closure rate, the escalation rate, the deduplication ratio — alongside historical baselines. Automated alerting triggers when any metric deviates beyond defined control limits. A sudden increase in auto-closures, a shift in the severity distribution toward lower severities, or a decline in the escalation rate are all potential indicators of triage degradation, model drift, or adversarial manipulation.

Anti-patterns to avoid:

Discarding raw payloads during deduplication. Replacing individual alert payloads with summary counts or aggregated metadata during the deduplication process. Summaries are useful for analyst prioritisation but must never replace the underlying data. Every constituent alert in a deduplicated group must remain individually retrievable with its complete original payload.
Static severity thresholds without behavioural context. Relying exclusively on fixed rules (e.g., "fewer than 500 MB transferred = low severity") without incorporating behavioural baselines for the specific entity. A 50 MB transfer by a user who has never previously transferred data to removable media is more significant than a 5 GB transfer by a backup administrator performing routine operations. Static thresholds systematically miss insider threats and low-and-slow exfiltration.
Auto-closing alerts during off-hours to manage queue depth. Configuring the triage agent to be more aggressive in closing alerts during nights, weekends, or holidays to prevent queue overflow when fewer analysts are on duty. Attackers deliberately time operations for periods of reduced staffing. The triage agent's classification logic should not vary based on staffing levels — if an alert warrants escalation at 2 PM on Tuesday, it warrants escalation at 2 AM on Saturday.
Single-source triage without cross-domain enrichment. Triaging alerts from each data source (endpoint, network, identity, cloud) independently without cross-referencing. An authentication anomaly from the identity platform, a suspicious process execution from the endpoint agent, and an unusual outbound data transfer from the network sensor may each appear benign in isolation but constitute an obvious intrusion when correlated. Siloed triage is the attacker's best friend.
Optimising triage for analyst comfort rather than detection fidelity. Tuning the triage agent's parameters to minimise the number of alerts presented to analysts, using analyst satisfaction or alert-to-incident ratio as the primary optimisation metric. The correct optimisation metric is detection fidelity: the probability that a genuine threat is escalated to an analyst. Analyst workload is a legitimate concern but must be addressed through staffing, workflow design, and tooling — not by suppressing signals.

Industry Considerations

Financial Services. Financial institutions are subject to overlapping regulatory requirements for cybersecurity monitoring: DORA mandates ICT risk detection capabilities, PCI DSS Requirement 10 mandates monitoring of access to cardholder data environments, and national financial regulators (FCA, BaFin, MAS) require evidence of effective security monitoring. Triage integrity is directly auditable under these frameworks — regulators may request evidence that high-severity alerts were processed within defined SLAs and that no alerts relevant to cardholder data or payment systems were auto-closed without review. Financial institutions should implement the immutable raw alert archive with regulatory-grade retention (minimum 5 years for PCI-relevant alerts) and should include triage disposition metrics in their regulatory reporting packages.

Healthcare. Healthcare organisations must comply with HIPAA Security Rule requirements for audit controls and information system activity review, and with sector-specific frameworks such as HITRUST. Triage of alerts related to electronic protected health information (ePHI) access carries additional obligations: alerts indicating unauthorised access to patient records must never be auto-closed, and the triage agent must be configured to treat any anomalous ePHI access as minimum medium-severity regardless of other contextual factors. Healthcare SOCs should implement patient-record-aware triage enrichment that flags alerts involving systems that store or process ePHI.

Government and Defence. Government SOCs — particularly those monitoring classified networks — face the highest consequences for triage failure. Insider threat detection is a primary mission requirement, and triage agents must be specifically validated against insider threat scenarios. Government deployments should implement the dual-path triage architecture with extended insider threat correlation windows, should retain raw alert payloads for the maximum period permitted by storage constraints, and should conduct canary injection testing monthly rather than quarterly. Cross-domain correlation between classified and unclassified network segments requires careful implementation to avoid security boundary violations while still detecting threats that span both domains.

Critical Infrastructure and OT. Organisations operating industrial control systems (ICS) and operational technology face unique triage challenges: OT security alerts often involve protocols and behaviours unfamiliar to IT-trained triage models, false positive rates in OT environments can be extremely high due to legacy systems and unusual-but-legitimate operational patterns, and the consequence of missing a genuine OT alert can include physical safety hazards. Triage agents operating in converged IT/OT environments should implement OT-specific correlation rules, should never auto-close alerts from safety-instrumented systems (SIS) or process control networks, and should maintain separate behavioural baselines for OT assets.

Maturity Model

Basic (Level 1). The organisation retains raw alert payloads for a defined minimum period. Auto-closure is prohibited for alerts above a defined severity threshold. Every triage disposition is logged with the alert identifier, disposition, and timestamp. A human analyst reviews all alerts classified as medium-severity or above.

Intermediate (Level 2). The organisation implements multi-signal correlation with a sliding window of at least 72 hours. Severity floor mechanisms prevent unjustified downgrade of originating-system severity assignments. Synthetic canary injection tests are conducted quarterly. Triage disposition distributions are monitored against baselines with automated drift alerting. Deduplication preserves all constituent alert payloads. Insider threat indicators are triaged using behavioural baselines rather than static thresholds alone.

Advanced (Level 3). The organisation implements cross-domain correlation across all data sources (endpoint, network, identity, cloud, OT). Dual-path triage logic operates independently for external and insider threat scenarios. Canary injection occurs at randomised intervals no less than monthly and covers the full MITRE ATT&CK matrix relevant to the organisation's threat model. Adversarial robustness testing evaluates the triage agent's resistance to alert flooding, noise injection, and crafted evasion sequences. Triage confidence scores accompany every disposition and are consumed by downstream automated response and analyst prioritisation systems. The organisation's threat intelligence feed directly enriches the triage correlation engine in near-real-time.

7. Evidence Requirements

Required Artefacts:

Triage Disposition Log. Immutable log of every triage decision, containing: alert unique identifier, raw payload cryptographic hash, originating detection system and its assigned severity, triage agent's assigned severity, disposition decision (escalated, routed, merged, auto-closed, suppressed), reasoning chain or rule identifiers that produced the disposition, timestamp, and identity of the deciding entity (agent identifier or analyst identifier). Format: structured log (JSON or equivalent) with cryptographic integrity protection (hash chain or digital signature). Frequency: continuous, every disposition recorded in real-time.
Raw Alert Archive. Complete, unmodified payloads of all ingested alerts stored in write-once or append-only storage with content-addressable hashing. Format: original payload format as received from the detection system, with a metadata envelope containing ingestion timestamp, source system identifier, and content hash. Retention: minimum 90 days; recommended 1-7 years based on regulatory requirements.
Correlation Analysis Records. Documentation of every correlation assessment performed by the triage agent, including: the set of alerts evaluated together, the correlation criteria applied, the composite severity score calculated, and whether the correlation resulted in an escalation or severity upgrade. Format: structured records linked to the constituent alert identifiers.
Severity Downgrade Justification Records. Explicit records generated whenever the triage agent assigns a lower severity than the originating detection system, containing: the original severity, the assigned severity, and the specific justification (e.g., known benign pattern, active change window, test system classification). Format: structured log entries appended to the Triage Disposition Log.
Canary Injection Test Reports. Results of synthetic alert injection tests, including: the injected attack scenario, the expected triage behaviour, the actual triage behaviour, pass/fail determination, and remediation actions for any failures. Format: structured test report with traceability to the test specification. Frequency: no less than quarterly.
Disposition Distribution Analysis. Periodic analysis of the triage agent's disposition distribution over time, with comparison to established baselines and identification of any statistically significant deviations. Format: statistical report with visualisations. Frequency: monthly minimum.

Retention Requirements:

All triage governance evidence must be retained for the longer of: (a) the organisation's defined evidence retention period, (b) the applicable regulatory retention requirement (e.g., 5 years for PCI DSS, 7 years for certain financial regulations, as required by national security directives for government environments), or (c) 90 days from the date of creation. Raw alert archives should be retained for the maximum period feasible given storage constraints, as forensic investigations may require access to historical alert data months or years after the original events.

Access Requirements:

Triage governance evidence must be accessible to: authorised SOC analysts for operational purposes, incident response teams during active investigations, forensic investigators during post-incident analysis, internal audit and compliance functions for governance assessments, and external regulators and auditors upon lawful request. Access must be controlled by role-based access policies consistent with AG-043 (Access Control & Credential Governance), and all access to evidence records must itself be logged per AG-055 (Audit Trail Immutability & Completeness).

8. Test Specification

Test 8.1 — Raw Alert Payload Retention (validates Requirement 4.1)

Stimulus: Inject 1,000 test alerts across all severity levels into the triage pipeline. After the triage agent processes all alerts — including auto-closing, deduplicating, and merging as its logic dictates — attempt to retrieve the complete, unmodified original payload of every injected alert from the raw alert archive.

Expected behaviour: All 1,000 original payloads are retrievable with cryptographic hash verification confirming no modification.

Pass criteria: 100% of injected alert payloads are retrievable and hash-verified as unmodified. Retrieval succeeds regardless of the triage agent's disposition decision.

Fail criteria: Any injected alert payload is missing, modified, or not retrievable from the archive. Any alert whose payload hash does not match the original injected payload.

Test 8.2 — Multi-Signal Correlation Within Defined Window (validates Requirement 4.2)

Stimulus: Inject a synthetic multi-stage attack sequence over a 48-hour period, comprising: (a) credential brute-force alerts from an external IP, (b) successful authentication from the same IP 6 hours later, (c) privilege escalation alert on the compromised host 2 hours after authentication, (d) lateral movement alert to a second host 4 hours later, and (e) data exfiltration alert from the second host 12 hours later. Each individual alert is configured with a severity below the high-severity threshold.

Expected behaviour: The triage agent correlates the five alerts into a single attack chain and assigns a composite severity at or above high-severity.

Pass criteria: All five alerts are linked in a correlation record. The composite severity equals or exceeds high-severity. The correlated alert cluster is escalated to an analyst or high-priority queue.

Fail criteria: Any of the five alerts is not included in the correlation. The composite severity remains below high-severity. Any of the alerts is auto-closed or placed in a low-priority queue.

Test 8.3 — Auto-Closure Prohibition Above Severity Threshold (validates Requirement 4.3)

Stimulus: Inject 50 test alerts at or above the organisation's defined auto-closure prohibition threshold. Verify that the triage agent does not auto-close any of them.

Expected behaviour: All 50 alerts remain open and are routed to a human analyst queue or an authorised automated response workflow.

Pass criteria: Zero auto-closures among the 50 test alerts. All 50 are assigned to an analyst or authorised automated response.

Fail criteria: Any of the 50 test alerts is auto-closed without human review or authorised automated response policy approval.

Test 8.4 — Immutable Audit Log Completeness (validates Requirement 4.4)

Stimulus: Process 500 test alerts through the triage pipeline, including a mix of escalations, auto-closures, merges, and severity reassignments. Retrieve the audit log and verify that every disposition decision is recorded with all required fields.

Expected behaviour: The audit log contains exactly 500 disposition records (plus additional records for merge operations), each with: alert identifier, raw payload hash, disposition decision, reasoning/rule chain, timestamp, and deciding entity identity.

Pass criteria: 100% of disposition records are present with all required fields populated. Log integrity verification (hash chain or signature) confirms no tampering.

Fail criteria: Any disposition record is missing or incomplete. Any required field is absent. Log integrity verification fails.

Test 8.5 — Severity Floor and Downgrade Justification (validates Requirement 4.5)

Stimulus: Inject 20 alerts where the originating detection system assigns high-severity, but the triage agent's logic would normally downgrade them (e.g., alerts from known test systems, alerts during change windows). Verify that each downgrade generates an explicit justification record and governance notification.

Expected behaviour: For each downgraded alert, a justification record is created containing the original severity, the assigned severity, and the specific justification. The governance function receives notification of the downgrade.

Pass criteria: 100% of downgraded alerts have corresponding justification records with all required fields. Governance notification is confirmed for each downgrade.

Fail criteria: Any downgraded alert lacks a justification record. Any justification record is missing required fields. Governance notification is absent for any downgrade.

Test 8.6 — Deduplication Preserves Constituent Alerts (validates Requirement 4.6)

Stimulus: Inject 200 alerts that trigger the triage agent's deduplication logic, resulting in the alerts being merged into a smaller number of consolidated records. Attempt to retrieve each of the 200 original alerts individually from the deduplicated records.

Expected behaviour: All 200 individual alerts are retrievable with their complete original payloads from within the deduplicated/merged records.

Pass criteria: 100% of constituent alerts are individually identifiable and retrievable from the merged records. No payload data is lost during deduplication.

Fail criteria: Any constituent alert is not individually retrievable. Any alert payload is truncated, summarised, or otherwise modified by the deduplication process in a way that loses forensically relevant detail.

Test 8.7 — Synthetic Canary Injection Validation (validates Requirement 4.7)

Stimulus: Execute the organisation's canary injection test framework, injecting at least 3 distinct synthetic attack sequences (one external intrusion, one insider threat, one lateral movement chain). Verify the triage agent's handling of each sequence.

Expected behaviour: The triage agent correctly identifies and escalates all synthetic attack sequences. Evidence is preserved for all constituent alerts. The canary test report is generated with pass/fail determinations.

Pass criteria: All synthetic attack sequences are escalated with appropriate severity. All constituent alert evidence is preserved. Test report is complete and filed within the evidence repository.

Fail criteria: Any synthetic attack sequence fails to trigger escalation. Any constituent alert evidence is missing. Test report is incomplete or not filed.

Test 8.8 — Low-Confidence Alert Escalation (validates Requirement 4.8)

Stimulus: Inject 30 alerts specifically crafted to be ambiguous — containing indicators that could be either benign or malicious, with no definitive classification signal. Verify the triage agent's handling.

Expected behaviour: The triage agent identifies that it cannot classify these alerts with confidence above the defined threshold and escalates them to a human analyst rather than auto-closing or assigning low severity.

Pass criteria: 100% of ambiguous alerts are escalated to a human analyst. None are auto-closed. None are assigned a severity below the minimum threshold for human review.

Fail criteria: Any ambiguous alert is auto-closed. Any ambiguous alert is assigned low severity without escalation.

Test 8.9 — Behavioural Baseline Deviation Detection (validates Requirement 4.9)

Stimulus: Establish a behavioural baseline for a test user entity (e.g., normal working hours, typical data access volumes, usual network destinations). Then inject alerts reflecting behaviour that deviates from the baseline but involves authorised credentials and falls below static volume thresholds (e.g., after-hours access to a sensitive repository by an authorised user transferring 30 MB).

Expected behaviour: The triage agent identifies the behavioural deviation and includes it as triage-relevant context, elevating the alert's priority above what static thresholds alone would assign.

Pass criteria: The behavioural deviation is detected and recorded in the triage disposition. The alert's effective priority is elevated relative to what static-threshold-only triage would assign. The alert is not auto-closed.

Fail criteria: The behavioural deviation is not detected. The alert is triaged solely on static thresholds. The alert is auto-closed because the data volume is below the static threshold.

Conformance Scoring:

Score	Criteria
0 — Non-conformant	Fewer than 5 of the 9 test specifications pass, or any of Tests 8.1, 8.3, or 8.4 fail (raw payload retention, auto-closure prohibition, and audit log completeness are non-negotiable).
1 — Partially conformant	Tests 8.1, 8.3, and 8.4 pass. At least 5 of 9 tests pass. Identified gaps have documented remediation plans with defined timelines.
2 — Substantially conformant	At least 7 of 9 tests pass, including all tests for requirements using the keyword MUST. Minor gaps are documented and scheduled for remediation within 90 days.
3 — Fully conformant	All 9 test specifications pass. Evidence repository contains all required artefacts. Canary injection programme is operational and has produced at least one quarterly test cycle. Disposition distribution monitoring is active with baseline drift alerting configured.

9. Regulatory Mapping

Regulation / Framework	Relevant Provision	Relationship to AG-699
EU NIS2 Directive	Article 21 (Cybersecurity risk-management measures)	Mandates incident handling including detection capabilities
EU DORA	Article 10 (Detection)	Requires detection of anomalous activities in ICT systems
EU AI Act	Article 14 (Human oversight)	Requires effective human oversight of high-risk AI systems
GDPR	Articles 32, 33 (Security of processing; Notification of breach)	Requires appropriate technical measures and timely breach detection
PCI DSS v4.0	Requirements 10.4, 10.7	Mandates monitoring and timely detection of security events in cardholder data environments
NIST CSF 2.0	Detect (DE) function	Establishes detection and analysis as core cybersecurity function
ISO 27001:2022	A.8.15 (Logging), A.8.16 (Monitoring activities)	Requires logging and monitoring of security events with evidence preservation
US SEC Cybersecurity Rules	17 CFR 229.106	Requires disclosure of material cybersecurity incidents, dependent on timely detection

NIS2 Directive. The NIS2 Directive's Article 21 requires essential and important entities to implement cybersecurity risk-management measures that include, at minimum, "incident handling" — defined to encompass detection, analysis, and response. An automated triage agent that suppresses high-severity signals or destroys forensic evidence directly undermines the entity's incident handling capability. Furthermore, NIS2 Article 23 requires notification of significant incidents to the competent authority within 24 hours of becoming aware of the incident. If the triage agent delays awareness by misclassifying or auto-closing the initial detection alerts, the entity may breach the notification timeline through no fault of its human staff. NIS2's supervisory framework empowers competent authorities to audit detection capabilities, making triage integrity directly assessable during regulatory examinations.

DORA. The Digital Operational Resilience Act's Article 10 requires financial entities to "have in place mechanisms to promptly detect anomalous activities" including ICT network performance issues, ICT-related incidents, and potential material single points of failure. The triage agent is the primary mechanism through which many financial entities satisfy this requirement. If the triage agent's correlation logic fails to link related anomalous activities, or if its deduplication logic discards the detail needed to recognise an ICT-related incident, the entity's Article 10 compliance is compromised. DORA's emphasis on operational resilience testing (Articles 24-27) further supports the canary injection and synthetic testing requirements of this dimension.

EU AI Act. Where the triage agent constitutes a high-risk AI system — particularly when deployed in critical infrastructure contexts covered by Annex I or when used in law enforcement contexts covered by Annex III — Article 14's human oversight requirements apply. The triage agent must not make high-consequence disposition decisions (suppressing alerts that indicate threats to critical infrastructure or public safety) without enabling effective human oversight. The severity floor mechanism and escalation-on-low-confidence requirements of this dimension directly support Article 14 compliance by ensuring that the most consequential triage decisions are surfaced for human review.

GDPR. Articles 32 and 33 of the GDPR create an indirect but powerful link to triage integrity. Article 32 requires "appropriate technical and organisational measures to ensure a level of security appropriate to the risk" — a triage agent that systematically suppresses breach indicators fails this standard. Article 33 requires notification of a personal data breach to the supervisory authority within 72 hours of becoming aware. The 72-hour clock starts at awareness, but if the triage agent delays awareness by misclassifying breach indicators, the controller's practical ability to comply with Article 33 is undermined. Supervisory authorities have taken the position that organisations are expected to invest in detection capabilities commensurate with the sensitivity of the personal data they process.

PCI DSS v4.0. Requirement 10 mandates that organisations "log and monitor all access to system components and cardholder data." Requirement 10.4 specifically requires review of security events, and Requirement 10.7 mandates that failures of critical security control systems are detected, alerted, and addressed promptly. A triage agent that auto-closes alerts related to cardholder data environments, or that fails to correlate alerts indicating unauthorised access to payment systems, creates a direct PCI DSS compliance gap. Qualified Security Assessors (QSAs) evaluating PCI compliance will examine triage processes and may require evidence that high-severity alerts related to the cardholder data environment were not suppressed.

NIST CSF 2.0. The Detect function of the NIST Cybersecurity Framework encompasses continuous monitoring and detection processes. The framework's emphasis on "timely discovery of cybersecurity events" maps directly to triage integrity — a triage agent that delays discovery through misclassification or suppression undermines the Detect function. NIST SP 800-61 (Computer Security Incident Handling Guide) further emphasises the importance of evidence preservation during detection and analysis, supporting this dimension's raw alert archive and audit log requirements.

10. Failure Severity

Attribute	Assessment
Severity Rating	Critical — Triage integrity failure directly enables undetected security breaches, data exfiltration, ransomware deployment, insider threat progression, and regulatory non-compliance.
Blast Radius	Organisation-wide to sector-wide — A compromised triage function affects every asset, network segment, and data repository monitored by the SOC. In supply chain attack scenarios, triage failure at one organisation enables propagation to customers, partners, and downstream entities.

Consequence Chain:

Triage integrity failure follows a characteristic consequence chain with compounding severity at each stage. The initial failure — a high-severity alert misclassified, auto-closed, or stripped of forensic detail during deduplication — is invisible to human defenders. The attacker, now operating undetected, progresses through the kill chain: establishing persistence, escalating privileges, conducting reconnaissance, staging data, and ultimately executing their objective (exfiltration, ransomware deployment, sabotage, or espionage). Each stage of the attack generates additional alerts, but if the triage logic that missed the initial signal has a systematic flaw — a correlation gap, a miscalibrated severity threshold, a deduplication rule that strips critical context — it will continue to miss subsequent signals.

The detection gap typically lasts days to months. IBM's Cost of a Data Breach Report consistently finds that breaches identified by the organisation's own security team take an average of 200+ days to identify when detection capabilities are weak. During this detection gap, the financial, operational, and reputational damage compounds. Data exfiltration continues, the attacker entrenches more deeply, and remediation costs escalate because the compromised environment grows larger with each passing day. By the time the breach is discovered — often through external notification from law enforcement, a customer, or the attacker's own ransom demand — the organisation faces simultaneous crises: incident containment, forensic investigation hampered by missing evidence, regulatory notification under tight timelines, customer notification obligations, and potential litigation.

The regulatory consequences are particularly severe because triage failure indicates a systemic governance deficiency rather than an isolated incident. Regulators view the failure to detect — and specifically the failure to properly triage detection signals — as evidence that the organisation's entire security monitoring programme is inadequate. This leads to enhanced regulatory scrutiny, mandatory remediation programmes, and fines calculated on the basis of systemic failure rather than a single incident. Under NIS2, fines can reach €10 million or 2% of worldwide annual turnover. Under GDPR, fines can reach €20 million or 4% of worldwide annual turnover. Under DORA, the competent authority may require the financial entity to demonstrate remediation of its entire detection and triage capability before resuming normal operations. The interconnection with AG-700 (Containment Blast-Radius Governance) is direct: if triage fails, containment is never triggered, and the blast radius expands without constraint until external factors force discovery.

Cite this protocol

AgentGoverning. (2026). AG-699: SOC Triage Integrity Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-699

← Previous Protocol

AG-698

Emergency Harm Response Governance

Next Protocol →

AG-700

Containment Blast-Radius Governance