Model-to-Order Traceability Governance requires that every order generated by an AI-driven trading system be traceable backward through the complete causal chain: from the executed order to the specific model version, policy configuration, input data, feature values, signal scores, decision logic, and evidentiary basis that produced it. The traceability chain must be reconstructable after the fact, enabling a regulator, auditor, or compliance officer to answer the question "Why did this order exist?" with a complete, verifiable account that begins at the model layer and ends at the order. Without this chain, the organisation cannot demonstrate that orders were generated for legitimate reasons, cannot investigate potential market abuse, and cannot satisfy the regulatory requirement to explain algorithmic trading decisions.
Scenario A -- Regulator Requests Explanation for Suspicious Order Pattern; Firm Cannot Reconstruct Causal Chain: A European equity market regulator detects a pattern of orders placed by a firm's AI trading agent: 847 orders in a single equity over a 22-minute period, 91% of which are cancelled within 400 milliseconds. The pattern resembles layering -- a form of market manipulation where orders are placed without intent to execute, designed to move the price. The regulator issues an information request under Article 16 of the Market Abuse Regulation (MAR), requiring the firm to explain the purpose and rationale for each order. The firm's trading system logs contain the order details (price, quantity, timestamp, venue) but not the causal chain. The model that generated the orders was a reinforcement learning agent optimising for execution quality on a $28 million parent order. The cancellations were the agent legitimately adjusting to quote updates, not layering. However, the firm cannot demonstrate this because: (a) the model version that was live during the 22-minute window was not recorded, (b) the input data (order book snapshots, quote feeds) that the model consumed were not retained, (c) the signal scores and decision logic that produced each order were not logged, and (d) the policy parameters (aggressiveness, participation rate, urgency) that constrained the model were not linked to the orders. The firm's best explanation is "our AI decided to place and cancel these orders" -- which is indistinguishable from "we cannot rule out layering."
What went wrong: The system recorded the outputs (orders) but not the inputs, intermediate computations, or decision rationale that produced them. No traceability chain existed from order to model to evidence. The firm could not distinguish legitimate algorithmic behaviour from market manipulation because it had no record of why orders were generated. Consequence: MAR Article 16 non-compliance finding, regulatory investigation costing the firm $4.2 million in legal and compliance fees, temporary suspension of algorithmic trading permissions pending remediation, reputational damage with the national competent authority.
Scenario B -- Model Version Mismatch Causes Unexplained Trading Losses; Post-Incident Investigation Fails: A fixed-income trading desk deploys an AI pricing model that generates continuous two-way quotes on 230 corporate bonds. During a routine model update at 08:15, the new model version (v3.7.2) is deployed to 180 bonds, but due to a deployment race condition, 50 bonds continue running on the previous version (v3.7.1). Version 3.7.1 has a known bias in credit spread estimation that was corrected in v3.7.2. Over the next 6 hours, the 50 bonds running v3.7.1 generate quotes with systematically mispriced credit spreads. Counterparties exploit the mispricing, executing $14.3 million in trades against the firm's quotes. The desk notices the losses at 14:30 and halts quoting. During the post-incident investigation, the firm cannot determine which bonds were running which model version during which time windows because the order records do not contain model version identifiers. The firm cannot calculate the exact loss attributable to the version mismatch versus normal market movement. The incident response takes 11 days instead of the expected 2 days because investigators must reconstruct the model-to-order mapping from deployment logs, container timestamps, and order timestamps -- a forensic exercise that produces probabilistic rather than definitive results.
What went wrong: Orders did not carry model version identifiers. The deployment system did not record which model version was serving which instrument at each point in time. The traceability chain was broken at the model version link -- orders existed, but their provenance was ambiguous. Consequence: $14.3 million in trading losses with uncertain attribution, an 11-day investigation instead of the expected 2 days, inability to calculate precise P&L impact for financial reporting, external auditor qualification of the trading loss disclosure.
Scenario C -- Cross-Agent Strategy Produces Orders That No Single Model Can Explain: A crypto trading operation runs three AI agents in a coordinated strategy: Agent Alpha generates directional signals based on on-chain analytics, Agent Beta converts signals into order parameters based on liquidity analysis, and Agent Gamma executes orders across five decentralised exchanges with timing optimisation. A regulatory inquiry asks the firm to explain a series of large market orders that moved the price of a token by 8.3% over 15 minutes, triggering liquidations of $6.7 million in leveraged positions on a lending protocol. The firm attempts to reconstruct the causal chain: Agent Gamma executed the orders, but its decision was based on parameters from Agent Beta, which were based on signals from Agent Alpha. Agent Alpha's signal was based on whale wallet movement data that suggested imminent selling pressure -- the agent was front-running a detected large seller. The firm can produce Agent Gamma's execution logs and Agent Beta's parameterisation logs separately, but cannot produce an end-to-end trace linking the whale wallet observation (Alpha's input) through the signal generation (Alpha's output / Beta's input) through the parameterisation (Beta's output / Gamma's input) to the final orders (Gamma's output). Each agent's logs reference its own inputs and outputs, but the cross-agent links are not recorded. The regulatory inquiry cannot be satisfied because no single trace spans the entire causal chain.
What went wrong: Each agent maintained its own logs but no cross-agent traceability existed. The causal chain spanned three agents with three separate logging systems, and no correlation identifiers linked the chain end-to-end. The firm could explain each agent's behaviour in isolation but could not explain the system's behaviour as a whole. Consequence: Regulatory finding for inadequate record-keeping, potential market manipulation charge for front-running (which the firm cannot definitively refute due to incomplete records), $3.8 million in legal costs, mandated overhaul of the multi-agent logging architecture.
Scope: This dimension applies to any AI agent deployment that generates, modifies, cancels, or influences orders in financial markets -- including equity, fixed income, foreign exchange, commodity, and cryptocurrency markets, whether executed on regulated exchanges, over-the-counter venues, or decentralised protocols. The scope covers all orders regardless of whether they are executed, partially filled, or cancelled -- because cancelled orders are the primary subject of layering and spoofing investigations, and their traceability is as important as that of executed orders. The scope extends to multi-agent systems where the order generation process spans multiple agents, each contributing a link in the causal chain. Any system where an AI model's output directly or indirectly results in an order being submitted to a market is within scope. The traceability chain must extend from the final order backward through every intermediate decision, transformation, and data input to the originating model, policy, and evidence.
4.1. A conforming system MUST record, for every order generated by an AI agent, a traceability record that links the order to: the specific model version that produced or influenced the order, the policy configuration in effect at the time, the input data consumed by the model, the feature values and signal scores computed from the input data, and the decision logic path that selected the order parameters.
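A minimal sketch of what a per-order traceability record under 4.1 might look like, expressed as a Python dataclass. All class and field names here are illustrative assumptions, not mandated by this dimension:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass(frozen=True)
class TraceabilityRecord:
    """One record per order, linking it to model, policy, data, and logic (4.1)."""
    trace_id: str                 # unique, immutable chain identifier (see 4.2)
    order_id: str
    model_version: str            # specific model version that produced the order
    model_artifact_hash: str      # cryptographic hash of the model artefact (4.3)
    policy_hash: str              # policy configuration in effect at order time
    input_data_refs: List[str] = field(default_factory=list)   # snapshot keys (4.4)
    signal_scores: Dict[str, float] = field(default_factory=dict)
    decision_path: List[str] = field(default_factory=list)     # decision logic steps

# Hypothetical example values for a single child order
record = TraceabilityRecord(
    trace_id="trc-0001",
    order_id="ord-847",
    model_version="v3.7.2",
    model_artifact_hash="sha256:<artefact-digest>",
    policy_hash="sha256:<policy-digest>",
    input_data_refs=["lob-snapshot-20240501T101500Z"],
    signal_scores={"urgency": 0.62, "quote_imbalance": 0.81},
    decision_path=["signal above threshold", "reprice after quote update"],
)
```

The record is frozen (immutable once constructed), which mirrors the tamper-resistance intent of 4.6 at the object level, though durable immutability still requires append-only storage.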
4.2. A conforming system MUST assign a unique, immutable trace identifier to each order's causal chain, enabling reconstruction of the complete chain from order to model to evidence in a single query or retrieval operation.
4.3. A conforming system MUST record the model version identifier (including version number, deployment timestamp, and cryptographic hash of the model artefact) for each order, such that the exact model that generated the order can be identified and, if retained, re-executed against the same inputs to reproduce the decision.
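The three components required by 4.3 -- version number, deployment timestamp, and artefact hash -- can be assembled at deployment time. A sketch using Python's standard `hashlib`; the function name and dictionary shape are assumptions for illustration:

```python
import hashlib

def model_version_identifier(artifact_bytes: bytes, version: str,
                             deployed_at: str) -> dict:
    """Build the model version identifier required by 4.3: version number,
    deployment timestamp, and a cryptographic hash of the model artefact."""
    digest = hashlib.sha256(artifact_bytes).hexdigest()
    return {
        "version": version,
        "deployed_at": deployed_at,
        "artifact_hash": f"sha256:{digest}",
    }

# Hypothetical artefact bytes; in practice this would be the serialised model file
ident = model_version_identifier(b"\x00fake-model-weights",
                                 "v3.7.2", "2024-05-01T08:15:00Z")
```

Because the hash is computed over the artefact actually deployed, the version mismatch in Scenario B (v3.7.1 still serving 50 bonds) would have been detectable by comparing each order's recorded hash against the intended release hash.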
4.4. A conforming system MUST retain the input data consumed by the model for each order -- including market data snapshots, reference data, position state, and any external data feeds -- for a period sufficient to satisfy regulatory retention requirements and to enable post-hoc reconstruction.
4.5. A conforming system MUST, in multi-agent systems where the order generation chain spans multiple agents, implement end-to-end trace correlation that links each agent's contribution to the final order through a shared trace identifier, enabling reconstruction of the complete cross-agent causal chain.
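One way to satisfy 4.5 is to mint the shared trace identifier once, at the head of the chain, and propagate it unchanged through every agent hop. A minimal sketch under assumed agent and payload shapes (the three agents mirror Scenario C's Alpha/Beta/Gamma pipeline):

```python
import uuid

def new_trace() -> str:
    """Mint the shared trace identifier once, at the head of the chain."""
    return f"trc-{uuid.uuid4()}"

class Agent:
    """Minimal agent that records its hop and forwards the trace id unchanged."""
    def __init__(self, name: str, log: list):
        self.name, self.log = name, log

    def process(self, payload: dict) -> dict:
        trace_id = payload["trace_id"]  # propagated, never regenerated
        self.log.append({"trace_id": trace_id, "agent": self.name,
                         "input": payload["body"]})
        return {"trace_id": trace_id, "body": f"{self.name}({payload['body']})"}

log: list = []
alpha, beta, gamma = (Agent(n, log) for n in ("alpha", "beta", "gamma"))

msg = {"trace_id": new_trace(), "body": "whale-wallet-signal"}
for agent in (alpha, beta, gamma):
    msg = agent.process(msg)

# Every hop in the cross-agent chain shares one trace identifier,
# so the end-to-end causal chain is recoverable with a single query.
assert len({entry["trace_id"] for entry in log}) == 1
```

The failure in Scenario C corresponds to each agent minting its own identifier instead of forwarding the one it received: the per-agent logs exist, but the single-query reconstruction required by 4.2 becomes impossible.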
4.6. A conforming system MUST protect the traceability chain against tampering, ensuring that records cannot be altered, deleted, or reordered after the fact, in accordance with AG-006 (Tamper-Evident Record Integrity).
4.7. A conforming system MUST be capable of producing a human-readable explanation of any order's causal chain within the timeframe required by applicable regulatory information requests (recommended: within 72 hours for standard requests, within 24 hours for urgent requests).
4.8. A conforming system SHOULD implement automated traceability chain completeness verification that continuously checks for broken links -- orders without model references, model outputs without input data, or cross-agent handoffs without correlation identifiers -- and alerts when gaps are detected.
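The completeness check in 4.8 amounts to scanning for orders with absent or partial traceability records. A sketch, with assumed record shapes and field names:

```python
def find_broken_links(orders: list, traces: dict) -> list:
    """Flag orders whose traceability chain is missing or incomplete (4.8):
    no record at all, or a record lacking a required link."""
    required = ("model_version", "input_data_ref", "trace_id")
    gaps = []
    for order in orders:
        trace = traces.get(order["order_id"])
        if trace is None:
            gaps.append((order["order_id"], "no traceability record"))
            continue
        for link in required:
            if not trace.get(link):
                gaps.append((order["order_id"], f"missing {link}"))
    return gaps

# Hypothetical data: o1 has a complete chain, o2 lacks its input data reference
orders = [{"order_id": "o1"}, {"order_id": "o2"}]
traces = {
    "o1": {"model_version": "v3.7.2", "input_data_ref": "snap-1", "trace_id": "t1"},
    "o2": {"model_version": "v3.7.1", "input_data_ref": "", "trace_id": "t2"},
}
gaps = find_broken_links(orders, traces)
```

Run continuously (or on a sampling schedule, per the intermediate tier in this document), each gap found becomes an alert rather than a discovery made mid-investigation.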
4.9. A conforming system SHOULD retain model artefacts (trained model files, configuration, and dependencies) for each model version referenced in the traceability chain, enabling model re-execution for forensic analysis.
4.10. A conforming system MAY implement real-time traceability dashboards that allow compliance officers to inspect the causal chain of any recent order without requiring engineering assistance.
The obligation to explain why an order was placed is foundational to market integrity regulation globally. The Market Abuse Regulation (MAR) in the EU requires firms to detect, prevent, and report suspicious order patterns. The detection of market manipulation -- layering, spoofing, front-running, wash trading -- depends on understanding the intent behind orders. When orders are generated by human traders, the trader can be questioned and their intent assessed. When orders are generated by AI agents, the only way to assess intent is through records that reconstruct the decision process. Without model-to-order traceability, every order generated by an AI agent is an order whose purpose cannot be demonstrated.
This is not a theoretical risk. Regulatory investigations of algorithmic trading increasingly focus on the decision rationale behind order patterns. The pattern of placing and rapidly cancelling orders is not inherently manipulative -- legitimate execution algorithms do this routinely to manage information leakage and respond to changing market conditions. But the distinction between legitimate and manipulative cancellation patterns lies entirely in intent, and intent can only be demonstrated through records of the decision process. A firm that cannot reconstruct why its AI agent placed and cancelled 847 orders in 22 minutes cannot distinguish itself from a firm engaged in layering. The regulatory burden of proof has effectively shifted: firms must demonstrate that their orders were legitimate, not merely assert it.
Multi-agent architectures amplify the traceability challenge. When a single model generates an order, the traceability chain has two links: input data to model, model to order. When three agents collaborate -- one generating signals, one parameterising, one executing -- the chain has six links, and a break at any link renders the chain incomplete. The complexity is multiplicative, not additive, because each agent may process multiple inputs from multiple upstream agents, creating a directed acyclic graph of causal relationships rather than a simple linear chain. End-to-end trace correlation (Requirement 4.5) addresses this by mandating a shared trace identifier that spans the entire graph.
The retention of input data (Requirement 4.4) is particularly important and frequently neglected. Firms often retain the model and the order but discard the market data snapshots, reference data, and position state that the model consumed. Without this data, the model cannot be re-executed, and the traceability chain becomes "the model generated this order based on data we no longer have" -- which is functionally equivalent to no traceability at all. Input data retention is expensive in terms of storage, but the cost of being unable to respond to a regulatory investigation is categorically higher.
The tamper-evidence requirement (4.6) exists because traceability records that can be altered after the fact have no evidentiary value. If a firm could retroactively insert a plausible decision rationale for a suspicious order, the traceability chain would be meaningless. The records must be immutable -- once written, they cannot be changed. This aligns with AG-006 and with regulatory expectations for record integrity. Blockchain-based immutability, append-only databases, cryptographic hash chains, or write-once storage all satisfy this requirement; the implementation mechanism is less important than the guarantee.
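Of the mechanisms listed, a cryptographic hash chain is the simplest to sketch: each entry commits to the hash of the previous entry, so alteration, deletion, or reordering of any record breaks verification. A minimal illustration (not a production design -- real deployments would add signing, durable storage, and periodic external attestation):

```python
import hashlib
import json

def append_record(chain: list, record: dict) -> None:
    """Append-only hash chain: each entry commits to the previous entry's hash."""
    prev = chain[-1]["hash"] if chain else "genesis"
    body = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256((prev + body).encode()).hexdigest()
    chain.append({"prev": prev, "body": body, "hash": digest})

def verify(chain: list) -> bool:
    """Recompute every link; any altered, deleted, or reordered entry fails."""
    prev = "genesis"
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if hashlib.sha256((prev + entry["body"]).encode()).hexdigest() != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

chain: list = []
append_record(chain, {"order_id": "o1", "model": "v3.7.2"})
append_record(chain, {"order_id": "o2", "model": "v3.7.2"})
assert verify(chain)

# Retroactively editing a record -- e.g. swapping in a different model version --
# is detectable, because the stored hash no longer matches the body.
chain[0]["body"] = chain[0]["body"].replace("v3.7.2", "v9.9.9")
assert not verify(chain)
```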
The human-readable explanation requirement (4.7) bridges the gap between technical traceability and regulatory utility. A traceability chain stored as model weights, tensor computations, and feature vectors is technically complete but practically useless to a regulator who needs to understand why an order was placed. The system must be capable of producing an explanation that a compliance officer or regulator can understand -- not a full technical reproduction, but a narrative that identifies the key inputs, the relevant model signals, the policy constraints that shaped the decision, and the outcome. This requirement does not mandate explainable AI in the technical sense; it mandates that the traceability records be translatable into a human-comprehensible explanation.
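Because 4.7 asks for translation rather than explainable AI, the explanation generator can be a straightforward renderer over the traceability record. A sketch with hypothetical field names, producing the kind of narrative a compliance officer could hand to a regulator:

```python
def explain_order(trace: dict) -> str:
    """Render a traceability record as a narrative (4.7): key inputs, the
    dominant model signal, the governing policy, and the decision taken.
    Field names are illustrative, not mandated."""
    return (
        f"Order {trace['order_id']} was generated by model {trace['model_version']} "
        f"under policy '{trace['policy_name']}'. "
        f"Dominant signal: {trace['top_signal']} = {trace['top_signal_value']}. "
        f"Decision: {trace['decision']}."
    )

# Hypothetical record for one of Scenario A's cancel/replace events
text = explain_order({
    "order_id": "ord-847",
    "model_version": "v3.7.2",
    "policy_name": "parent-order-execution",
    "top_signal": "quote_imbalance",
    "top_signal_value": 0.81,
    "decision": "cancel resting child order and reprice after quote update",
})
```

An explanation of this form is exactly what the firm in Scenario A lacked: a per-order account distinguishing legitimate quote-driven cancellation from layering.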
Model-to-order traceability requires a logging architecture that captures the complete causal chain at the time each order is generated -- not a forensic reconstruction after the fact. The core principle is that every order should carry its own provenance, embedded at creation time, not inferred later.
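The "orders carry their own provenance" principle can be enforced at the order-construction boundary: an order without a complete provenance envelope is rejected before it can reach a market. A sketch under assumed names:

```python
def create_order(params: dict, provenance: dict) -> dict:
    """Attach provenance at order creation time, not by later reconstruction.
    An order with an incomplete envelope is rejected before submission."""
    required = ("trace_id", "model_version", "policy_hash", "input_data_ref")
    missing = [f for f in required if not provenance.get(f)]
    if missing:
        raise ValueError(f"order rejected: incomplete provenance {missing}")
    return {**params, "provenance": provenance}

# A well-formed order carries its full envelope from birth
order = create_order(
    {"symbol": "XYZ", "side": "buy", "qty": 100},
    {"trace_id": "trc-1", "model_version": "v3.7.2",
     "policy_hash": "ph-1", "input_data_ref": "snap-1"},
)

# An order missing its model version never leaves the system
try:
    create_order({"symbol": "XYZ", "side": "sell", "qty": 10},
                 {"trace_id": "trc-2"})
    rejected = False
except ValueError:
    rejected = True
```

Enforcing completeness at creation is what makes the chain a property of the order rather than a forensic exercise after the fact.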
Recommended patterns:
- Attach a complete provenance envelope (trace identifier, model version hash, policy hash, input data reference, feature summary) to every order at creation time.
- Propagate a single shared trace identifier across every agent in a multi-agent chain, so the end-to-end causal graph is recoverable in one query.
- Snapshot and retain the input data consumed by the model for every order, for the full regulatory retention period.
- Store traceability records in append-only, tamper-evident form (hash chains, write-once storage, or equivalent, per AG-006).
- Run automated chain completeness verification with alerting on broken links.
Anti-patterns to avoid:
- Logging orders without model version identifiers, forcing forensic reconstruction from deployment logs and container timestamps (Scenario B).
- Retaining the model and the order but discarding the market data, reference data, and position state the model consumed.
- Per-agent logging silos with no cross-agent correlation identifiers (Scenario C).
- Mutable traceability records that permit retroactive insertion of a decision rationale.
- Sampled or value-threshold input retention that leaves most orders without a reconstructable chain.
Traditional Equity and Fixed Income Markets. MiFID II and Regulation 2017/589 (RTS 6) impose specific record-keeping requirements for algorithmic trading, including the obligation to maintain "sufficient records" of algorithmic trading activity. RTS 6 Article 28 requires firms to keep records of "each decision to deal generated by an algorithm," including "the time of the decision, the person or the algorithm responsible, and the intended venue." Model-to-order traceability extends these requirements to the full causal chain, providing the evidence needed to demonstrate compliance with best execution obligations (AG-481) and to respond to market abuse investigations.
Cryptocurrency and Decentralised Finance. DeFi trading introduces additional traceability challenges: orders may be submitted as blockchain transactions that are publicly visible, but the on-chain record contains only the transaction parameters, not the decision rationale. Off-chain traceability records must be maintained and linked to on-chain transaction hashes. For atomic swap and cross-chain operations, the trace must span multiple blockchains with different transaction identifier formats. Organisations should implement a unified trace layer that maps internal trace identifiers to on-chain transaction hashes across all chains where the agent operates.
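The unified trace layer described above reduces to a mapping from one internal trace identifier to transaction hashes on each chain the agent touches. A minimal sketch; class name, chain labels, and hash values are all illustrative:

```python
class OnChainTraceMap:
    """Unified trace layer: links one internal trace identifier to transaction
    hashes across multiple blockchains with differing hash formats."""
    def __init__(self):
        self._links: dict = {}

    def link(self, trace_id: str, chain: str, tx_hash: str) -> None:
        """Record that an on-chain transaction belongs to this causal chain."""
        self._links.setdefault(trace_id, []).append(
            {"chain": chain, "tx_hash": tx_hash})

    def transactions(self, trace_id: str) -> list:
        """All on-chain legs of one causal chain, across every venue."""
        return self._links.get(trace_id, [])

trace_map = OnChainTraceMap()
# Hypothetical cross-chain operation spanning two venues
trace_map.link("trc-42", "ethereum", "0xabc123fake")
trace_map.link("trc-42", "solana", "5Gv9fakeHash")
txs = trace_map.transactions("trc-42")
```

With this layer, the publicly visible on-chain record and the off-chain decision rationale are joined by the same identifier, so a regulatory inquiry can move from a transaction hash to the full causal chain.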
Cross-Border Operations. Different jurisdictions impose different retention periods, different granularity requirements, and different formats for order record-keeping. The traceability system must be configurable to satisfy the most stringent applicable requirement across all jurisdictions where orders are placed. Particular attention is needed for jurisdictions that require records to be stored within their borders -- the traceability data store may need to be replicated across regions with appropriate data sovereignty controls.
Basic Implementation -- Orders are logged with model version identifiers and policy configuration references. Input data snapshots are taken for a subset of orders (e.g., orders exceeding a value threshold). Traceability is reconstructable but requires manual effort. Model artefacts are retained for the current and previous version. Single-agent systems only. Limitations: no cross-agent trace correlation; no automated chain verification; input data retention is incomplete; human-readable explanation generation requires engineering support.
Intermediate Implementation -- Every order carries a complete provenance envelope with trace identifier, model version, policy hash, input data reference, and feature summary. Input data snapshots are taken for all orders and retained for the full regulatory period. Cross-agent trace propagation is implemented using shared trace identifiers. Automated chain integrity verification runs continuously at minimum 10% sampling. Model artefacts are retained for all versions deployed in the retention period. Human-readable explanation generation is available to compliance officers through a self-service interface.
Advanced Implementation -- All intermediate capabilities plus: 100% automated chain integrity verification with real-time alerting for broken chains. Model re-execution capability allows forensic reproduction of any order's decision given the retained model artefact and input data snapshot. Real-time traceability dashboards allow compliance officers to inspect any order's complete causal chain within seconds. Cross-agent trace visualisation shows the full directed acyclic graph of agent contributions for multi-agent orders. Independent third-party audit of traceability completeness is performed annually. Traceability records are independently attested (e.g., through periodic hash publication to an independent ledger).
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Single-Order Traceability Chain Completeness
Test 8.2: Model Version Accuracy Verification
Test 8.3: Input Data Snapshot Retention and Integrity
Test 8.4: Cross-Agent Trace Correlation
Test 8.5: Human-Readable Explanation Generation
Test 8.6: Tamper-Evidence of Traceability Records
Test 8.7: Automated Chain Integrity Verification
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 12 (Record-Keeping) | Direct requirement |
| MiFID II | Article 17 (Algorithmic Trading), RTS 6 Article 28 | Direct requirement |
| SOX | Section 302/404 (Internal Controls) | Supports compliance |
| FCA SYSC | 6.1 (Compliance), 10A (Recording of Telephone Conversations and Electronic Communications) | Direct requirement |
| NIST AI RMF | MEASURE 2.6, GOVERN 1.5 (Transparency and Documentation) | Supports compliance |
| ISO 42001 | 6.1.2 (AI Risk Assessment), 9.1 (Monitoring and Measurement) | Supports compliance |
| DORA | Article 10 (Detection), Article 17 (ICT-related Incident Reporting) | Direct requirement |
Article 12 requires that high-risk AI systems be designed and developed with capabilities enabling the automatic recording of events (logs) while the system is operating. The logs must enable the tracing of the AI system's operation throughout its lifecycle. For AI-driven trading systems classified as high-risk, this translates directly to model-to-order traceability: the system must automatically record the events (inputs, model decisions, order generation) that enable tracing from any output (order) back to its origin (model input and decision). Article 12(2) specifies that the logging capabilities must enable "the monitoring of the operation of the high-risk AI system" and be "in accordance with recognised standards or common specifications." Model-to-order traceability with structured provenance envelopes and input data snapshots satisfies this requirement with a high degree of specificity.
MiFID II Article 17 requires investment firms using algorithmic trading to maintain systems and risk controls suitable to the business. Commission Delegated Regulation 2017/589 (RTS 6) elaborates in Article 28, requiring firms to keep "sufficient records of the matters set out in Annex I" for algorithmic trading, including: "the time of each decision to deal generated by an algorithm," "the person or the algorithm responsible for the investment decision," "the identification of the algorithm and the algorithm parameters generating the order," and "the intended execution venue." Model-to-order traceability extends these requirements to include the full causal chain -- not just that an algorithm generated the order, but why the algorithm generated it, based on what inputs, with what signal scores, under what policy constraints. This deeper traceability is increasingly expected by national competent authorities when investigating suspicious order patterns, even if not explicitly mandated in the current RTS text.
For publicly listed firms, SOX requires effective internal controls over financial reporting. AI-driven trading activity directly affects the firm's financial position. Model-to-order traceability is an internal control that provides: (a) the ability to verify that reported trading gains and losses were generated by authorised models operating within approved parameters, (b) evidence that no unauthorised model modifications occurred during the reporting period, and (c) the ability to attribute losses to specific model versions, policy configurations, or market events for accurate disclosure. External auditors increasingly request traceability evidence when auditing firms with significant AI-driven trading activity.
FCA SYSC 6.1 requires firms to establish, implement, and maintain adequate policies and procedures sufficient to ensure compliance with regulatory obligations. For algorithmic trading firms, this includes the ability to explain why orders were generated -- which requires model-to-order traceability. SYSC 10A extends recording obligations to electronic communications relevant to trading decisions. For AI agents, the "electronic communication" is the decision chain itself: the data inputs, model computations, and policy evaluations that constitute the agent's decision-making process. The FCA has indicated in supervisory communications that it expects firms using AI in trading to be able to reconstruct the decision rationale for any order within a reasonable timeframe.
The NIST AI Risk Management Framework identifies transparency and documentation as core governance functions. MEASURE 2.6 addresses the measurement of AI system outputs against intended behaviour, which requires traceability from outputs (orders) to the inputs and decisions that produced them. GOVERN 1.5 addresses organisational governance of AI transparency, including the documentation of AI system decision-making processes. Model-to-order traceability provides the operational implementation of these framework principles for trading AI systems.
ISO 42001 requires organisations to identify AI-related risks (6.1.2) and to implement monitoring and measurement of AI system performance (9.1). The risk that AI-generated orders cannot be explained is a core AI-related risk for trading operations. The monitoring of model-to-order traceability chain completeness -- the percentage of orders with complete chains, the frequency and type of chain breaks -- is a key performance measurement for AI system operation. ISO 42001 certification auditors will examine traceability as evidence of effective AI system governance.
DORA Article 10 requires financial entities to implement mechanisms for "the prompt detection of anomalous activities." For AI trading systems, the detection of anomalous orders requires the ability to compare the order's rationale (from the traceability chain) against expected behaviour. An order that cannot be traced to a legitimate model decision is itself an anomaly requiring investigation. Article 17 requires reporting of ICT-related incidents, including incidents where AI systems generate unexpected trading activity. The incident report must include an explanation of what happened and why -- which requires model-to-order traceability. Without traceability, the incident report can describe what happened (orders were placed) but not why (the model's decision rationale), making the report incomplete under DORA's requirements.
| Field | Value |
|---|---|
| Severity Rating | Critical |
| Blast Radius | Firm-wide -- affecting all AI-driven trading activity and regulatory standing |
Consequence chain: The absence of model-to-order traceability creates a compound failure mode that becomes most damaging precisely when the organisation most needs it -- during a regulatory investigation, a market event post-mortem, or a financial loss attribution exercise. The immediate consequence is investigative paralysis: when a regulator asks "why did your AI place these orders?" and the firm cannot answer, the investigation escalates from a routine inquiry to a presumption of non-compliance. The regulatory consequence compounds: inability to explain order rationale under MAR invites a market manipulation investigation; inability to demonstrate model version control under MiFID II invites an algorithmic trading compliance review; inability to attribute losses to specific model decisions under SOX invites an internal controls deficiency finding. Each investigation proceeds independently but draws on the same missing evidence, multiplying the firm's legal and compliance costs. The operational consequence is that model errors cannot be efficiently diagnosed -- without traceability, determining whether a loss was caused by a model bug, a data issue, a policy misconfiguration, or a legitimate market movement requires weeks of forensic reconstruction instead of hours. During those weeks, the same model error may continue generating losses because the root cause cannot be identified quickly. For multi-agent systems, the absence of cross-agent trace correlation means that blame cannot be attributed to a specific agent, preventing targeted remediation and potentially requiring a full system halt while the investigation proceeds. The reputational consequence is lasting: a firm known to regulators as unable to explain its AI trading decisions will face enhanced supervisory scrutiny for years, increasing compliance costs and constraining the firm's ability to deploy new AI trading strategies.
Cross-references: AG-481 (Best Execution Policy Binding Governance) depends on model-to-order traceability to demonstrate that execution decisions were consistent with the best execution policy. AG-482 (Quote and Offer Consistency Governance) requires traceability to verify that quotes were generated by authorised models with correct parameters. AG-487 (Surveillance Escalation Governance) uses traceability records as the primary input for investigating suspicious order patterns. AG-415 (Decision Journal Completeness Governance) provides the broader decision journaling framework within which model-to-order traceability operates. AG-418 (Cross-System Trace Correlation Governance) addresses the technical infrastructure for cross-system trace propagation that model-to-order traceability depends on. AG-398 (Cross-Agent Blame Attribution Governance) uses multi-agent traceability to attribute responsibility for outcomes across collaborating agents.