Authoritative Source Register Governance requires that every AI agent system maintain a governed, versioned register that explicitly identifies the authoritative data source for each critical decision domain. The register binds each decision category to exactly one canonical source (or a defined priority-ordered fallback chain), preventing agents from consuming stale, duplicated, or unofficial data that conflicts with the organisation's recognised truth. Without a governed register, agents independently discover and consume data from whichever source responds fastest or appears most complete — creating silent divergence in which two agents making the same type of decision reach different conclusions because they draw from different sources.
Scenario A — Conflicting Customer Address Sources: An enterprise deploys an AI claims-processing agent that needs customer addresses to validate insurance claims. The organisation maintains customer addresses in three systems: the CRM (updated by sales teams), the billing platform (updated by finance), and the identity verification system (updated during KYC onboarding). No authoritative source register exists. The agent queries all three and receives conflicting results for 12% of customers. For customer C-4471, the CRM shows a London address, the billing system shows a Manchester address, and the KYC system shows a Birmingham address. The agent selects the most recently updated record (billing — Manchester) and processes a claim. The customer's actual address is Birmingham (the KYC record), and the claim is flagged by the fraud team because the loss location does not match the customer address. Investigation reveals 847 claims over six months where the agent used non-authoritative address data, requiring manual re-review at a cost of £34 per claim (£28,798 total) and a regulatory notification under FCA Principle 3 (management and control).
What went wrong: No register designated which source was authoritative for customer addresses in claims processing. The agent applied its own heuristic (most recently updated), which was not aligned with the organisation's data governance policy that designates KYC records as authoritative for identity and address data. The failure was silent — the agent processed claims confidently using the wrong source.
Scenario B — Shadow Data Source Adoption: A financial analysis agent is configured to use the organisation's official pricing feed (Bloomberg terminal data via API) for equity valuations. An engineer, during testing, adds a secondary connection to a free market data API to compare results. The test configuration is promoted to production. Over time, the free API begins returning prices with a 15-minute delay during high-volatility periods. The agent, lacking a source priority register, begins preferring the free API when the Bloomberg connection experiences transient latency spikes (approximately 200ms). During a market correction, the agent values a portfolio using 15-minute-stale prices, underestimating a £2.3 million loss by £410,000. The risk management team does not detect the discrepancy for 4 hours because the agent's output appears internally consistent.
What went wrong: The system had no enforced register of authoritative sources. A test data source entered production without governance review. The agent's source-selection logic operated outside any controlled framework. The organisation could not determine, after the fact, which source the agent had used for any given valuation without forensic log analysis.
Scenario C — Regulatory Reference Data Conflict: A compliance-screening agent checks transactions against sanctions lists. The organisation subscribes to two sanctions data providers: Provider A (updated hourly) and Provider B (updated daily). No authoritative source register exists. The agent is configured to check both and flag if either returns a hit. However, Provider A introduces a false positive for entity "Meridian Trading Ltd" that persists for 72 hours before correction. Provider B never lists this entity. The agent blocks 34 legitimate transactions totalling £1.7 million over 3 days. When the compliance team investigates, they cannot determine the organisation's policy on which provider is authoritative — different team members assert different positions. The organisation suffers client relationship damage and compensates affected counterparties £23,000 in expedited processing fees.
What went wrong: Without a governed register defining the authoritative sanctions source and the conflict-resolution policy (e.g., "flag if Provider A hits, escalate if only Provider B hits"), the agent applied an overly conservative union rule that was never formally approved.
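A governed conflict-resolution policy of the kind described above can be expressed directly in code, which makes the approved rule testable and removes the agent's discretion. The following sketch is illustrative only — the function name, enum values, and the specific "Provider A authoritative, Provider B escalates" rule are assumptions drawn from the example policy quoted in this scenario, not a mandated implementation:

```python
from enum import Enum

class ScreeningAction(Enum):
    CLEAR = "clear"          # no hit from either provider
    FLAG = "flag"            # block pending review
    ESCALATE = "escalate"    # route to compliance review; do not auto-block

def resolve_sanctions_hit(provider_a_hit: bool, provider_b_hit: bool) -> ScreeningAction:
    """Governed conflict-resolution rule: Provider A is authoritative.

    A hit from the authoritative provider flags the transaction; a hit
    only from the secondary provider escalates for human review instead
    of blocking — avoiding the un-approved union rule from Scenario C.
    """
    if provider_a_hit:
        return ScreeningAction.FLAG
    if provider_b_hit:
        return ScreeningAction.ESCALATE
    return ScreeningAction.CLEAR
```

Because the rule is a pure function, it can be unit-tested against the approved policy before deployment, and any change to it passes through the same change control as the register itself.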
Scope: This dimension applies to all AI agents that consume external data to inform decisions, generate outputs, or trigger actions. Any agent that reads from a database, API, file system, message queue, vector store, or any other data source to support its reasoning or action execution is within scope. The scope extends to agents that consume data indirectly — an agent that receives pre-processed data from a pipeline is within scope because the pipeline's source selection affects the agent's decision quality. Read-only analytics agents are within scope because their outputs inform human decisions. The only exclusion is agents that operate solely on user-provided input within a single session with no persistent data retrieval. Where an agent consumes data from another agent's output, the upstream agent's source register governs, and the downstream agent's register MUST reference the upstream agent as an intermediary source.
4.1. A conforming system MUST maintain a versioned register that maps each critical decision domain to its designated authoritative data source, including source identifier, data owner, update frequency, and approved access method.
4.2. A conforming system MUST enforce source selection at the data access layer so that agents retrieve data only from the registered authoritative source for each decision domain, independent of the agent's reasoning or configuration.
4.3. A conforming system MUST block agent access to data sources not listed in the authoritative source register for decision-critical data categories, rather than permitting unregistered source consumption.
4.4. A conforming system MUST version every change to the authoritative source register with attribution, timestamp, and approval reference, retaining the full change history.
4.5. A conforming system MUST define a conflict-resolution policy for each decision domain where multiple sources exist, specifying priority order, fallback conditions, and escalation triggers.
4.6. A conforming system SHOULD validate that the designated authoritative source meets defined data quality thresholds (per AG-311) before registering it as authoritative.
4.7. A conforming system SHOULD implement automated monitoring that alerts when an agent attempts to access a non-registered source or when the authoritative source becomes unavailable.
4.8. A conforming system SHOULD publish the authoritative source register to all consuming systems in a machine-readable format, enabling automated enforcement.
4.9. A conforming system MAY implement a provisional source mechanism that allows time-limited use of a non-authoritative source during outages, subject to automatic expiry and audit trail.
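Requirements 4.2, 4.3, and 4.9 can be combined in a single resolution function at the data access layer: the agent asks for a decision domain, and the layer returns the one source it may read, honouring any time-limited provisional override and rejecting unregistered domains. The sketch below is a minimal in-memory illustration under stated assumptions — the domain keys, source identifiers, and error type are hypothetical, and a production system would back the register with governed, versioned storage rather than module-level dictionaries:

```python
import time

class UnregisteredSourceError(Exception):
    """Structured rejection raised when no authoritative source is registered (4.3)."""

# Minimal register: decision domain -> authoritative source identifier (4.1).
REGISTER = {
    "customer_address.claims": "kyc_system",
    "equity_price.valuation": "bloomberg_api",
}

# Time-limited provisional sources activated during outages (4.9):
# domain -> (source_id, expiry as epoch seconds).
PROVISIONAL = {}

def resolve_source(domain: str, now: float = None) -> str:
    """Return the only source an agent may read for this domain (4.2)."""
    now = time.time() if now is None else now
    provisional = PROVISIONAL.get(domain)
    if provisional is not None:
        source_id, expiry = provisional
        if now < expiry:
            return source_id
        del PROVISIONAL[domain]  # automatic expiry: revert to the register
    try:
        return REGISTER[domain]
    except KeyError:
        raise UnregisteredSourceError(
            f"no authoritative source registered for {domain!r}"
        ) from None
```

The key design choice is that the agent never sees the register's internals: it supplies a domain and receives a source, so its reasoning cannot reintroduce the ad hoc heuristics described in the scenarios above.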
The authoritative source register is the foundational artefact for data governance in AI agent systems. Without it, every other data quality control operates on an unstable foundation — an organisation can enforce quality thresholds (AG-311), track lineage (AG-133), and classify sensitivity (AG-128), but if the agent is consuming data from the wrong source entirely, these controls protect the wrong data.
The problem is acute in AI agent systems because agents operate at machine speed across multiple data sources. A human analyst who encounters conflicting data from two systems will typically pause, investigate, and ask a colleague which source is authoritative. An AI agent, absent explicit governance, will apply whatever heuristic its reasoning process generates — most recent, most complete, fastest response, or an unpredictable combination. This heuristic is invisible to the organisation and may change between requests based on context window content.
The register serves three functions. First, it is a governance artefact — it documents the organisation's approved position on which source is the canonical truth for each data domain, making the decision auditable and challengeable. Second, it is an enforcement mechanism — when implemented at the data access layer, it structurally prevents agents from consuming non-authoritative data, rather than relying on the agent to select correctly. Third, it is a change control instrument — by versioning every change, the organisation can reconstruct at any historical point which source was authoritative, enabling root cause analysis when data quality issues arise.
The failure mode without a register is not dramatic — it is silent. Agents produce outputs that appear correct but are based on non-authoritative data. The divergence may be small initially but compounds over time as source discrepancies accumulate. By the time the problem surfaces, it typically manifests as unexplained inconsistencies across agent outputs, failed audits, or regulatory findings, rather than a clean failure that triggers immediate investigation.
The authoritative source register is a structured data artefact — not a prose document — that maps decision domains to data sources. Each entry in the register specifies: the decision domain (e.g., "customer address for claims processing"), the authoritative source (system name, endpoint, and access method), the data owner (individual or role accountable for source accuracy), the update frequency (how often the source is refreshed), the quality baseline (minimum quality score from AG-311), and the conflict-resolution policy (what happens when this source is unavailable or when multiple sources exist).
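The entry fields listed above translate naturally into a typed record. The following is a hedged sketch of one possible shape — the class name, field names, and example values are illustrative, not a prescribed schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RegisterEntry:
    decision_domain: str    # e.g. "customer address for claims processing"
    source_system: str      # authoritative system name
    endpoint: str           # approved access method / endpoint
    data_owner: str         # individual or role accountable for source accuracy
    update_frequency: str   # how often the source is refreshed
    quality_baseline: float # minimum quality score (per AG-311)
    fallback_chain: tuple = ()  # priority-ordered fallbacks, empty if none
    version: int = 1        # incremented on every governed change (4.4)

# A hypothetical entry matching Scenario A's corrected policy.
entry = RegisterEntry(
    decision_domain="customer address for claims processing",
    source_system="kyc_system",
    endpoint="https://kyc.internal/api/v2/customers",
    data_owner="Head of Financial Crime Operations",
    update_frequency="on KYC event",
    quality_baseline=0.95,
)
```

Making the record immutable (`frozen=True`) reinforces the change-control requirement: an entry is never edited in place, only superseded by a new version with attribution and approval reference.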
Recommended patterns:
Anti-patterns to avoid:
Financial Services. Market data sourcing is subject to exchange licensing agreements and regulatory requirements for best execution. The authoritative source register must align with the firm's market data policy and reflect contractual entitlements. Pricing sources for valuation must be consistent with IFRS 13 fair value hierarchy requirements. Regulatory reference data (LEI, ISIN, sanctions lists) must be sourced from approved providers with defined update cadence.
Healthcare. Clinical decision support agents must draw from authoritative medical knowledge bases (e.g., BNF for prescribing, NICE guidelines for treatment pathways). The register must specify which knowledge base version is authoritative and how updates are propagated. Patient data must be sourced from the designated clinical record system, not from secondary copies in departmental systems.
Public Sector. Government agents processing citizen data must draw from authoritative registers (e.g., electoral roll, land registry, HMRC records). Data sharing agreements must be reflected in the register. The register must account for jurisdictional boundaries — data authoritative in one jurisdiction may not be authoritative in another.
Basic Implementation — The organisation has documented which data source is authoritative for each major decision domain used by agents. The register exists as a structured document (spreadsheet or configuration file) reviewed quarterly. Enforcement is implemented as validation rules in the data pipeline that check source tags against the register. Non-authoritative source access generates warnings in logs but is not blocked. The register is versioned in the organisation's document management system.
Intermediate Implementation — The register is stored as a machine-readable configuration enforced at the data access layer. Agents cannot access data sources outside the register for decision-critical domains — requests are blocked with structured rejection codes. Every change to the register follows a defined approval workflow with attribution and audit trail. Automated monitoring alerts when agents attempt non-registered source access or when an authoritative source's availability drops below defined thresholds. Fallback chains are tested monthly.
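The "blocked with structured rejection codes" and "automated monitoring alerts" behaviours at the intermediate tier can be sketched as a single access check. The rejection code, field names, and registered-source mapping below are assumptions for illustration — a real deployment would emit the alert to its monitoring pipeline rather than a local logger:

```python
import json
import logging
import time

log = logging.getLogger("source_register")

# domain -> set of source identifiers permitted by the register.
REGISTERED = {"customer_address.claims": {"kyc_system"}}

def check_access(domain: str, source_id: str) -> dict:
    """Return a structured allow/deny decision; alert on denial (4.3, 4.7)."""
    allowed = source_id in REGISTERED.get(domain, set())
    decision = {
        "domain": domain,
        "source": source_id,
        "allowed": allowed,
        "code": "OK" if allowed else "ERR_UNREGISTERED_SOURCE",
        "ts": time.time(),
    }
    if not allowed:
        # Non-registered access attempt: structured alert for monitoring (4.7).
        log.warning("non-registered source access attempt: %s",
                    json.dumps(decision))
    return decision
```

Returning a structured decision rather than a bare boolean gives downstream systems (and auditors) the domain, source, code, and timestamp needed to reconstruct any rejection after the fact.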
Advanced Implementation — All intermediate capabilities plus: the register is integrated with the organisation's data catalogue and automatically updated when data sources are provisioned, retired, or reconfigured. Register enforcement has been verified through adversarial testing including source-spoofing, register-manipulation, and fallback-exploitation attacks. Source health metrics (availability, latency, quality scores) feed into dynamic fallback activation. The organisation can demonstrate to regulators at any historical point which source was authoritative for any decision domain and which source was actually used for any specific agent decision.
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Registered Source Enforcement
Test 8.2: Non-Registered Source Blocking
Test 8.3: Register Versioning Integrity
Test 8.4: Conflict Resolution Enforcement
Test 8.5: Register Tampering Resistance
Test 8.6: Stale Register Detection
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 10 (Data and Data Governance) | Direct requirement |
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| BCBS 239 | Principle 2 (Data Architecture and IT Infrastructure) | Direct requirement |
| FCA SYSC | 6.1.1R (Systems and Controls) | Supports compliance |
| NIST AI RMF | MAP 2.3, MANAGE 1.3 | Supports compliance |
| ISO 42001 | Clause 8.4 (AI System Operation) | Supports compliance |
| GDPR | Article 5(1)(d) (Accuracy) | Supports compliance |
Article 10 requires that training, validation, and testing data sets be subject to appropriate data governance practices. For AI agents consuming operational data, this extends to ensuring that data used for decision-making is sourced from governed, authoritative origins. The requirement for "relevant, sufficiently representative, and to the extent possible, free of errors and complete" data presupposes that the organisation knows which source is authoritative. AG-309 provides the governance artefact that establishes this foundation.
BCBS 239 requires banks to design data architecture and IT infrastructure that fully supports risk data aggregation capabilities and risk reporting practices. Principle 2 specifically requires that data be sourced from authoritative systems. For AI agents used in risk functions, the authoritative source register directly implements this requirement by documenting and enforcing which system is authoritative for each risk data domain.
The accuracy principle requires that personal data be accurate and, where necessary, kept up to date. An AI agent that consumes personal data from a non-authoritative source may process inaccurate data, violating this principle. The authoritative source register ensures that agents consume personal data from the designated source of truth, supporting the controller's obligation to maintain accuracy.
Firms must establish adequate policies and procedures for data governance. AI agents consuming market data, client data, or regulatory reference data must draw from sources that the firm has designated as authoritative under its data governance framework. The register provides the auditable evidence that such designation exists and is enforced.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Decision-wide — affects all decisions made using the non-authoritative source, potentially spanning multiple agents and business processes |
Consequence chain: Without an authoritative source register, agents consume data from uncontrolled sources. The immediate failure is silent — outputs appear correct but are based on non-canonical data. Over time, divergence accumulates: different agents using different sources for the same decision domain produce inconsistent results. The operational impact includes incorrect decisions at machine speed (claims processed against wrong addresses, valuations computed from stale prices, compliance screening against incomplete lists), each compounding before detection. The financial impact scales with decision volume — at 10,000 agent decisions per day with a 2% non-authoritative source rate, 200 decisions daily are potentially compromised. The regulatory impact includes findings for inadequate data governance (BCBS 239 non-compliance in banking, FCA Principle 3 findings, GDPR accuracy violations). The reputational impact includes loss of stakeholder confidence when the organisation cannot demonstrate which data source informed a specific decision. The remediation cost is high because identifying which historical decisions were affected requires forensic analysis across all agent logs and data sources.
Cross-references: AG-128 (Data Source Classification) establishes the classification taxonomy that the register references. AG-133 (Source Record Lineage) traces individual records back through sources identified in the register. AG-311 (Data Quality Threshold Enforcement Governance) enforces quality thresholds that authoritative sources must meet. AG-006 (Tamper-Evident Record Integrity) protects the register itself from unauthorised modification. AG-132 (Vector Store and RAG) — vector store ingestion pipelines must respect the authoritative source register when selecting documents for embedding. AG-310 (Field-Level Criticality Governance) classifies which fields within a source are decision-critical, informing which fields require authoritative sourcing.