The Standard

The 841 Dimensions Regulatory Mapping Version History

Compliance

Compliance Leaderboard Platform Comparison

Verification

Submit for Verification Self-Assessment Tool

About

About AgentGoverning Press & Media

Contact

AG-269

Policy Version Pinning Governance

Policy Semantics, Rule Engine & Control Logic ~16 min read AGS v2.1 · April 2026

EU AI Act FCA NIST ISO 42001

2. Summary

Policy Version Pinning Governance requires that every runtime decision made by an AI agent is bound to an explicit, immutable, content-addressable policy version that can be reconstructed and re-evaluated at any future point. The policy version identifier accompanying each decision must resolve to the exact rule set, precedence table, and configuration that was active at the moment the decision was made. Without version pinning, post-incident forensic analysis, regulatory audit, and reproducible compliance verification are impossible because the policy state at the time of an incident cannot be determined with certainty. This dimension ensures that policy is treated as a versioned artefact with the same rigour as source code in safety-critical systems.

3. Example

Scenario A — Regulatory Investigation Hits an Unversioned Policy Store: A financial-value agent processes 14,000 loan eligibility decisions over a six-month period. During that time, the policy rules governing debt-to-income thresholds are updated three times. A regulator investigates 200 decisions from the third month and asks the organisation to demonstrate the exact policy that was applied to each decision. The policy store holds only the current version. The organisation cannot reconstruct which version was active for any specific decision. The regulator issues an enforcement notice for inadequate record-keeping.

What went wrong: Decisions were not tagged with a policy version identifier. The policy store was mutable and overwrote prior versions on each update. Even though the decisions themselves were logged, the policy context that produced them was lost. Consequence: Regulatory enforcement action, potential redress for 14,000 decisions that cannot be verified, estimated remediation cost of £2.3 million.

Scenario B — Silent Policy Drift During Rolling Deployment: An enterprise workflow agent operates across 12 regional pods. A policy update is deployed via rolling restart. For 47 minutes during the rollout, 6 pods run policy version v3.8.1 and 6 pods run v3.9.0. The two versions disagree on whether procurement approvals above £25,000 require dual sign-off. During the overlap window, 23 procurement decisions are made — 11 under the old rules and 12 under the new rules. Because decisions are not tagged with the policy version, the organisation cannot identify which decisions were made under which policy.

What went wrong: The deployment did not enforce atomic version cutover, and the decision records did not include the policy version. The organisation discovered the inconsistency only when an auditor flagged that two identical procurement requests received different outcomes on the same day. Consequence: 23 decisions require manual review, dual sign-off enforcement gap for £312,000 in aggregate procurement value.

Scenario C — Rollback Creates Ghost Policy State: A safety-critical agent controlling chemical process parameters receives a policy update that tightens temperature limits from 450°C to 420°C. After 90 minutes of operation, operators report false alarms and the update is rolled back. The rollback restores the prior version but does not update the version identifier in the decision log. For the next 4 hours, decisions logged as "v2.1.0" are actually executing under "v2.0.9" rules. When a near-miss incident occurs, the investigation team reconstructs the wrong policy version.

What went wrong: The rollback mechanism restored policy content but did not maintain version integrity. The version identifier was a label attached at deployment time rather than a content-derived hash. Consequence: Incident investigation based on incorrect policy reconstruction, delayed root-cause identification, regulatory finding for inadequate safety records under IEC 61511.

4. Requirement Statement

Scope: This dimension applies to all AI agents that execute decisions governed by configurable policy rules — including but not limited to eligibility determinations, approval workflows, risk scoring, access control, content filtering, pricing calculations, and operational parameter management. Any agent whose behaviour is influenced by a policy configuration that can change over time is within scope. Static agents with hard-coded logic that never changes are excluded, though organisations should consider whether future configurability will bring them into scope. The scope includes agents that inherit policy from a shared policy service as well as agents with locally cached policy copies.

4.1. A conforming system MUST assign a unique, immutable version identifier to every policy version before it becomes active for decision-making.

4.2. A conforming system MUST record the policy version identifier alongside every decision in the decision log, such that the exact policy that governed the decision can be reconstructed from the identifier alone.

4.3. A conforming system MUST use content-addressable version identifiers (e.g., SHA-256 hash of the canonical policy representation) rather than sequential labels, to guarantee that identical content always produces the same identifier and different content always produces different identifiers.

4.4. A conforming system MUST retain all historical policy versions for the full retention period required by applicable regulation, and ensure that any version identifier recorded in any retained decision log resolves to its corresponding policy content.

4.5. A conforming system MUST reject any decision request when the policy version cannot be resolved — defaulting to deny rather than proceeding with an unresolvable policy reference.

4.6. A conforming system MUST ensure that during rolling deployments or policy transitions, every decision is tagged with the version that was actually evaluated, not the version that was intended to be active.

4.7. A conforming system SHOULD implement atomic policy version cutover so that all decision paths switch to the new version simultaneously, eliminating mixed-version windows.

4.8. A conforming system SHOULD provide a policy reconstruction endpoint that accepts a version identifier and returns the complete policy content in a machine-readable format, enabling automated audit verification.

4.9. A conforming system MAY implement policy version comparison tooling that highlights differences between any two version identifiers, supporting change review and incident investigation.

5. Rationale

Policy Version Pinning addresses a foundational gap in AI governance: the inability to reconstruct the exact rules that governed a specific decision at a specific point in time. Without version pinning, the organisation is left in a position where it can demonstrate what its policy is today, but cannot prove what its policy was when a contested decision was made.

This gap matters because AI agents make decisions at machine speed across large volumes. A financial-value agent might process 50,000 eligibility decisions per day. If the policy governing those decisions changes twice in a month, and decisions are not tagged with the active version, then a post-hoc investigation of any individual decision requires guessing which version was active based on deployment timestamps — a process that is unreliable during rolling deployments, emergency patches, and version rollbacks.

The requirement for content-addressable identifiers (Requirement 4.3) exists because sequential version labels (v1.0, v1.1, v1.2) create a mapping problem: the label must be looked up in a registry to find the content, and the registry itself becomes a single point of failure and a target for manipulation. A content-derived hash (e.g., SHA-256 of the canonical policy representation) guarantees that the identifier is inseparable from the content. If you have the identifier, you can verify that any claimed policy content matches it. If someone modifies the policy content, the identifier changes. This is the same principle used in version control systems (git) and software supply-chain integrity (e.g., SLSA framework).

The cost of not implementing version pinning compounds over time. Every unversioned decision becomes an unreproducible decision. When a regulator asks "show me the policy that was active for this decision," the organisation that cannot answer faces enforcement action not for a bad policy, but for the inability to demonstrate what the policy was. This is a systems and controls failure, not a policy content failure.

6. Implementation Guidance

The core implementation requirement is a policy version store that treats every policy version as an immutable, content-addressable artefact. The store must support: writing new versions, resolving version identifiers to content, and enumerating the version history with activation timestamps.

Recommended patterns:

Content-addressable policy store. Compute a SHA-256 hash of the canonical policy representation (e.g., the policy serialised to a deterministic JSON format with sorted keys) and use that hash as the version identifier. Store the policy content in an append-only data store keyed by the hash. The decision pipeline reads the current active version pointer, resolves it to the hash, and records the hash alongside the decision. This approach guarantees that identical policy content always resolves to the same hash and that any content modification produces a different hash.
Policy version registry with activation log. Maintain a registry that maps version identifiers to activation timestamps, deactivation timestamps, and the identity of the actor who activated the version. This registry, combined with the content-addressable store, provides a complete audit trail: who activated which version, when it became active, when it was deactivated, and what the content was.
Sidecar version injector. In containerised deployments, implement a sidecar container that injects the active policy version identifier into every decision request before it reaches the agent. The agent cannot modify the version identifier. The sidecar reads the current active version from the policy registry on startup and on version-change notifications. This pattern ensures version tagging even if the agent code does not natively support it.
Atomic cutover via blue-green policy deployment. Maintain two policy slots: active and standby. Load the new version into standby, verify it, then atomically swap the active pointer. All decision paths read the active pointer, ensuring no mixed-version window. This eliminates the rolling-deployment problem described in Scenario B.

Anti-patterns to avoid:

Using deployment timestamps as a proxy for version identification. Deployment timestamps are unreliable during rolling deployments, emergency patches, and timezone confusion. A decision made at 14:03:22 UTC might have been processed by a pod running the old version or the new version depending on rollout progress. Content-addressable hashes eliminate this ambiguity entirely.
Overwriting policy content in place. If the policy store contains only the current version, all prior versions are lost. This is the most common failure mode and the most damaging for audit purposes. The policy store must be append-only or must maintain explicit version history.
Relying on the agent to self-report the policy version. If the agent records its own policy version, the record is only as trustworthy as the agent. A compromised or misconfigured agent might report the wrong version. The version tag should be applied by infrastructure outside the agent's control, consistent with AG-136.
Using mutable version labels without content verification. A label like "v3.2.1" is meaningful only if the content behind it never changes. If someone modifies the policy content without updating the label, the label becomes misleading. Content-addressable hashes are immune to this because the hash changes automatically when content changes.

Industry Considerations

Financial Services. MiFID II transaction reporting requires firms to demonstrate the rules that governed trade decisions. Policy version pinning provides the artefact that links a specific decision to a specific rule set. The FCA expects firms to reconstruct the decision logic for any transaction within 48 hours of a regulatory request. Without version pinning, reconstruction depends on circumstantial evidence.

Healthcare. Clinical decision support policies (e.g., drug interaction rules, dosage guidelines) change frequently as new evidence emerges. When a patient outcome is questioned, the organisation must demonstrate the exact clinical rules that were active when the recommendation was made. Version pinning provides this linkage with cryptographic certainty.

Critical Infrastructure. Safety-critical policy versions (e.g., operating parameter limits for industrial processes) must be traceable to specific approval decisions. IEC 61511 requires that safety instrumented system configurations be documented and reconstructable. Policy version pinning extends this requirement to AI-governed safety parameters.

Maturity Model

Basic Implementation — Policy versions are assigned sequential identifiers (v1.0, v1.1, etc.) and stored in a version-controlled repository. Decisions are tagged with the version label. Historical versions are retained. However, the version label is manually assigned and the content is not cryptographically bound to the identifier. Mixed-version windows during deployments are possible but documented.

Intermediate Implementation — Policy versions use content-addressable identifiers (SHA-256 hashes). The policy store is append-only. Decisions are tagged with the content hash by infrastructure outside the agent's control. A policy reconstruction endpoint returns the full policy content for any hash. Activation and deactivation events are logged with timestamps and actor identity. Atomic cutover eliminates mixed-version windows.

Advanced Implementation — All intermediate capabilities plus: policy versions are signed by the approving authority using a hardware security module. The content hash, signature, and activation event form a tamper-evident chain. Automated verification tools confirm that every decision in the log resolves to a valid, signed policy version. Anomaly detection flags decisions tagged with unexpected version identifiers. The policy version store is replicated to an independent custody environment for regulatory hold purposes.

7. Evidence Requirements

Required artefacts:

Policy version store export. A complete export of the policy version store showing all version identifiers, content hashes, activation timestamps, deactivation timestamps, and activating actors. Format: structured data (JSON, database export).
Decision log with version tags. A sample of decision records (minimum 1,000 or 10% of total, whichever is smaller) demonstrating that each decision includes a policy version identifier that resolves to a stored policy version. Format: structured log export.
Version reconstruction demonstration. Evidence that a randomly selected version identifier from the decision log resolves to the complete policy content, and that recomputing the content hash from that content produces the same identifier.
Deployment cutover evidence. Evidence that during the most recent policy version change, no decisions were recorded with ambiguous or missing version identifiers. For rolling deployments, evidence that each decision's version tag matches the version active on the pod that processed it.

Retention requirements:

Policy versions and decision version tags: minimum 7 years for regulated financial services; minimum 5 years for other regulated sectors; minimum 3 years otherwise. Retention must cover all policy versions referenced by any retained decision record.

Access requirements:

Producible to regulators or auditors within 48 hours of request. Policy reconstruction from version identifier must be automated, not manual.

8. Test Specification

Test 8.1: Version Identifier Uniqueness and Determinism

Stimulus: Deploy the same policy content twice. Deploy different policy content once. Record the version identifiers for all three deployments.
Expected behaviour: The two identical deployments produce the same version identifier. The different deployment produces a different identifier.
Pass criteria: Content-addressable property holds — identical content yields identical identifiers, different content yields different identifiers.
Fail criteria: Identical content produces different identifiers (non-deterministic hashing) or different content produces the same identifier (collision or non-content-based identifier).

Test 8.2: Decision-to-Version Traceability

Stimulus: Execute 100 decisions under a known policy version. Query the decision log and resolve each version identifier to policy content.
Expected behaviour: All 100 decisions reference the same version identifier. The identifier resolves to the expected policy content. Recomputing the hash from the content matches the recorded identifier.
Pass criteria: 100% of decisions have a resolvable, correct version identifier.
Fail criteria: Any decision lacks a version identifier, references an unresolvable identifier, or references a version whose content hash does not match.

Test 8.3: Unresolvable Version Defaults to Deny

Stimulus: Submit a decision request with a policy version identifier that does not exist in the version store. Submit a decision request when the version store is unavailable.
Expected behaviour: Both requests are rejected with a structured error indicating the policy version cannot be resolved. No decision is made.
Pass criteria: No decision executes when the policy version is unresolvable.
Fail criteria: Any decision executes without a verified policy version, or the system falls back to a default policy instead of denying.

Test 8.4: Rolling Deployment Version Integrity

Stimulus: Initiate a rolling deployment of a new policy version across multiple pods. During the transition window, submit 200 decision requests distributed across old-version and new-version pods.
Expected behaviour: Each decision is tagged with the version identifier of the policy that was actually active on the pod that processed it. No decision is tagged with a version that does not match the pod's active version.
Pass criteria: Every decision's version tag matches the policy that was actually evaluated for that decision.
Fail criteria: Any decision's version tag does not match the policy that was evaluated, or any decision is missing a version tag during the transition.

Test 8.5: Rollback Version Integrity

Stimulus: Deploy policy version B, then roll back to version A. Submit 50 decisions after rollback.
Expected behaviour: All 50 decisions are tagged with version A's identifier. No decision is tagged with version B's identifier after rollback completes.
Pass criteria: Post-rollback decisions consistently reference version A.
Fail criteria: Any post-rollback decision references version B, or the rollback creates a new version identifier for content identical to version A.

Test 8.6: Historical Version Retention

Stimulus: Retrieve the 10 oldest version identifiers from the decision log. Attempt to resolve each to its policy content.
Expected behaviour: All 10 identifiers resolve to their complete policy content.
Pass criteria: 100% of historical version identifiers resolve to valid policy content within the retention period.
Fail criteria: Any historical version identifier within the retention period fails to resolve.

Test 8.7: Version Tag Injection Resistance

Stimulus: Submit decision requests with crafted version identifiers in agent-controllable fields — attempting to override the infrastructure-assigned version tag with a different version.
Expected behaviour: The infrastructure-assigned version tag is recorded regardless of agent-supplied version claims. Agent-supplied version information does not influence the recorded version.
Pass criteria: No agent-supplied data modifies the version tag recorded in the decision log.
Fail criteria: Any agent-supplied data influences the recorded version identifier.

Conformance Scoring

Score 0: No policy version tracking — decisions cannot be linked to the policy version that governed them.
Score 1: Policy versions are labelled but not content-addressable — version labels exist but content integrity is not cryptographically guaranteed; mixed-version windows are possible during deployments.
Score 2: Content-addressable version identifiers are recorded with every decision, historical versions are retained, and the version store is append-only — reconstruction is automated and cryptographically verifiable.
Score 3: All Score 2 capabilities plus signed policy versions, independent adversarial verification that version tags cannot be manipulated, atomic cutover eliminating mixed-version windows, and automated anomaly detection for unexpected version references.

9. Regulatory Mapping

Regulation	Provision	Relationship Type
EU AI Act	Article 12 (Record-Keeping)	Direct requirement
EU AI Act	Article 17 (Quality Management System)	Supports compliance
MiFID II	Article 25 (Suitability and Appropriateness), RTS 25	Direct requirement
FCA SYSC	9.1.1R (Record-Keeping Requirements)	Direct requirement
NIST AI RMF	GOVERN 1.4, MAP 3.5, MANAGE 4.1	Supports compliance
ISO 42001	Clause 7.5 (Documented Information), Clause 9.1 (Monitoring and Measurement)	Supports compliance
DORA	Article 11 (ICT Response and Recovery)	Supports compliance

EU AI Act — Article 12 (Record-Keeping)

Article 12 requires providers of high-risk AI systems to ensure that the system is designed to enable automatic recording of events (logs) relevant to the identification of risks and to post-market monitoring. Policy version pinning directly implements this requirement by ensuring that the policy version governing each decision is recorded as part of the decision log. Without version pinning, the log records what the agent decided but not the rules it decided under — rendering the log insufficient for risk identification or monitoring purposes.

MiFID II — Article 25, RTS 25

MiFID II requires investment firms to demonstrate that advice and decisions were made in accordance with the rules in force at the time. RTS 25 specifies record-keeping requirements for algorithmic trading, including the ability to reconstruct decision logic. For AI agents operating in financial markets, policy version pinning provides the mechanism to link a specific trade or recommendation to the exact rule set that was active, satisfying the reconstruction requirement.

FCA SYSC — 9.1.1R (Record-Keeping Requirements)

SYSC 9.1.1R requires firms to maintain orderly records of their business and internal organisation sufficient to enable the FCA to monitor compliance. For AI-governed decisions, this includes the ability to identify which policy was active for each decision. An unversioned policy store fails this requirement because the firm cannot demonstrate the policy that was in force for any historical decision.

ISO 42001 — Clause 7.5 (Documented Information)

Clause 7.5 requires organisations to control documented information required by the AI management system, ensuring appropriate identification, storage, and retrieval. Policy version pinning implements this requirement specifically for the policy artefacts that govern agent behaviour, using content-addressable identifiers for unambiguous identification and append-only storage for guaranteed retrieval.

DORA — Article 11 (ICT Response and Recovery)

Article 11 requires financial entities to have ICT response and recovery capabilities. During incident response involving AI agents, the ability to identify the exact policy version active at the time of the incident is essential for root-cause analysis and for determining whether the incident resulted from a policy deficiency or an enforcement failure. Version pinning provides this capability.

10. Failure Severity

Field	Value
Severity Rating	High
Blast Radius	Organisation-wide — affects all decisions made during periods of unversioned or ambiguously versioned policy

Consequence chain: Without policy version pinning, the organisation loses the ability to reconstruct the decision context for any historical decision. The immediate technical failure is an unresolvable link between a decision and its governing policy. The operational impact compounds over time: every decision made without a pinned version becomes an unreproducible decision. When a regulatory investigation, customer complaint, or incident review requires reconstruction of the decision logic, the organisation cannot provide it. The regulatory consequence is an enforcement action for inadequate record-keeping — not for a wrong decision, but for the inability to demonstrate what rules governed the decision. In financial services, this can result in fines (the FCA's 2023 enforcement actions for record-keeping failures averaged £4.2 million), redress requirements for all affected decisions, and restrictions on the use of AI agents pending remediation. In safety-critical domains, the inability to reconstruct the active safety policy during a near-miss or incident delays root-cause analysis and may result in extended operational shutdowns under regulatory direction.

Cross-references: AG-007 (Governance Configuration Control) governs the change control process for configurations including policy; AG-269 assumes AG-007 is in place. AG-134 (Machine-Checkable Policy Semantics) provides the formal policy representation that version pinning operates on. AG-135 (Policy Precedence and Conflict Arbitration) defines how rule conflicts are resolved within a pinned version. AG-270 (Policy Compilation Verification Governance) ensures the compiled policy matches the approved source before a version is activated. AG-277 (Policy Change Provenance Governance) tracks who changed the policy and why. AG-278 (Policy Hot-Patch Rollback Governance) depends on version pinning to ensure rollbacks restore a known-good version. AG-136 (Independent Control-Plane Separation) supports the requirement that version tagging be performed by infrastructure outside the agent's control.

Cite this protocol

AgentGoverning. (2026). AG-269: Policy Version Pinning Governance. The Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-269

← Previous Protocol

AG-268

Whistleblower and Speak-Up Governance

Next Protocol →

AG-270

Policy Compilation Verification Governance