Machine-Checkable Policy Semantics requires that every governance policy enforced on an AI agent is expressed in a formal, machine-readable language with unambiguous semantics that can be parsed, evaluated, and verified by automated tooling without human interpretation. Policies expressed solely in natural language — however precise — are insufficient because they require an interpretation step that introduces ambiguity, version-to-version drift, and vulnerability to adversarial reframing. A conforming system must compile, validate, and execute governance policies through a deterministic evaluation engine whose behaviour is fully specified and independently verifiable. This dimension ensures that the governance posture of an agent is not a matter of opinion but a matter of computation.
Scenario A — Natural-Language Policy Ambiguity Causes Enforcement Gap: An organisation deploys an AI trading agent governed by the policy: "The agent shall not execute trades exceeding the daily risk appetite." The risk appetite is defined in a separate document as "moderate," with a footnote referencing a VaR threshold of £2,000,000 at 95% confidence. The agent's policy interpreter — an LLM-based reasoning module — parses "moderate" as permitting trades up to £5,000,000 individually provided the portfolio VaR stays below the threshold. A second compliance system interprets the same policy as capping individual trades at £500,000. Over three weeks the agent executes 47 trades averaging £3,200,000 each. Post-incident review reveals that both interpretations are defensible readings of the natural-language policy.
What went wrong: The policy was expressed in natural language requiring interpretation. Two systems reading the same policy derived different enforcement boundaries. Neither was objectively wrong because the policy lacked formal semantics. Consequence: £150,400,000 in aggregate trades that may or may not have complied with the intended governance posture, FCA investigation into adequacy of algorithmic trading controls, inability to demonstrate to auditors what the actual policy boundary was at any point in time.
Scenario B — Policy Serialisation Injection: An organisation stores governance policies as JSON documents. A policy specifies: {"max_transaction_value": 10000, "currency": "GBP"}. An attacker discovers that the agent's policy loader deserialises JSON without schema validation. The attacker crafts a modified policy document: {"max_transaction_value": 10000, "currency": "GBP", "__proto__": {"max_transaction_value": 999999999}}. Due to prototype pollution in the JavaScript-based policy loader, the effective limit becomes £999,999,999. The structural enforcement layer (AG-001) faithfully enforces the polluted limit.
What went wrong: The policy was machine-readable but not machine-checkable. There was no formal schema validation, no type-checking, no semantic verification of the policy document before it was loaded into the enforcement layer. The serialisation format was treated as the semantics, when in fact it was merely the encoding. Consequence: £999,999,999 effective limit on an agent authorised for £10,000, complete bypass of AG-001 enforcement through policy-layer manipulation.
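The defence against Scenario B is a validation gate that rejects, rather than tolerates, anything outside the schema. The sketch below is a minimal illustration in Python using hypothetical field names taken from the scenario; a real deployment would validate against a published schema language rather than a hand-rolled checker.

```python
import json

# Hypothetical minimal schema for the trading policy in Scenario B.
# A production system would use a published schema (JSON Schema, a
# policy-language type system, etc.) rather than this hand-rolled map.
POLICY_SCHEMA = {
    "max_transaction_value": int,
    "currency": str,
}

def load_policy(document: str) -> dict:
    """Parse and validate a policy document, rejecting unknown keys.

    Undefined keys (including "__proto__") cause a hard failure rather
    than a best-effort load, closing the injection channel in Scenario B.
    """
    raw = json.loads(document)
    unknown = set(raw) - set(POLICY_SCHEMA)
    if unknown:
        raise ValueError(f"undefined fields in policy: {sorted(unknown)}")
    for field, expected_type in POLICY_SCHEMA.items():
        if field not in raw:
            raise ValueError(f"missing required field: {field}")
        if not isinstance(raw[field], expected_type):
            raise ValueError(f"type violation on field: {field}")
    return raw

# A well-formed policy loads; the polluted document is rejected outright.
good = load_policy('{"max_transaction_value": 10000, "currency": "GBP"}')
try:
    load_policy('{"max_transaction_value": 10000, "currency": "GBP", '
                '"__proto__": {"max_transaction_value": 999999999}}')
    rejected = False
except ValueError:
    rejected = True
```

The design point is reject-on-unknown: a loader that silently ignores unrecognised fields is exactly the loader that prototype pollution exploits.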
Scenario C — Policy Version Inconsistency Across Enforcement Points: An organisation deploys 12 agent instances across 3 regions. Policy updates are distributed via a configuration management system. A policy update changing the permitted counterparty list is deployed successfully to 10 instances but fails silently on 2 instances in the Asia-Pacific region due to a network partition. For 6 hours, 10 agents enforce the new policy (which removes a sanctioned entity) while 2 agents enforce the old policy (which still permits the sanctioned entity). During this window, one of the Asia-Pacific agents executes 3 transactions with the sanctioned entity totalling £847,000.
What went wrong: The policy was machine-readable but the system lacked a mechanism to verify that all enforcement points were operating on the same policy version. There was no attestation that a given enforcement point had loaded, validated, and activated a specific policy version. Consequence: £847,000 in transactions with a sanctioned entity, potential OFSI penalty of up to £1,000,000 or 50% of the estimated value of the breach, regulatory notification obligation triggered.
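The missing mechanism in Scenario C is attestation: each enforcement point reports a digest of the policy it has actually loaded and activated, and a coordinator checks the fleet against the expected version. A minimal sketch, with hypothetical instance names and policy content modelled on the scenario:

```python
import hashlib

def policy_digest(policy_text: str) -> str:
    # Each enforcement point attests the SHA-256 of the policy it has
    # actually activated, not merely the version it was sent.
    return hashlib.sha256(policy_text.encode()).hexdigest()

def stale_enforcement_points(attestations: dict, expected: str) -> list:
    """Return the enforcement points whose attested policy digest does
    not match the expected (newly distributed) policy version."""
    return sorted(ep for ep, digest in attestations.items() if digest != expected)

# Hypothetical fleet from Scenario C: 12 instances, 2 of which silently
# failed to apply the update removing a sanctioned counterparty.
new_policy = '{"counterparties": ["A", "B"]}'
old_policy = '{"counterparties": ["A", "B", "SANCTIONED"]}'
expected = policy_digest(new_policy)
attestations = {f"agent-{i:02d}": expected for i in range(10)}
attestations["apac-01"] = policy_digest(old_policy)
attestations["apac-02"] = policy_digest(old_policy)

stale = stale_enforcement_points(attestations, expected)
```

With this check running after every distribution, the six-hour divergence window in Scenario C collapses to the attestation polling interval, and the two stale Asia-Pacific instances are flagged before they can transact.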
Scope: This dimension applies to all AI agent deployments where governance policies — including but not limited to action limits, permitted operations, counterparty restrictions, data access boundaries, and behavioural constraints — are used to control agent behaviour. Any system that translates a governance intent into an enforcement decision is within scope. This includes policy engines, configuration loaders, rule evaluators, constraint checkers, and any component that reads a policy definition and produces a permit/deny decision. Systems where governance is enforced entirely through hardware constraints with no software policy layer (e.g., a physically disconnected network segment) are excluded. The scope extends to policy distribution, storage, loading, compilation, evaluation, and retirement. A policy that is formally specified but informally distributed (e.g., emailed as an attachment) does not satisfy this dimension.
4.1. A conforming system MUST express all governance policies in a formally defined language with a published grammar, type system, and evaluation semantics that are sufficient for automated parsing, type-checking, and deterministic evaluation without human interpretation.
4.2. A conforming system MUST validate every policy document against its formal schema before loading it into any enforcement component, rejecting documents that fail validation rather than applying partial or best-effort interpretation.
4.3. A conforming system MUST ensure that policy evaluation is deterministic — the same policy applied to the same action in the same context MUST produce the same enforcement decision on every evaluation, across all enforcement points, without exception.
4.4. A conforming system MUST maintain a cryptographically signed, immutable record of every policy version deployed to every enforcement point, such that the exact policy in force at any historical moment can be reconstructed and re-evaluated.
4.5. A conforming system MUST reject any policy document that contains constructs not defined in the formal grammar, preventing injection of executable code, serialisation attacks, or semantic extensions through the policy channel.
4.6. A conforming system SHOULD implement a policy compilation step that transforms the formal policy into an optimised evaluation representation, with the compiler itself subject to version control and deterministic build verification.
4.7. A conforming system SHOULD provide a policy simulation capability that allows evaluating a candidate policy against historical action logs before activation, reporting any enforcement decisions that would change.
4.8. A conforming system SHOULD support policy composition, allowing multiple policy documents to be combined with formally defined precedence rules (see AG-135) while preserving the machine-checkability of the composite policy.
4.9. A conforming system MAY implement formal verification of policy properties (e.g., "this policy never permits transactions exceeding £50,000") using model checking, theorem proving, or equivalent techniques.
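The property in 4.9 can be illustrated with a bounded exhaustive check: enumerate a finite input domain and confirm that no input yields a permit above the threshold. This is a sketch only, using a toy evaluator with hypothetical field names; a real implementation would discharge the property with a model checker or SMT solver rather than enumeration.

```python
from itertools import product

def evaluate(policy: dict, value: int, counterparty: str) -> str:
    # Toy deterministic policy evaluator used for the property check.
    if value > policy["cap"]:
        return "deny"
    if counterparty not in policy["approved"]:
        return "deny"
    return "permit"

def check_invariant(policy: dict, values, counterparties):
    """Bounded check of the 4.9 example property: no explored input
    yields a permit for a transaction value above 50,000. Returns the
    first counterexample found, or (True, None) if the property holds
    over the explored domain."""
    for value, cp in product(values, counterparties):
        if value > 50000 and evaluate(policy, value, cp) == "permit":
            return False, (value, cp)
    return True, None

policy = {"cap": 50000, "approved": {"acme", "globex"}}
holds, counterexample = check_invariant(
    policy, range(0, 100001, 500), ["acme", "globex", "unknown"]
)
```

Enumeration only demonstrates the property over the sampled domain; the value of model checking or theorem proving is that the same claim is established for all inputs, not a finite sample.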
Machine-Checkable Policy Semantics addresses a fundamental weakness in governance architectures that rely on natural-language policy definitions interpreted by either humans or language models. The core problem is that natural language is inherently ambiguous — the same sentence can support multiple valid interpretations, and there is no algorithmic method to determine which interpretation is "correct." When governance enforcement depends on interpretation, enforcement becomes a matter of opinion rather than computation.
This matters for AI agent governance because the enforcement decision — permit or deny — must be made at machine speed, potentially thousands of times per second, across multiple enforcement points. Any ambiguity in the policy creates a surface for inconsistency: two enforcement points may interpret the same policy differently, the same enforcement point may interpret the policy differently under different context conditions, or an adversary may craft inputs that exploit interpretive ambiguity to shift the enforcement decision.
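The determinism property described above is testable: the evaluator must be a pure function of (policy, action, context), with no clock, randomness, or hidden state. A minimal sketch with hypothetical policy fields:

```python
def evaluate(policy: dict, action: dict, context: dict) -> str:
    # Pure function of its three inputs: no clock, no randomness,
    # no hidden state, so requirement 4.3 holds by construction.
    if action["type"] != "trade":
        return "deny"
    if action["value"] > policy["max_transaction_value"]:
        return "deny"
    if context["counterparty"] not in policy["approved_counterparties"]:
        return "deny"
    return "permit"

policy = {"max_transaction_value": 50000, "approved_counterparties": {"acme"}}
action = {"type": "trade", "value": 40000}
context = {"counterparty": "acme"}

# Re-evaluating the same triple must yield the same decision every time;
# a result set with more than one element is a conformance failure.
decisions = {evaluate(policy, action, context) for _ in range(1000)}
```

The same harness run across enforcement points (rather than loop iterations) detects the cross-instance divergence described above.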
The distinction between machine-readable and machine-checkable is critical. A JSON document is machine-readable — a parser can extract its fields and values. But machine-readability does not guarantee semantic validity, type safety, or deterministic evaluation. A machine-checkable policy has formal semantics: the meaning of every construct is defined precisely, the evaluation order is specified, the type system prevents category errors, and the output of evaluation is deterministic given the inputs. This is the difference between a configuration file and a program in a formally specified language.
AG-134 also addresses the policy supply chain. A policy that is formally specified but distributed through informal channels (email, shared drives, manual configuration) loses its formal properties at the distribution boundary. The chain of custody from policy authoring through compilation, signing, distribution, loading, and activation must preserve the formal properties at every step. This is why AG-134 requires cryptographic signing and immutable versioning — they extend the formal guarantee from the policy content to the policy lifecycle.
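The lifecycle guarantee can be sketched as a hash-chained, signed version log: each record links to its predecessor's hash, so tampering with any historical version is detectable. In this illustration HMAC-SHA256 stands in for the asymmetric signature a real deployment would use, and the key and field names are assumptions.

```python
import hashlib
import hmac
import json

GOVERNANCE_KEY = b"demo-governance-key"  # stand-in for a real signing key

def record_version(chain: list, policy_content: str, activation: str) -> dict:
    """Append a policy version to a hash-chained, signed version log.

    HMAC-SHA256 stands in for an asymmetric signature such as RSA-PSS;
    the chaining structure (prev_hash linkage) is the point illustrated.
    """
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"policy": policy_content, "activation": activation,
            "prev_hash": prev_hash}
    serialized = json.dumps(body, sort_keys=True).encode()
    entry = {
        **body,
        "hash": hashlib.sha256(serialized).hexdigest(),
        "signature": hmac.new(GOVERNANCE_KEY, serialized,
                              hashlib.sha256).hexdigest(),
    }
    chain.append(entry)
    return entry

def verify_chain(chain: list) -> bool:
    """Recompute every hash, link, and signature in the log."""
    prev = "0" * 64
    for entry in chain:
        body = {k: entry[k] for k in ("policy", "activation", "prev_hash")}
        serialized = json.dumps(body, sort_keys=True).encode()
        expected_sig = hmac.new(GOVERNANCE_KEY, serialized,
                                hashlib.sha256).hexdigest()
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != hashlib.sha256(serialized).hexdigest():
            return False
        if not hmac.compare_digest(entry["signature"], expected_sig):
            return False
        prev = entry["hash"]
    return True

chain = []
record_version(chain, '{"max_transaction_value": 10000}', "2026-03-15T00:00:00Z")
record_version(chain, '{"max_transaction_value": 25000}', "2026-04-01T00:00:00Z")
ok = verify_chain(chain)

# Retroactive tampering with any recorded version is detectable.
chain[0]["policy"] = '{"max_transaction_value": 999999999}'
tampered_ok = verify_chain(chain)
```

The chain makes the log append-only in effect: rewriting any entry breaks its own hash, and rewriting the hash breaks the successor's prev_hash link.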
AG-134 requires organisations to move from policy-as-configuration to policy-as-code with formal semantics. The policy language need not be novel — existing options include Rego (Open Policy Agent), Cedar (AWS), CUE, Sentinel (HashiCorp), or custom DSLs built on established formal foundations. The critical requirement is that the language has a published grammar, a type system, and deterministic evaluation semantics.
Recommended patterns:
- Formal policy expression. A Cedar-style rule such as permit(action == "trade", resource.value <= 50000, context.counterparty in approved_list) is evaluated by the Cedar engine, which deterministically returns Allow or Deny. No interpretation step exists.
- Hash-chained, signed version records. Each deployed policy version carries lifecycle metadata of the form prev_hash: sha256(v46), activation: 2026-03-15T00:00:00Z, signature: RSA-PSS(governance_key, hash(policy_content + metadata)).

Anti-patterns to avoid:
Financial Services. MiFID II algorithmic trading requirements (RTS 6, Article 17) mandate that trading algorithms operate within defined parameters. AG-134 provides the mechanism to formally specify those parameters and verify that the enforcement layer implements them correctly. The FCA expects firms to be able to demonstrate, at any point in time, exactly what controls were in force — AG-134's immutable policy versioning directly supports this requirement.
Healthcare. HIPAA requires that access controls be implemented and auditable. AG-134 enables formal specification of data access policies (which agent can access which patient records under which conditions) in a machine-checkable format. This eliminates the "we thought the policy meant X" defence in breach investigations. The policy either permitted the access or it did not — the evaluation is deterministic and reproducible.
Critical Infrastructure. IEC 62443 requires that security policies for industrial control systems be formally documented and verifiable. AG-134 extends this to AI agents operating in critical infrastructure by requiring that governance policies be expressed in a format that can be automatically verified against safety invariants (see AG-138). For example, a policy governing an AI agent controlling a power grid must be formally verifiable as never permitting simultaneous disconnection of redundant supply paths.
Public Sector. Government agencies deploying AI agents must comply with transparency requirements. AG-134 supports transparency by ensuring that governance policies can be exported, reviewed, and evaluated by independent parties without requiring access to the enforcement system itself. The policy is a self-contained, formally specified document — not a collection of implicit behaviours embedded in code.
Basic Implementation — The organisation expresses governance policies in structured data formats (JSON, YAML) with a defined schema. Every policy document is validated against the schema before loading. The schema covers all required fields and their types. Policy versions are stored in version control with timestamps and author attribution. Evaluation uses a rule engine that processes the structured policy deterministically. This level eliminates natural-language ambiguity and schema-violation errors but does not provide formal semantic verification or cryptographic lifecycle guarantees.
Intermediate Implementation — Policies are expressed in a formally defined policy language (Rego, Cedar, CUE, or equivalent) with a published grammar and deterministic evaluation semantics. A policy compilation step type-checks and optimises the policy before deployment. Policy documents are cryptographically signed by the governance authority. Enforcement points verify signatures before loading policies. A policy simulation capability allows testing candidate policies against historical action logs before activation. Policy distribution is confirmed — every enforcement point attests to having loaded and activated a specific policy version. Determinism is tested regularly by re-evaluating historical decisions and comparing results.
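The simulation capability described above can be sketched as a replay harness: run a candidate policy and the active policy over the same historical action log and report every decision that would change. The log contents and policy thresholds below are hypothetical.

```python
def simulate(candidate_eval, active_eval, history: list) -> list:
    """Replay a historical action log through a candidate policy and
    report every enforcement decision that would change (requirement 4.7)."""
    changes = []
    for event in history:
        was = active_eval(event)
        becomes = candidate_eval(event)
        if was != becomes:
            changes.append({"event": event, "was": was, "becomes": becomes})
    return changes

# Hypothetical log and policies: the candidate lowers the trade cap
# from 10,000 to 5,000, so one historical permit flips to deny.
history = [{"value": 4000}, {"value": 8000}, {"value": 12000}]
active = lambda e: "permit" if e["value"] <= 10000 else "deny"
candidate = lambda e: "permit" if e["value"] <= 5000 else "deny"

report = simulate(candidate, active, history)
```

The report is the artefact the governance authority reviews before activation: an empty report means the candidate is behaviourally identical on observed history; a non-empty report enumerates exactly which past actions would have been decided differently.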
Advanced Implementation — All intermediate capabilities plus: formal verification of policy properties using model checking or theorem proving (e.g., verifying that no combination of inputs can cause the policy to permit transactions exceeding £50,000). Policy composition is formally specified with precedence rules (AG-135). The policy language includes temporal operators for expressing time-dependent constraints. Independent third-party audit of the policy language semantics, compiler correctness, and evaluation engine determinism. Continuous policy monitoring compares real-time enforcement decisions against a reference evaluator running in parallel, flagging any divergence immediately. The organisation can provide mathematical proof that its governance policies enforce stated invariants.
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Schema Validation Gate
Submit a policy document with a required field omitted (e.g., the max_transaction_value field from a trading policy). Submit a second document with an additional undefined field (e.g., __proto__ or admin_override). Submit a third document with a type violation (e.g., max_transaction_value: "unlimited").

Test 8.2: Deterministic Evaluation Consistency
Test 8.3: Cryptographic Signature Verification
Test 8.4: Policy Injection Resistance
Test 8.5: Policy Version Consistency Across Enforcement Points
Test 8.6: Formal Grammar Boundary Enforcement
Test 8.7: Historical Policy Reconstruction
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System) | Direct requirement |
| EU AI Act | Article 11 (Technical Documentation) | Supports compliance |
| EU AI Act | Article 17 (Quality Management System) | Supports compliance |
| SOX | Section 404 (Internal Controls Over Financial Reporting) | Direct requirement |
| FCA SYSC | 6.1.1R (Systems and Controls) | Direct requirement |
| MiFID II RTS 6 | Article 17 (Algorithmic Trading Controls) | Direct requirement |
| NIST AI RMF | GOVERN 1.1, MANAGE 2.2, MEASURE 2.6 | Supports compliance |
| ISO 42001 | Clause 6.1 (Actions to Address Risks), Clause 8.2 (AI Risk Assessment) | Supports compliance |
| DORA | Article 9 (ICT Risk Management Framework) | Supports compliance |
Article 9 requires providers of high-risk AI systems to establish and maintain a risk management system that includes risk mitigation measures. Machine-checkable policy semantics directly implement risk mitigation by ensuring that governance controls are unambiguously specified and deterministically enforced. The regulation's requirement that measures be "tested with a view to identifying the most appropriate risk management measures" is supported by AG-134's policy simulation capability, which allows candidate policies to be evaluated against historical data before activation.
Article 11 requires detailed technical documentation of AI system design, capabilities, and limitations. AG-134's formal policy specification provides a precise, machine-readable description of the governance constraints that bound the system's behaviour — a far more rigorous form of documentation than natural-language descriptions of intended controls.
Section 404 requires management to assess the effectiveness of internal controls. For AI agents executing financial operations, the ability to demonstrate that governance policies are formally specified, deterministically evaluated, and cryptographically versioned provides a strong evidentiary foundation. Auditors can independently evaluate the policy against the formal language specification without relying on the organisation's own interpretation of its controls.
RTS 6 requires that algorithmic trading systems operate within defined parameters and that those parameters be documented and auditable. AG-134 ensures that the trading parameters are expressed in a machine-checkable format, that every version is signed and retained, and that the enforcement decision is deterministic and reproducible. This directly addresses the regulatory expectation that firms can demonstrate, at any point in time, exactly what controls were in force.
The FCA requires firms to maintain adequate systems and controls. Machine-checkable policies ensure that controls are precisely specified, consistently enforced, and independently auditable. The determinism requirement eliminates the "it depends on interpretation" defence in enforcement proceedings.
GOVERN 1.1 (legal and regulatory requirements), MANAGE 2.2 (enforceable controls), and MEASURE 2.6 (measurement of AI system performance) are all supported by formal policy semantics that enable precise specification, deterministic enforcement, and quantitative measurement of governance effectiveness.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Organisation-wide — all agents governed by the affected policy language or policy distribution infrastructure |
Consequence chain: Without machine-checkable policy semantics, governance policies are subject to interpretation drift, serialisation injection, and version inconsistency across enforcement points. The failure mode is subtle — the system appears to be enforcing policies, but the policies being enforced are not the policies that were intended. Natural-language ambiguity creates a gap between the governance authority's intent and the enforcement engine's behaviour. This gap widens over time as policies are updated, edge cases accumulate, and different enforcement points diverge in their interpretations. The immediate technical consequence is inconsistent enforcement — the same action may be permitted by one enforcement point and denied by another. The operational consequence is an ungovernable governance posture — the organisation cannot definitively state what its policies are, what they mean, or whether they are being correctly enforced. The regulatory consequence is severe: in any investigation, the organisation must demonstrate what controls were in force and that they operated as intended. If the controls were ambiguously specified, the organisation cannot make this demonstration. Cross-reference with AG-007 (Governance Configuration Control) for the configuration integrity dimension and AG-135 for policy precedence when multiple policies interact.
Cross-references: AG-135 (Policy Precedence and Conflict Arbitration Governance) addresses how multiple machine-checkable policies interact and which takes precedence. AG-136 (Independent Control-Plane Separation Governance) addresses the architectural separation that protects the policy evaluation engine. AG-137 (Runtime Attestation and Trusted Execution Governance) addresses how enforcement points attest to their policy state. AG-138 (High-Assurance Invariant Verification Governance) addresses formal verification of policy properties. AG-007 (Governance Configuration Control) governs the configuration lifecycle that AG-134 policies participate in. AG-005 (Instruction Integrity Verification) addresses the integrity of instructions that policies may reference.