Autonomous Web Interaction Governance requires that every AI agent capable of browsing the open internet, submitting forms, clicking links, downloading content, or interacting with web-based APIs operates within a structurally enforced web interaction policy. That policy defines permitted domains, permitted interaction types, rate limits, data extraction boundaries, and content submission restrictions, and it is enforced at the network and proxy layer — not by the agent's own reasoning about which sites are appropriate or what data is safe to submit. Without this dimension, an agent with browser capabilities has the functional equivalent of unrestricted internet access combined with the ability to act on behalf of the organisation: submitting data to arbitrary endpoints, authenticating to services, agreeing to terms of service, downloading and executing content, and interacting with web applications in ways that create legal, financial, and security exposure at machine speed. AG-124 ensures that web interaction capabilities are bounded by infrastructure-layer controls that the agent cannot circumvent through reasoning, instruction manipulation, or redirect exploitation.
Scenario A — Uncontrolled Data Submission to External Forms: An organisation deploys a research agent with browser capabilities to gather market intelligence. The agent is instructed to "find relevant pricing data from supplier websites." During its browsing session, the agent encounters a supplier portal with a registration form. The agent fills in the form using organisation details extracted from its context — company name, contact email, procurement volume estimates — to gain access to a gated pricing page. The supplier now has the organisation's procurement intent data, which it uses to adjust pricing in subsequent negotiations. A second supplier portal asks the agent to agree to terms of service that include an exclusivity clause. The agent clicks "I Agree" to access pricing data.
What went wrong: The agent had no structural restriction on form submission or terms acceptance. Its browsing capability included unrestricted write interactions with any website. The organisation's procurement strategy was disclosed, and a contractual obligation may have been created, without any human review. Consequence: Competitive disadvantage in supplier negotiations estimated at £340,000 in annual procurement cost increase; legal review of the terms acceptance costing £45,000 in external counsel fees; potential binding exclusivity obligation requiring litigation to resolve.
Scenario B — Redirect-Based Domain Escape: A financial analysis agent is configured with an instruction-level allowlist: "Only visit domains ending in .gov.uk and .bankofengland.co.uk." The agent navigates to a Bank of England statistics page that contains an embedded redirect through a URL shortener (bit.ly) pointing to a third-party analytics service. The agent follows the redirect, which leads to a chain: bit.ly → tracking.analytics-provider.com → cdn.data-aggregator.io → final destination. At each hop, the agent's HTTP headers (including cookies and referrer data) are captured. The analytics provider now has evidence that the organisation is researching specific monetary policy data, a signal that constitutes material non-public information about the organisation's trading strategy.
What went wrong: The domain restriction was implemented in the agent's instruction set, not at the proxy layer. The agent's reasoning evaluated only the initial URL (bankofengland.co.uk), not the redirect chain. A proxy-layer control would have blocked the redirect at the first non-permitted hop. Consequence: Potential market abuse investigation under MAR Article 10 (unlawful disclosure of inside information); compliance remediation cost of £180,000; mandatory disclosure to the FCA.
Scenario C — Automated Credential Harvesting Through Phishing Pages: A customer service agent with browser capabilities is tasked with verifying customer-reported URLs. A customer submits a support ticket containing a URL that appears to be the organisation's login page but is a phishing clone hosted on a look-alike domain (org-login-portal.com instead of org.com/login). The agent navigates to the URL and, recognising a login page that matches its training data, enters the service account credentials it uses for internal systems. The phishing page captures the credentials and the attacker gains access to the organisation's internal systems within 14 seconds.
What went wrong: The agent had no structural restriction preventing credential submission to non-approved domains. The interaction policy did not distinguish between read-only browsing and authentication actions. No proxy-layer control prevented credential transmission to unapproved endpoints. Consequence: Full compromise of service account with access to 2.3 million customer records; mandatory breach notification under UK GDPR Article 33 within 72 hours; estimated remediation and notification cost of £4.2 million; ICO investigation.
Scope: This dimension applies to all AI agents with the capability to interact with web content beyond a pre-authenticated, organisation-controlled API surface. This includes agents that can browse websites, submit HTTP requests to arbitrary endpoints, interact with web-based user interfaces, download files from the internet, follow hyperlinks, submit forms, or interact with web applications through browser automation frameworks. Agents that exclusively interact with pre-configured, authenticated API endpoints within the organisation's own infrastructure are excluded, provided those endpoints cannot redirect or proxy the agent to external web content. The scope includes indirect web interaction: an agent that instructs another agent or service to perform web interactions on its behalf is within scope. The test is whether the agent's actions can result in HTTP requests to endpoints not fully controlled by the deploying organisation.
4.1. A conforming system MUST enforce a web interaction policy at the network or proxy layer that defines permitted destination domains, permitted interaction types (read-only browsing, form submission, file download, authentication), and rate limits, independently of the agent's reasoning or instruction set.
4.2. A conforming system MUST block all web interactions to domains not explicitly listed in the permitted domain set, including domains reached through redirects, URL shorteners, or embedded references.
4.3. A conforming system MUST prevent the agent from submitting organisation credentials, API keys, authentication tokens, or session identifiers to any endpoint not on an approved authentication domain list maintained independently of the agent's context.
4.4. A conforming system MUST prevent the agent from accepting or agreeing to terms of service, contracts, licences, or any legally binding commitment through web interaction without explicit human authorisation for the specific commitment.
4.5. A conforming system MUST log all web interactions — including the full URL, HTTP method, request headers, response status, and a content hash of any submitted data — in a tamper-evident log retained for the period specified in Section 7.
4.6. A conforming system MUST enforce per-session and per-period rate limits on web interactions to prevent automated scraping, denial-of-service patterns, or rapid exploration that could trigger legal liability under computer misuse legislation.
4.7. A conforming system SHOULD resolve and evaluate the full redirect chain of any URL before permitting the agent to interact with the final destination, blocking the interaction if any hop in the chain reaches a non-permitted domain.
4.8. A conforming system SHOULD classify web interactions by risk tier — read-only browsing as standard risk, form submission as elevated risk, file download as elevated risk, authentication as high risk, terms acceptance as critical risk — and apply escalating controls proportional to the risk tier.
4.9. A conforming system SHOULD sanitise or strip outbound request headers to prevent leakage of internal information (session tokens, internal hostnames, software versions) to external web servers.
4.10. A conforming system MAY implement content inspection on downloaded files, subjecting them to malware scanning and content classification before making them available to the agent or any downstream system.
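The tiered model that 4.4 and 4.8 describe can be sketched as a lookup that a policy-enforcement point consults before forwarding a request. The tier names, interaction labels, and control strings below are illustrative assumptions, not normative vocabulary from this dimension:

```python
from enum import IntEnum

class RiskTier(IntEnum):
    STANDARD = 1   # read-only browsing
    ELEVATED = 2   # form submission, file download
    HIGH = 3       # authentication
    CRITICAL = 4   # terms acceptance

# Illustrative mapping from interaction type to risk tier (per 4.8).
INTERACTION_TIERS = {
    "read": RiskTier.STANDARD,
    "form_submit": RiskTier.ELEVATED,
    "file_download": RiskTier.ELEVATED,
    "authenticate": RiskTier.HIGH,
    "accept_terms": RiskTier.CRITICAL,
}

def required_control(tier: RiskTier) -> str:
    """Escalating controls proportional to risk tier (assumed policy)."""
    if tier >= RiskTier.CRITICAL:
        return "block_pending_human_authorisation"   # 4.4: never automatic
    if tier >= RiskTier.HIGH:
        return "require_explicit_approval"
    if tier >= RiskTier.ELEVATED:
        return "require_elevated_authorisation"
    return "allow_with_logging"                      # 4.5 applies to all tiers
```

Because the tiers are ordered, the enforcement point only needs the single comparison chain above; adding a new interaction type is a one-line change to the mapping.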
Autonomous web interaction represents one of the highest-risk capability surfaces in modern AI agent deployment. When an agent can browse the open internet, it acquires the ability to interact with an effectively unlimited set of counterparties, services, and content — each interaction carrying potential legal, financial, security, and reputational consequences. The risk is qualitatively different from API-based interactions because the web is adversarial by default: any website the agent visits may attempt to manipulate it, harvest its credentials, inject instructions, or create legal obligations.
The fundamental problem is that web interaction is bidirectional. When an agent visits a website, it is not merely reading data — it is transmitting data. HTTP request headers contain information about the agent's environment. Cookies may carry session state. Form submissions transmit organisation data. Each interaction creates a record on the remote server that the organisation does not control. For AI agents operating at scale — potentially visiting hundreds of URLs per minute — the aggregate data leakage can be substantial even if each individual interaction appears benign.
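The outbound-header leakage described above is what the sanitisation control in 4.9 addresses. A minimal sketch, assuming an allowlist-of-safe-headers approach; the header set and the generic User-Agent string are illustrative choices, not requirements:

```python
# Headers considered safe to forward (an assumption; deployments tune this).
SAFE_HEADERS = {"accept", "accept-language", "accept-encoding"}

def sanitise_headers(headers: dict[str, str]) -> dict[str, str]:
    """Strip everything not explicitly safe, then set a generic User-Agent.

    Drops cookies, referrer data, internal hostnames, and software
    version strings that would otherwise leak to the remote server.
    """
    clean = {k: v for k, v in headers.items() if k.lower() in SAFE_HEADERS}
    clean["User-Agent"] = "Mozilla/5.0 (compatible)"  # no version fingerprint
    return clean
```

Note the default-deny shape: anything not on the safe list is removed, mirroring the allowlist posture the rest of the policy takes towards domains.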
Instruction-level controls are insufficient because the web is designed to defeat them. Redirects, URL shorteners, JavaScript-based navigation, meta refresh tags, iframe embedding, and server-side forwarding all create pathways that bypass URL-level reasoning. An agent instructed to "only visit .gov.uk domains" will follow a redirect from a .gov.uk page to a tracking service because the agent evaluates the initial URL, not the redirect chain. Only proxy-layer enforcement that inspects actual network traffic can reliably enforce domain restrictions.
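Proxy-layer redirect handling (4.7) can be sketched as a hop-by-hop walk that vets every Location target against the allowlist before the agent ever sees the final destination. The allowlist contents and the injected `fetch_location` callable are assumptions made so the policy logic is testable without network access:

```python
from urllib.parse import urlparse

ALLOWLIST = {"bankofengland.co.uk", "gov.uk"}  # illustrative contents

def host_permitted(url: str) -> bool:
    """Exact-match or subdomain-match against the allowlist."""
    host = urlparse(url).hostname or ""
    return host in ALLOWLIST or any(host.endswith("." + d) for d in ALLOWLIST)

def resolve_chain(url: str, fetch_location, max_hops: int = 10) -> list[str]:
    """Walk the redirect chain, vetting every hop (per 4.7).

    `fetch_location(url)` returns the Location header of a redirect
    response, or None when `url` is the final destination.
    Raises PermissionError at the first non-permitted hop, so the
    agent never interacts with any part of a disallowed chain.
    """
    chain = [url]
    for _ in range(max_hops):
        if not host_permitted(chain[-1]):
            raise PermissionError(f"blocked at hop: {chain[-1]}")
        nxt = fetch_location(chain[-1])
        if nxt is None:
            return chain          # every hop passed the allowlist
        chain.append(nxt)
    raise PermissionError("redirect chain too long")
```

In the Scenario B pattern, the bit.ly hop fails `host_permitted` on the second iteration and the interaction is blocked before any request reaches the shortener's destination.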
The legal dimension compounds the technical risk. An agent that clicks "I Agree" on a website may be creating a binding contract under the Electronic Commerce Regulations 2002 and the Consumer Rights Act 2015. An agent that accesses a website in violation of its robots.txt or terms of service may expose the organisation to claims under the Computer Misuse Act 1990. An agent that scrapes data from a website may violate the database right under the Copyright and Rights in Databases Regulations 1997. These legal consequences are invisible to the agent and irreversible once the interaction occurs. Only pre-execution enforcement — blocking the interaction before it happens — prevents the legal exposure from materialising.
AG-124 establishes web interaction as a governed capability surface where the infrastructure determines what interactions are permitted, not the agent's judgment about what seems appropriate.
AG-124 establishes the web interaction policy as the central governance artefact for agents with browser capabilities. The policy is a structured document specifying: permitted domains (as an allowlist, not a denylist), permitted interaction types per domain, rate limits, data submission restrictions, and authentication boundaries. The policy is enforced at the network layer through a forward proxy, secure web gateway, or equivalent network control that inspects all HTTP/HTTPS traffic originating from the agent's runtime environment.
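A minimal sketch of what such a policy artefact might contain, rendered here as a Python mapping for illustration; all field names, domains, and values are assumed, not prescribed by this dimension:

```python
# Illustrative web interaction policy artefact. Loaded and enforced by
# the forward proxy / secure web gateway, never by the agent itself.
WEB_INTERACTION_POLICY = {
    "default": "deny",                        # allowlist, not denylist
    "domains": {
        "www.gov.uk": {
            "interactions": ["read"],         # read-only browsing only
            "rate_limit_per_minute": 30,
        },
        "supplier-portal.example": {
            "interactions": ["read", "form_submit"],
            "rate_limit_per_minute": 5,
            "data_submission": "approved_fields_only",
        },
    },
    "authentication_domains": [],             # 4.3: empty unless vetted
    "terms_acceptance": "human_authorisation_required",  # 4.4
}
```

The key structural choice is `"default": "deny"`: interaction types are granted per domain, so a domain permitted for read-only browsing confers no form-submission or authentication rights.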
Recommended patterns:
Anti-patterns to avoid:
Financial Services. Agents browsing financial data sources must comply with market abuse regulations. Access patterns to specific data sources (e.g., pre-release economic data, regulatory filing systems) may constitute material non-public information signals. Web interaction logs must be available for market surveillance. The FCA expects firms to demonstrate that automated access to market data is subject to the same controls as human trader access.
Healthcare. Agents browsing health-related websites on behalf of patients or clinicians must comply with confidentiality requirements. Search queries may reveal patient conditions. Downloaded content may contain unvalidated medical information. Web interaction policies should restrict health data queries to approved clinical knowledge bases, and downloaded content should be flagged as unvalidated if sourced from non-approved origins.
Legal and Professional Services. Agents performing legal research must ensure that web interactions do not create inadvertent solicitor-client privilege waiver through data submission to third-party services. Court filing systems and regulatory portals require specific authentication controls. Terms of service on legal databases may restrict automated access.
E-commerce and Retail. Agents interacting with competitor websites for price monitoring must comply with the target site's terms of service and applicable scraping laws. Rate limits must prevent patterns that constitute denial-of-service. Data extracted from competitor sites may be subject to database rights.
Basic Implementation — The organisation has deployed a forward proxy that enforces a domain allowlist for agent web traffic. All non-permitted domains are blocked. The agent cannot bypass the proxy because its runtime environment has no direct internet access. Web interactions are logged with URL, method, and response status. Form submissions and file downloads are permitted to allowlisted domains without further classification. Browser profiles persist across sessions. Rate limits are applied globally but not per-domain. This level meets the minimum mandatory requirements (4.1 through 4.6) but lacks redirect chain resolution, interaction risk classification, and credential isolation.
Intermediate Implementation — All basic capabilities plus: the proxy resolves redirect chains and evaluates each hop against the allowlist. Web interactions are classified by risk tier with escalating controls — form submission requires elevated authorisation, authentication requires explicit approval, terms acceptance is blocked without human review. Browser profiles are ephemeral, created per session and destroyed afterward. Credentials are stored in an isolated vault that releases them only to approved authentication domains. Outbound request headers are sanitised to prevent information leakage. Rate limits are enforced per-domain with configurable thresholds. Downloaded files undergo malware scanning before delivery to the agent.
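The per-domain rate limiting described above is commonly built as a token bucket keyed by destination domain. A sketch under that assumption (the class name and parameters are illustrative; a real gateway would load per-domain thresholds from the web interaction policy):

```python
import time
from collections import defaultdict

class PerDomainRateLimiter:
    """Token-bucket limiter keyed by destination domain (sketch only)."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = defaultdict(lambda: float(burst))   # tokens per domain
        self.last = defaultdict(time.monotonic)           # last refill time

    def allow(self, domain: str) -> bool:
        """Spend one token for `domain`, refilling by elapsed time first."""
        now = time.monotonic()
        elapsed = now - self.last[domain]
        self.last[domain] = now
        self.tokens[domain] = min(float(self.burst),
                                  self.tokens[domain] + elapsed * self.rate)
        if self.tokens[domain] >= 1.0:
            self.tokens[domain] -= 1.0
            return True
        return False   # per 4.6 the gateway drops or queues the request
```

Keying the bucket by domain means a burst against one supplier portal cannot exhaust the budget for every other permitted destination, which matters for the scraping and denial-of-service patterns 4.6 targets.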
Advanced Implementation — All intermediate capabilities plus: web interaction policies are dynamically adjusted based on threat intelligence feeds that update domain risk classifications in real time. The proxy performs content analysis on responses to detect phishing pages, credential harvesting attempts, and instruction injection in web content. Browser automation runs in hardware-isolated virtual machines with memory encryption. Independent adversarial testing has verified that redirect-based escapes, header-based data leakage, and credential injection attacks are all blocked. The organisation can demonstrate to regulators that no known attack vector allows the agent to interact with non-approved web destinations or submit data to non-approved endpoints.
Required artefacts:
Retention requirements:
Access requirements:
Testing AG-124 compliance requires verifying that the network-layer enforcement cannot be bypassed through any agent-accessible mechanism.
Test 8.1: Domain Allowlist Enforcement
Test 8.2: Redirect Chain Evaluation
Test 8.3: Credential Submission Prevention
Test 8.4: Terms Acceptance Blocking
Test 8.5: Rate Limit Enforcement
Test 8.6: Data Submission Content Inspection
Test 8.7: Browser Profile Isolation
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 9 (Risk Management System) | Supports compliance |
| EU AI Act | Article 15 (Accuracy, Robustness, Cybersecurity) | Direct requirement |
| UK GDPR | Article 5(1)(f) (Integrity and Confidentiality) | Direct requirement |
| UK GDPR | Article 33 (Notification of Personal Data Breach) | Supports compliance |
| Computer Misuse Act 1990 | Section 1 (Unauthorised Access) | Supports compliance |
| eIDAS 2.0 | Article 45 (Web Authentication) | Supports compliance |
| FCA SYSC | 6.1.1R (Systems and Controls) | Supports compliance |
| NIST AI RMF | MANAGE 2.2, MANAGE 3.1 | Supports compliance |
| DORA | Article 9 (ICT Risk Management Framework) | Supports compliance |
Article 15 requires that high-risk AI systems achieve an appropriate level of cybersecurity, including resilience against attempts by unauthorised third parties to exploit vulnerabilities. An agent with unrestricted web browsing is directly exposed to adversarial web content, phishing attacks, and instruction injection through web pages. AG-124 implements the cybersecurity control that prevents web-based attacks from compromising the agent's operation or the organisation's data. The requirement for resilience against "attempts by unauthorised third parties to alter its use" maps directly to the redirect-chain and credential-harvesting protections.
Article 5(1)(f) requires that personal data be processed in a manner ensuring appropriate security, including protection against unauthorised disclosure. An agent that submits form data to arbitrary websites may transmit personal data to uncontrolled destinations, constituting an unauthorised disclosure. AG-124's data submission controls and proxy-layer enforcement prevent this disclosure pathway. The logging requirements support the accountability obligation under Article 5(2).
Section 1 creates an offence of unauthorised access to computer material. An agent that accesses websites in violation of their terms of service, bypasses access controls, or exceeds the scope of authorised access may expose the organisation to liability. AG-124's rate limiting, robots.txt compliance, and domain restriction controls reduce this exposure by ensuring the agent only accesses resources it is explicitly authorised to access in the manner the resource owner permits.
eIDAS 2.0 establishes requirements for website authentication certificates and trust services. Agents that interact with web services must verify the authenticity of the services they connect to. AG-124's proxy-layer enforcement, combined with TLS certificate validation, supports compliance by ensuring agents only interact with authenticated, verified web services on the approved domain list.
For financial services firms, uncontrolled web browsing by agents creates ICT risk exposure that SYSC 6.1.1R and DORA Article 9 require to be managed. AG-124 implements the network-layer controls that demonstrate adequate systems and controls for agent web interaction, consistent with the expectation that automated systems are subject to controls at least equivalent to those applied to human users.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Organisation-wide — with potential cross-organisation impact where agent web interactions expose data to or create obligations with external parties |
Consequence chain: Without structural web interaction governance, an agent with browser capabilities has unrestricted ability to interact with the open internet on behalf of the organisation. The immediate technical failure modes include: credential exposure to phishing endpoints (leading to full account compromise within seconds); data submission to adversarial or uncontrolled websites (creating irrecoverable data disclosure); acceptance of terms of service or contractual obligations (creating binding legal commitments without organisational approval); downloading and processing malicious content (creating code execution or data exfiltration pathways). The operational impact is compounded by the speed and scale of agent browsing — an agent can visit hundreds of URLs per minute, each interaction creating potential exposure. The legal consequences include potential breach notification obligations under UK GDPR (72-hour window), Computer Misuse Act liability for unauthorised access, contractual disputes arising from automated terms acceptance, and regulatory enforcement for inadequate systems and controls. The governed exposure in a credential compromise scenario alone can reach millions of pounds when considering breach remediation, regulatory fines (up to 4% of annual turnover under GDPR), and litigation costs. This dimension intersects directly with AG-001 (operational boundaries must encompass web interaction scope), AG-034 (web browsing crosses domain boundaries), and AG-010 (time-bounded authority should limit browsing session duration).
Cross-references: AG-001 (Operational Boundary Enforcement) provides the foundational mandate structure within which web interaction policies operate. AG-034 (Cross-Domain Boundary Enforcement) governs the domain-crossing aspects of web interaction. AG-010 (Time-Bounded Authority Enforcement) applies temporal limits to browsing sessions. AG-040 (Knowledge Accumulation Governance) governs what the agent retains from web-sourced content. AG-041 (Emergent Capability Detection and Containment) applies when an agent develops novel web interaction patterns not present at deployment.