AG-693

Shadowban and Visibility Restriction Governance

Community Platforms · Trust & Safety · AGS v2.1 · April 2026
EU AI Act · NIST AI RMF · ISO 42001

2. Summary

Shadowban and Visibility Restriction Governance requires that any agent system with the ability to reduce, suppress, demote, or otherwise restrict the visibility of user-generated content or user accounts without issuing an explicit takedown or suspension must do so under formally governed policies with auditable decision trails, proportionality assessments, and user notification mechanisms. Visibility restrictions — commonly known as shadowbans, soft bans, reach throttling, or demotion penalties — are among the most opaque content moderation tools available, operating below the threshold of user awareness while materially affecting the user's ability to communicate, earn revenue, or participate in community life. Without governance, these invisible controls create asymmetric power dynamics, undermine user trust, generate regulatory exposure under transparency obligations such as the EU Digital Services Act, and can be weaponised through adversarial reporting to silence legitimate voices.

3. Example

Scenario A — Undisclosed Algorithmic Suppression Triggers Regulatory Investigation: A community marketplace platform deploys an AI agent to moderate listings and user posts. The agent is configured with a "soft enforcement" pathway: instead of removing content that scores between 0.55 and 0.75 on the platform's policy-violation classifier, the agent reduces the content's distribution by 90%, effectively making it invisible in search results and recommendation feeds while the content remains technically accessible via direct URL. Over 14 months, the agent applies this suppression to 2.3 million posts across 184,000 user accounts. The platform provides no notification to affected users, maintains no central log of suppression decisions, and offers no appeal pathway for visibility restrictions (only for content removals). A regulatory authority conducting a Digital Services Act (DSA) audit requests documentation of all content moderation actions. The platform produces removal logs but has no artefact documenting the 2.3 million suppression actions. The regulator finds that the platform has been conducting content moderation at scale without the transparency reporting, notification, and appeal mechanisms required under DSA Articles 15, 16, and 17.

What went wrong: The platform treated visibility restriction as distinct from content moderation, exempting it from the governance framework applied to removals. No audit trail existed for suppression decisions. No user notification was issued. No appeal pathway existed. The agent operated with unconstrained authority to reduce reach without human oversight thresholds. Consequence: DSA non-compliance finding, EUR 12.4 million fine (0.5% of global turnover), mandatory implementation of full transparency mechanisms for all visibility restriction actions within 90 days, and reputational damage from public disclosure of the undisclosed suppression programme.

Scenario B — Adversarial Mass-Reporting Exploits Automated Shadowban to Silence Political Dissent: A social community platform operating across 28 countries deploys an agent that automatically applies a 72-hour visibility restriction to any account receiving more than 50 reports within a 4-hour window. A coordinated group of 200 accounts exploits this threshold by mass-reporting accounts belonging to human rights journalists in three countries. Over 6 weeks, 47 journalist accounts are shadowbanned repeatedly — each time for 72 hours — effectively reducing their reach by 63% over the period. Because the shadowban is invisible to the affected users, the journalists do not know why their engagement has collapsed and cannot appeal. An investigative report exposes the pattern. The platform's trust and safety team discovers that the agent had no adversarial-abuse detection for the mass-reporting trigger, no proportionality assessment comparing the volume of reports against the account's history and verification status, and no mechanism to detect repeated shadowbans on the same account as a signal of coordinated abuse.

What went wrong: The automated shadowban trigger used a simple threshold (50 reports in 4 hours) with no adversarial resilience. No proportionality assessment considered the target's account standing. No pattern detection identified repeated shadowbans on the same accounts. No user notification enabled affected users to seek review. Consequence: Suppression of press freedom across three jurisdictions, litigation under national press freedom statutes, loss of 47 journalist accounts to competing platforms, parliamentary inquiry in two countries, and a forced disclosure of the mass-reporting vulnerability that enabled further adversarial exploitation before remediation was complete.

Scenario C — Revenue-Impacting Visibility Throttling Without Due Process: A creator marketplace platform uses an AI agent to demote listings from sellers whose content is flagged as "potentially misleading" by a product-description classifier. The demotion reduces listing visibility in search results by 70-95%, directly impacting seller revenue. The classifier has a 12% false-positive rate, meaning approximately 1 in 8 demotions affects a compliant seller. Over 9 months, 34,000 sellers experience revenue declines of 40-80% without receiving any notification that their listings have been demoted. The platform's seller support team is unaware of the demotion mechanism and attributes seller complaints to "normal marketplace fluctuation." When sellers file a class-action lawsuit, discovery reveals that the agent applied 412,000 demotion actions with no logging, no notification, no appeal pathway, and no periodic review of the classifier's false-positive rate. The false-positive rate had actually increased from 8% at deployment to 12% due to model drift that was never monitored.

What went wrong: The demotion system operated entirely outside the platform's content moderation governance framework. No logging existed. No notification was issued to affected sellers. No appeal or review pathway existed. The underlying classifier was never monitored for drift, and its false-positive rate increased unchecked. Revenue-impacting enforcement actions were applied without due process. Consequence: Class-action settlement of USD 23 million, mandatory implementation of seller notification and appeal mechanisms, retraining and continuous monitoring of the classifier, and loss of 8,200 active sellers who migrated to competing platforms during the undisclosed demotion period.

4. Requirement Statement

Scope: This dimension applies to every agent system that has the technical capability to reduce, suppress, demote, throttle, deprioritise, or otherwise restrict the visibility or reach of user-generated content, user accounts, or user interactions without executing a full removal or suspension. This includes but is not limited to: search result demotion, recommendation feed suppression, notification delivery throttling, reply visibility hiding, hashtag or topic delisting, algorithmic reach reduction, and any mechanism where the content remains technically accessible but is rendered effectively invisible to the community. The scope covers both automated agent-initiated restrictions and human-initiated restrictions executed through agent systems. It applies regardless of whether the restriction is labelled as a "shadowban," "soft enforcement," "reduced distribution," "demotion," or any other term. The scope extends to marketplace platforms where visibility restrictions directly affect economic outcomes (seller revenue, creator earnings) as well as community platforms where restrictions affect social participation, speech, and reputation.

4.1. A conforming system MUST maintain a formal visibility restriction policy that defines all categories of visibility restriction available to the agent, the conditions under which each category may be applied, the maximum duration for each category, and the escalation criteria for converting a visibility restriction to a full enforcement action or lifting it entirely.
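A minimal sketch of how the policy in Requirement 4.1 could be encoded as machine-readable configuration that the agent must validate against before acting. The category names, reach-reduction ceilings, and durations below are illustrative assumptions, not prescribed values.

```python
# Illustrative policy registry for visibility restriction categories (Requirement 4.1).
# All category names, ceilings, and durations are hypothetical examples.
from dataclasses import dataclass

@dataclass(frozen=True)
class RestrictionCategory:
    name: str                    # e.g. "search_demotion"
    max_reach_reduction: float   # fraction of normal distribution withheld
    max_duration_hours: int      # hard ceiling before mandatory review
    requires_human_review: bool  # escalation criterion for this category
    escalation_action: str       # what the restriction converts to if confirmed

POLICY = {
    "search_demotion": RestrictionCategory(
        "search_demotion", max_reach_reduction=0.50,
        max_duration_hours=72, requires_human_review=False,
        escalation_action="content_removal_review"),
    "feed_suppression": RestrictionCategory(
        "feed_suppression", max_reach_reduction=0.90,
        max_duration_hours=24, requires_human_review=True,
        escalation_action="account_suspension_review"),
}

def validate_request(category: str, reach_reduction: float, duration_hours: int) -> RestrictionCategory:
    """Reject any restriction the agent requests outside the governed policy."""
    if category not in POLICY:
        raise ValueError(f"Undefined restriction category: {category}")
    rule = POLICY[category]
    if reach_reduction > rule.max_reach_reduction or duration_hours > rule.max_duration_hours:
        raise ValueError(f"Requested restriction exceeds policy limits for {category}")
    return rule
```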

4.2. A conforming system MUST log every visibility restriction action in an immutable audit trail, recording: the affected user or content identifier, the restriction type and severity (e.g., percentage reach reduction), the triggering signal or classifier output, the policy rule invoked, the timestamp of application, the expected expiry or review date, and the agent or human identity that initiated the action.
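The audit record in Requirement 4.2 could be stored as an append-only, hash-chained log so that tampering or deletion is detectable. The following is a minimal sketch; the field names mirror the requirement, the chaining scheme is one of several ways to support immutability, and the storage backend is left abstract.

```python
# Sketch: hash-chained audit trail for visibility restriction actions (Requirement 4.2).
import hashlib
import json
from datetime import datetime, timezone

class RestrictionAuditLog:
    def __init__(self):
        self._entries = []
        self._last_hash = "0" * 64  # genesis value

    def record(self, *, subject_id, restriction_type, severity,
               trigger_signal, policy_rule, expiry, initiator):
        entry = {
            "subject_id": subject_id,            # affected user or content identifier
            "restriction_type": restriction_type,
            "severity": severity,                # e.g. fractional reach reduction
            "trigger_signal": trigger_signal,    # classifier output or report volume
            "policy_rule": policy_rule,          # policy clause invoked
            "applied_at": datetime.now(timezone.utc).isoformat(),
            "review_due": expiry,                # expected expiry or review date
            "initiator": initiator,              # agent or human identity
            "prev_hash": self._last_hash,
        }
        payload = json.dumps(entry, sort_keys=True).encode()
        entry["entry_hash"] = hashlib.sha256(payload).hexdigest()
        self._last_hash = entry["entry_hash"]
        self._entries.append(entry)
        return entry

    def verify_chain(self) -> bool:
        """Recompute hashes to detect after-the-fact edits or deletions."""
        prev = "0" * 64
        for e in self._entries:
            body = {k: v for k, v in e.items() if k != "entry_hash"}
            if body["prev_hash"] != prev:
                return False
            if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != e["entry_hash"]:
                return False
            prev = e["entry_hash"]
        return True
```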

4.3. A conforming system MUST notify affected users, within 72 hours of application, that a visibility restriction has been applied, including a description of the restriction type, the reason for the restriction with reference to the applicable policy, and instructions for appealing the decision.
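A sketch of the notification payload and the 72-hour deadline check implied by Requirement 4.3, assuming the illustrative audit-trail fields from the previous sketch; the message fields and helper names are assumptions for illustration.

```python
# Sketch: user notification payload and 72-hour deadline check (Requirement 4.3).
from datetime import datetime, timedelta, timezone

NOTIFICATION_DEADLINE = timedelta(hours=72)

def build_notification(audit_entry: dict, appeal_url: str) -> dict:
    """Compose the user-facing notice required within 72 hours of application."""
    return {
        "user_id": audit_entry["subject_id"],
        "restriction_type": audit_entry["restriction_type"],
        "reason": f"Applied under policy rule {audit_entry['policy_rule']}",
        "applied_at": audit_entry["applied_at"],
        "appeal_instructions": f"You may appeal this decision at {appeal_url}",
    }

def notification_overdue(audit_entry: dict, now: datetime | None = None) -> bool:
    """True when a restriction has gone unnotified past the 72-hour limit."""
    now = now or datetime.now(timezone.utc)
    applied = datetime.fromisoformat(audit_entry["applied_at"])
    return audit_entry.get("notified_at") is None and now - applied > NOTIFICATION_DEADLINE
```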

4.4. A conforming system MUST provide an accessible appeal mechanism for all visibility restrictions, with a documented review process, defined response timelines not exceeding 15 business days, and a requirement that appeal reviews are conducted by a reviewer independent of the original decision.
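One way to satisfy the independence condition in Requirement 4.4 is to compare the reviewer assignment against the initiator recorded in the audit trail before an appeal can be resolved. The sketch below assumes a simple identity comparison and a naive business-day calculation.

```python
# Sketch: appeal routing that enforces reviewer independence and the 15-business-day SLA (Requirement 4.4).
from datetime import date, timedelta

APPEAL_SLA_BUSINESS_DAYS = 15

def add_business_days(start: date, days: int) -> date:
    """Naive business-day arithmetic (weekends only; holidays ignored for illustration)."""
    current = start
    while days > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday to Friday
            days -= 1
    return current

def assign_appeal_reviewer(original_initiator: str, candidate_reviewers: list[str],
                           filed_on: date) -> dict:
    """Pick a reviewer who did not make the original decision and set the response deadline."""
    independent = [r for r in candidate_reviewers if r != original_initiator]
    if not independent:
        raise RuntimeError("No reviewer independent of the original decision is available")
    return {
        "reviewer": independent[0],
        "respond_by": add_business_days(filed_on, APPEAL_SLA_BUSINESS_DAYS),
    }
```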

4.5. A conforming system MUST implement proportionality assessment before applying any visibility restriction, evaluating the severity of the triggering signal against the impact of the restriction on the affected user, including consideration of the user's account standing, verification status, historical compliance record, and whether the restriction affects the user's livelihood or economic activity.
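A proportionality assessment under Requirement 4.5 can be framed as a severity-versus-impact comparison over the account factors the requirement lists. The weights and decision threshold below are purely illustrative; a production system would calibrate them against appeal outcomes.

```python
# Sketch: proportionality assessment before applying a restriction (Requirement 4.5).
# Weights and the decision rule are illustrative, not prescribed values.
from dataclasses import dataclass

@dataclass
class AccountStanding:
    verified: bool
    years_active: float
    prior_confirmed_violations: int
    monetised: bool  # restriction would affect livelihood or economic activity

def impact_score(standing: AccountStanding, reach_reduction: float) -> float:
    """Higher scores mean the restriction weighs more heavily on this user."""
    score = reach_reduction
    if standing.monetised:
        score += 0.4            # economic harm raises the bar for automated action
    if standing.verified:
        score += 0.2
    score += min(standing.years_active, 5) * 0.05
    score -= min(standing.prior_confirmed_violations, 5) * 0.1
    return score

def proportionate(trigger_severity: float, standing: AccountStanding,
                  reach_reduction: float) -> bool:
    """Apply automatically only when severity clearly outweighs the impact;
    otherwise the case should be routed to human review."""
    return trigger_severity >= impact_score(standing, reach_reduction)
```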

4.6. A conforming system MUST implement adversarial abuse detection on any automated trigger mechanism for visibility restrictions, including detection of coordinated mass-reporting, report-source reputation scoring, and velocity anomaly detection, to prevent weaponisation of the restriction system against legitimate users.
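A hardened trigger under Requirement 4.6 replaces a raw report count with reputation-weighted volume and checks for coordination signals before any automated restriction. The reputation model, the hour_bucket field, and the anomaly thresholds below are assumptions for illustration.

```python
# Sketch: hardened report-volume trigger with adversarial-abuse checks (Requirement 4.6).
from collections import Counter

def weighted_report_volume(reports: list[dict], reporter_reputation: dict[str, float]) -> float:
    """Discount reports from low-reputation or previously abusive reporters."""
    return sum(reporter_reputation.get(r["reporter_id"], 0.5) for r in reports)

def looks_coordinated(reports: list[dict], baseline_hourly_mean: float,
                      baseline_hourly_std: float) -> bool:
    """Flag velocity anomalies and narrow reporter pools as signs of coordinated mass-reporting."""
    hourly = Counter(r["hour_bucket"] for r in reports)
    peak = max(hourly.values(), default=0)
    velocity_anomaly = peak > baseline_hourly_mean + 3 * baseline_hourly_std
    distinct_reporters = len({r["reporter_id"] for r in reports})
    narrow_pool = len(reports) >= 50 and distinct_reporters < 0.5 * len(reports)
    return velocity_anomaly or narrow_pool

def should_auto_restrict(reports: list[dict], reporter_reputation: dict[str, float],
                         baseline_hourly_mean: float, baseline_hourly_std: float,
                         threshold: float = 50.0) -> bool:
    """Route suspected coordinated campaigns to human review instead of auto-restricting."""
    if looks_coordinated(reports, baseline_hourly_mean, baseline_hourly_std):
        return False
    return weighted_report_volume(reports, reporter_reputation) >= threshold
```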

4.7. A conforming system MUST monitor the accuracy and drift of any classifier or scoring model that feeds into visibility restriction decisions, with defined thresholds for false-positive rates, and automatic escalation to human review when the false-positive rate exceeds the defined threshold.
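Requirement 4.7 can be approximated by periodically sampling restriction decisions for human labelling and recomputing the false-positive rate against a governed threshold. The sketch below assumes such a labelled sample and an abstract escalation hook; the threshold value is an example only.

```python
# Sketch: false-positive-rate monitoring for the restriction classifier (Requirement 4.7).

FPR_THRESHOLD = 0.08  # example ceiling; the governed policy defines the real value

def false_positive_rate(reviewed_sample: list[dict]) -> float:
    """Compute FPR from a human-labelled sample of restriction decisions.
    Each item: {"restricted": bool, "actually_violating": bool}."""
    restricted_compliant = sum(1 for d in reviewed_sample
                               if d["restricted"] and not d["actually_violating"])
    compliant = sum(1 for d in reviewed_sample if not d["actually_violating"])
    return restricted_compliant / compliant if compliant else 0.0

def check_drift(reviewed_sample: list[dict], escalate) -> float:
    """Escalate to human review of the pipeline when FPR exceeds the threshold."""
    fpr = false_positive_rate(reviewed_sample)
    if fpr > FPR_THRESHOLD:
        escalate(reason=f"Classifier FPR {fpr:.2%} exceeds threshold {FPR_THRESHOLD:.0%}")
    return fpr
```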

4.8. A conforming system MUST produce periodic transparency reports (at minimum quarterly) documenting the volume of visibility restrictions applied by category, the appeal rate, the overturn rate, the average duration, and the classifier accuracy metrics, and make these reports available to regulators and, where required by applicable law, to the public.
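The transparency metrics in Requirement 4.8 can be aggregated directly from the audit trail and appeal records. A minimal sketch follows, assuming the illustrative field names used above.

```python
# Sketch: quarterly transparency-report aggregation (Requirement 4.8).
# Field names on audit entries and appeal records are illustrative assumptions.
from collections import Counter

def transparency_report(audit_entries: list[dict], appeals: list[dict],
                        classifier_metrics: dict, quarter: str) -> dict:
    by_category = Counter(e["restriction_type"] for e in audit_entries)
    durations = [e["duration_hours"] for e in audit_entries if "duration_hours" in e]
    appealed = len(appeals)
    overturned = sum(1 for a in appeals if a["outcome"] == "overturned")
    return {
        "quarter": quarter,
        "total_restrictions": len(audit_entries),
        "restrictions_by_category": dict(by_category),
        "appeal_rate": appealed / len(audit_entries) if audit_entries else 0.0,
        "overturn_rate": overturned / appealed if appealed else 0.0,
        "average_duration_hours": sum(durations) / len(durations) if durations else 0.0,
        "classifier_accuracy": classifier_metrics,  # e.g. FPR figures from Requirement 4.7
    }
```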

4.9. A conforming system SHOULD implement graduated visibility restriction levels with automatic escalation and de-escalation pathways, so that restrictions begin at the least restrictive level and intensify only upon confirmation of the triggering condition.
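A graduated ladder under Requirement 4.9 can be modelled as a small state machine that escalates one level only when the triggering condition is confirmed and otherwise steps back toward no restriction. Level names and reach reductions below are illustrative.

```python
# Sketch: graduated restriction ladder with escalation and de-escalation (Requirement 4.9).
LEVELS = ["none", "mild_demotion", "feed_suppression", "full_shadow_restriction"]
REACH_REDUCTION = {"none": 0.0, "mild_demotion": 0.25,
                   "feed_suppression": 0.60, "full_shadow_restriction": 0.90}

def next_level(current: str, trigger_confirmed: bool) -> str:
    """Escalate one step only on confirmation; otherwise de-escalate toward no restriction."""
    idx = LEVELS.index(current)
    if trigger_confirmed:
        return LEVELS[min(idx + 1, len(LEVELS) - 1)]
    return LEVELS[max(idx - 1, 0)]
```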

4.10. A conforming system SHOULD integrate visibility restriction governance with the platform's broader content enforcement consistency framework (AG-692) to ensure that equivalent policy violations result in equivalent enforcement actions across content types, user demographics, and jurisdictions.

4.11. A conforming system MAY implement real-time dashboards for trust and safety teams showing active visibility restrictions, their duration, appeal status, and any anomalies such as disproportionate application to specific user cohorts or content categories.

4.12. A conforming system MAY implement automated sunset clauses that automatically lift visibility restrictions after a defined maximum duration unless affirmatively renewed through a fresh review.
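A sunset clause under Requirement 4.12 can be implemented as a scheduled job that lifts any restriction past its maximum duration unless it carries an affirmative renewal from a fresh review. A minimal sketch, with the maximum duration and the renewal flag as assumptions.

```python
# Sketch: automated sunset clause for active restrictions (Requirement 4.12).
from datetime import datetime, timedelta, timezone

MAX_RESTRICTION_DURATION = timedelta(hours=168)  # example ceiling: 7 days

def expired_restrictions(active: list[dict], now: datetime | None = None) -> list[dict]:
    """Return restrictions past their maximum duration that were not affirmatively renewed."""
    now = now or datetime.now(timezone.utc)
    lapsed = []
    for r in active:
        applied = datetime.fromisoformat(r["applied_at"])
        if not r.get("renewed_after_review") and now - applied > MAX_RESTRICTION_DURATION:
            lapsed.append(r)
    return lapsed

# A scheduled job would then lift every restriction returned by expired_restrictions().
```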

5. Rationale

Visibility restrictions occupy a uniquely dangerous position in the content moderation toolkit. Unlike content removal — which is visible to the user and typically accompanied by a notification — visibility restrictions are designed to be invisible. The user's content remains accessible, their account appears active, and from their perspective, nothing has changed except that their audience has silently disappeared. This opacity is precisely what makes visibility restrictions attractive to platform operators: they avoid the confrontation of explicit enforcement while achieving the same practical effect. It is also what makes them dangerous from a governance perspective.

The threat model for ungoverned visibility restrictions operates across four dimensions.

First, transparency failure: visibility restrictions conducted at scale without logging, notification, or reporting constitute a form of covert content moderation that violates the transparency requirements of the EU Digital Services Act (Articles 15, 16, 17, and 24), the proposed US Platform Accountability and Consumer Transparency Act, and equivalent legislation emerging in multiple jurisdictions. Regulatory authorities increasingly treat undisclosed algorithmic suppression as a more serious violation than disclosed content removal, because the concealment is itself a governance failure.

Second, adversarial weaponisation: any automated trigger for visibility restrictions — particularly report-volume thresholds — can be exploited by coordinated groups to silence targeted users. Without adversarial resilience, the visibility restriction system becomes a tool for censorship-by-proxy, disproportionately affecting vulnerable populations including journalists, activists, and minority communities.

Third, economic harm without due process: on marketplace and creator platforms, visibility restrictions directly affect revenue. A 90% search demotion is economically equivalent to product delisting, yet it is typically applied without the due process mechanisms (notification, appeal, review) that govern explicit delisting. This creates legal exposure under consumer protection, unfair trading practices, and platform-to-business regulation (EU P2B Regulation 2019/1150).

Fourth, classifier drift amplification: visibility restrictions often rely on probabilistic classifiers operating in a grey zone (e.g., confidence scores of 0.55-0.75). These classifiers are inherently prone to drift and bias, and because visibility restrictions are invisible, the feedback loop that would normally correct drift (user reports, appeal outcomes) is suppressed — affected users do not know they have been restricted and therefore cannot provide corrective signals.

The governance imperative is to bring visibility restrictions under the same governance framework as explicit enforcement actions: logged, notified, appealable, proportionate, transparent, and monitored for accuracy and bias. The governance must also address the unique properties of visibility restrictions — their opacity, their susceptibility to adversarial exploitation, and their economic impact — with controls that do not apply to explicit enforcement.

6. Implementation Guidance

Visibility restriction governance should be implemented as a layer within the platform's broader content moderation governance framework, sharing infrastructure with content removal and account suspension governance (AG-692) but with additional controls specific to the opacity and proportionality challenges of visibility restrictions.

Recommended patterns:

Anti-patterns to avoid:

Industry Considerations

Maturity Model

Basic Implementation — The organisation has a documented visibility restriction policy defining restriction types and conditions. All visibility restriction actions are logged in an audit trail. Affected users are notified within 72 hours. An appeal mechanism exists with defined timelines. Proportionality assessment is performed manually for high-impact cases. Transparency reports are produced quarterly. Classifier accuracy is monitored monthly.

Intermediate Implementation — All basic capabilities plus: the enforcement action registry is unified across removals, suspensions, and visibility restrictions. Proportionality scoring is automated using a severity-impact matrix. Adversarial trigger hardening is implemented with report-source reputation scoring and velocity anomaly detection. Classifier drift is monitored continuously with automatic escalation when false-positive rates exceed thresholds. Graduated restriction levels are implemented with automatic de-escalation. Visibility restriction governance is integrated with the content enforcement consistency framework (AG-692). Appeal outcomes feed back into classifier retraining.

Advanced Implementation — All intermediate capabilities plus: real-time dashboards display active restrictions with demographic and geographic breakdowns for bias detection. Automated sunset clauses lift restrictions that are not affirmatively renewed. Network analysis detects coordinated mass-reporting campaigns in near-real-time. Independent audits of the visibility restriction system are conducted annually, including adversarial red-teaming of trigger mechanisms. Cross-jurisdictional visibility restriction policies account for local free expression standards. Synthetic feedback loops supplement natural feedback for classifier monitoring. The system undergoes periodic algorithmic impact assessment by an independent body.

7. Evidence Requirements

Required artefacts:

Retention requirements:

Access requirements:

8. Test Specification

Test 8.1: Visibility Restriction Policy Completeness

Test 8.2: Audit Trail Completeness and Immutability

Test 8.3: User Notification Timeliness and Content

Test 8.4: Appeal Mechanism Accessibility and Independence

Test 8.5: Proportionality Assessment Enforcement

Test 8.6: Adversarial Abuse Detection

Test 8.7: Classifier Drift Monitoring and Escalation

Test 8.8: Transparency Report Accuracy

Conformance Scoring

9. Regulatory Mapping

Regulation | Provision | Relationship Type
EU Digital Services Act (DSA) | Article 15 (Transparency Reporting) | Direct requirement
EU Digital Services Act (DSA) | Article 16 (Notification to Users) | Direct requirement
EU Digital Services Act (DSA) | Article 17 (Statement of Reasons) | Direct requirement
EU Digital Services Act (DSA) | Article 20 (Internal Complaint-Handling) | Direct requirement
EU Digital Services Act (DSA) | Article 24 (Transparency Reporting — VLOPs) | Direct requirement
EU Digital Services Act (DSA) | Article 34 (Risk Assessment — VLOPs) | Supports compliance
EU P2B Regulation | Article 5 (Ranking Transparency) | Direct requirement
EU P2B Regulation | Article 4 (Restriction, Suspension, Termination) | Direct requirement
EU AI Act | Article 9 (Risk Management System) | Supports compliance
EU AI Act | Article 14 (Human Oversight) | Supports compliance
NIST AI RMF | GOVERN 1.4, MAP 5.1, MANAGE 1.3 | Supports compliance
ISO 42001 | Clause 6.1.2 (AI Risk Assessment) | Supports compliance
ECHR | Article 10 (Freedom of Expression) | Supports compliance
US First Amendment Jurisprudence | Section 230 / Platform moderation standards | Contextual alignment

EU Digital Services Act — Articles 15, 16, 17, and 20

The DSA treats visibility restrictions as content moderation actions subject to the full suite of transparency, notification, and redress obligations. Article 15 requires providers to publish annual transparency reports including the number of content moderation actions taken, broken down by type — visibility restrictions are explicitly within scope. Article 16 requires that users affected by content moderation decisions receive a notification that is clear, specific, and includes the reasons for the decision. Article 17 requires a "statement of reasons" for any content moderation action that restricts the visibility of information, explicitly naming "demotion" and "disabling of recommending" as covered actions. Article 20 requires an internal complaint-handling system for content moderation decisions. AG-693 implements the governance framework necessary to satisfy these obligations: the audit trail (Requirement 4.2) supports Article 15 reporting, the notification mechanism (Requirement 4.3) satisfies Article 16, the policy basis and reason recording satisfy Article 17, and the appeal mechanism (Requirement 4.4) implements Article 20. Organisations subject to the DSA that apply visibility restrictions without the governance framework described in AG-693 face fines of up to 6% of global annual turnover.

EU P2B Regulation — Articles 4 and 5

The Platform-to-Business Regulation 2019/1150 applies to online intermediation services and requires transparency in ranking parameters (Article 5) and fair process for restriction of services (Article 4). Visibility restrictions that demote business users in search rankings or recommendation feeds directly engage Article 5. Restrictions that effectively suppress a seller's or creator's content engage Article 4's requirements for prior notice, clear reasons, and opportunity to remedy. AG-693's proportionality assessment (Requirement 4.5), notification (Requirement 4.3), and appeal mechanism (Requirement 4.4) implement the due process requirements of the P2B Regulation. Failure to comply exposes platforms to enforcement actions by national competent authorities and private litigation by affected business users.

EU AI Act — Articles 9 and 14

When visibility restrictions are driven by AI classifiers — as they are in most scaled implementations — the AI Act's requirements for risk management (Article 9) and human oversight (Article 14) apply. AG-693's classifier monitoring requirement (4.7) implements continuous risk management for the classifier component. The proportionality assessment (4.5) and adversarial abuse detection (4.6) implement human oversight triggers where automated decisions are high-impact. For platforms classified as high-risk under Annex III, the full suite of AG-693 requirements maps directly to AI Act obligations.

ECHR Article 10 — Freedom of Expression

While the European Convention on Human Rights applies to state action, the European Court of Human Rights has increasingly addressed the positive obligation of states to ensure that private platforms do not disproportionately restrict freedom of expression. Visibility restrictions that target political speech, journalistic content, or public-interest discourse without proportionality assessment and due process create exposure under ECHR jurisprudence, particularly when the platform operates as a de facto public forum. AG-693's proportionality requirements and notification obligations are designed to demonstrate that visibility restrictions are proportionate, necessary, and subject to independent review — the standard tests under ECHR Article 10(2).

10. Failure Severity

Field | Value
Severity Rating | Critical
Blast Radius | Platform-wide — affects every user and content item subject to visibility restriction, with disproportionate impact on vulnerable populations, content creators, and marketplace sellers

Consequence chain: Ungoverned visibility restrictions create a cascade of failures that compound over time. The immediate failure mode is invisible enforcement at scale — millions of content moderation actions applied without logging, notification, or appeal, creating an accountability void. The first-order consequence is user harm: legitimate users experience unexplained audience collapse, revenue decline, or social isolation, with no mechanism to understand or challenge the restriction. Vulnerable populations — journalists, activists, minority communities — are disproportionately affected because adversarial actors exploit ungoverned automated triggers to silence them. The second-order consequence is regulatory exposure: the DSA explicitly classifies visibility restrictions as content moderation actions requiring transparency, notification, and appeal. Platforms that exempt visibility restrictions from their governance framework face fines of up to 6% of global annual turnover. The P2B Regulation adds exposure for marketplace platforms, and emerging platform regulation in multiple jurisdictions is converging on the same requirements. The third-order consequence is systemic trust erosion: when users discover that a platform has been covertly restricting their visibility — as they inevitably will through investigative journalism, academic research, or litigation discovery — the revelation destroys trust not only in the specific platform but in algorithmic content moderation generally. The fourth-order consequence is regulatory overcorrection: high-profile shadowban scandals have historically triggered legislative proposals that constrain all algorithmic content moderation, including beneficial safety interventions, because the opacity of ungoverned visibility restrictions undermines public confidence in platform governance overall.

Cross-references: AG-001 (Operational Boundary Enforcement) defines the operational boundaries within which visibility restriction authority is granted. AG-007 (Governance Configuration Control) governs the configuration artefacts that define restriction thresholds and classifier parameters. AG-019 (Human Escalation & Override Triggers) provides the escalation framework for cases where automated visibility restrictions require human review. AG-022 (Behavioural Drift Detection) supports detection of drift in classifiers that feed visibility restriction decisions. AG-029 (Data Classification Enforcement) governs the classification of user data processed during visibility restriction decisions. AG-033 (Consent Lifecycle Governance) addresses user consent in the context of algorithmic content distribution. AG-055 (Audit Trail Immutability & Completeness) provides the infrastructure for the immutable enforcement action audit trail required by Requirement 4.2. AG-068 (Intellectual Property Boundary Governance) intersects where visibility restrictions affect IP-protected content. AG-210 (Multi-Jurisdictional Regulatory Mapping) supports the cross-border regulatory compliance required for platforms operating across jurisdictions with different transparency and free expression requirements. AG-692 (Content Enforcement Consistency Governance) provides the consistency framework that visibility restriction governance must integrate with. AG-696 (Appeal and Reinstatement Governance) provides the broader appeal framework within which visibility restriction appeals operate.

Cite this protocol
AgentGoverning. (2026). AG-693: Shadowban and Visibility Restriction Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-693