ESTIMATED SCORE — NOT VERIFIED
This assessment is based solely on publicly available documentation, marketing materials, and product announcements. Microsoft Copilot Studio has not submitted to AgentGoverning for independent adversarial verification. This score is an estimate only and may not reflect actual platform capabilities. Microsoft Copilot Studio is invited to submit for formal assessment at framework@agentgoverning.com. Estimated scores carry no certification status.
How This Score Is Calculated

This estimated score is calculated using a 0–3 scoring scale across 50 sample dimensions from the full AGS v2.2 standard:
0 = Structurally absent from platform architecture
1 = Partially evidenced in public documentation
2 = Fully evidenced in public documentation
3 = Verified by independent adversarial testing (requires submission)

Score = (sum of points awarded) ÷ (50 × 3) × 100
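The formula above can be sketched in a few lines of Python. This is a minimal illustration only, not AgentGoverning's actual tooling: the function name `estimated_score` and the point distribution in the example are hypothetical and do not correspond to any assessed platform.

```python
from typing import Sequence

MAX_POINTS_PER_DIMENSION = 3  # top of the 0-3 scale described above


def estimated_score(points: Sequence[int]) -> float:
    """Return the percentage score for a list of per-dimension points.

    Implements: sum of points / (number of dimensions * 3) * 100.
    """
    if not points:
        raise ValueError("at least one dimension is required")
    for p in points:
        if not 0 <= p <= MAX_POINTS_PER_DIMENSION:
            raise ValueError(f"points must be between 0 and 3, got {p}")
    return sum(points) / (len(points) * MAX_POINTS_PER_DIMENSION) * 100


# Hypothetical distribution over 50 dimensions (assumed for illustration):
# 10 dimensions at 2 points, 15 at 1 point, 25 at 0 points.
hypothetical_points = [2] * 10 + [1] * 15 + [0] * 25
print(round(estimated_score(hypothetical_points), 1))  # 23.3
```

Because 3 points require a formal submission, an estimated assessment can award at most 2 points per dimension, so under this formula an unverified platform cannot exceed roughly 66.7.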

Based solely on publicly available documentation as of April 2026.

26 / 100 ESTIMATED

Microsoft Copilot Studio

26% estimated AGS compliance
Assessment: April 2026 · AGS v2.2 · Estimated (not independently verified)
21 Evidenced · 16 Not Documented · 13 Structurally Absent (estimated)
Executive Summary
Microsoft Copilot Studio receives an estimated AGS v2.2 compliance score of 26%. The platform shows solid foundational governance in operational boundary enforcement, human oversight mechanisms, and behavioural consistency controls, each scored 2 / 3. Microsoft's enterprise infrastructure provides a strong base for identity management and audit capabilities. However, significant gaps remain in multi-agent collusion detection, emergent capability monitoring, and temporal attack defence, each scored 0 / 3. The platform's primary governance orientation is toward enterprise workflow automation rather than adversarial AI agent governance, leaving advanced dimension groups such as Adversarial, Boundary, and Emergence largely unaddressed.
A: Mandate · 40%
B: Integrity · 32%
C: Identity · 28%
D: Accountability · 36%
E: Compliance · 32%
F: Adversarial · 8%
G: Boundary · 8%
H: Alignment · 16%
I: Emergence · 0%
J: Infrastructure · 32%
Key Strengths
AG-01
Operational Boundary Enforcement
Copilot Studio enforces clear operational boundaries through topic-level controls, DLP policies, and connector restrictions that limit agent scope.
Score: 2 / 3
AG-19
Human Oversight Architecture
Strong human-in-the-loop capabilities with approval workflows, escalation triggers, and configurable handoff to live agents.
Score: 2 / 3
AG-22
Behavioural Consistency Verification
Topic-based conversation design ensures consistent agent behaviour across sessions with deterministic flow control.
Score: 2 / 3
The following gap analysis is based on publicly available documentation only. These are estimated structural gaps, not verified findings. Microsoft Copilot Studio may have implemented controls not visible in public documentation.
Critical Gaps
AG-28
Collusion Detection
No multi-agent collusion detection framework. No mechanisms to identify coordinated adversarial behaviour across agent instances are evidenced in public documentation.
Score: 0 / 3 — Structurally Absent
AG-41
Emergent Capability Detection
No emergence monitoring. Tracking or flagging of unexpected capability development in deployed agents is not evidenced in public documentation.
Score: 0 / 3 — Structurally Absent
AG-44
Long-Horizon Attack Detection
No temporal attack detection. Identification of slow-moving adversarial strategies executed over extended timeframes is not documented.
Score: 0 / 3 — Structurally Absent
Recommendations
  1. Implement cross-domain pattern recognition (AG-02) to detect combined action sequences across Power Platform connectors and Copilot agent workflows.
  2. Add delegated authority governance (AG-09) for Power Automate agent chains, ensuring permission inheritance is tracked and auditable.
  3. Build a governance layer for transaction structuring detection (AG-25), critical for enterprise deployments handling financial operations.
  4. Develop an adversarial testing programme for AG-05 verification, incorporating red-team exercises against deployed Copilot agents.
  5. Submit for independent AGS verification to replace estimated scores with certified compliance ratings.
Full Dimension Assessment
Dimension | Name | Category | Score

A: Mandate & Action Governance (AG-01 – AG-05)
AG-01 | Operational Boundary Enforcement | Evidenced | 2
AG-02 | Cross-Domain Activity Governance | Not Documented | 0
AG-03 | Adversarial Coordination Detection | Not Documented | 0
AG-04 | Mandate Scope Control | Evidenced | 1
AG-05 | Action Authorisation Verification | Evidenced | 1

B: Integrity & Configuration Governance (AG-06 – AG-10)
AG-06 | Record Integrity Verification | Evidenced | 1
AG-07 | Governance Configuration Control | Evidenced | 1
AG-08 | Deployment Integrity Verification | Evidenced | 1
AG-09 | Delegated Authority Governance | Not Documented | 0
AG-10 | Configuration Drift Detection | Not Documented | 0

C: Identity & Access Governance (AG-11 – AG-15)
AG-11 | Agent Identity Verification | Not Documented | 0
AG-12 | Credential Lifecycle Management | Evidenced | 1
AG-13 | Privilege Escalation Prevention | Evidenced | 1
AG-14 | Inter-Agent Authentication | Not Documented | 0
AG-15 | Namespace Isolation | Evidenced | 1

D: Accountability & Oversight (AG-16 – AG-20)
AG-16 | Decision Audit Trail | Evidenced | 1
AG-17 | Multi-Party Authorisation | Evidenced | 1
AG-18 | Outcome Attribution | Evidenced | 1
AG-19 | Human Oversight Architecture | Evidenced | 2
AG-20 | Purpose-Bound Operation | Not Documented | 0

E: Compliance & Agent Governance (AG-21 – AG-25)
AG-21 | Regulatory Compliance Verification | Evidenced | 1
AG-22 | Behavioural Consistency Verification | Evidenced | 2
AG-23 | Resource Consumption Governance | Evidenced | 1
AG-24 | Output Validation | Evidenced | 1
AG-25 | Financial Transaction Governance | Structurally Absent | 0

F: Adversarial Defence (AG-26 – AG-30)
AG-26 | Prompt Injection Defence | Not Documented | 0
AG-27 | Governance Override Resistance | Not Documented | 0
AG-28 | Collusion Detection | Structurally Absent | 0
AG-29 | Data Poisoning Defence | Not Documented | 0
AG-30 | Social Engineering Resistance | Not Documented | 0

G: Boundary & Scope Governance (AG-31 – AG-35)
AG-31 | Capability Boundary Enforcement | Not Documented | 0
AG-32 | Scope Creep Detection | Not Documented | 0
AG-33 | Environmental Boundary Control | Not Documented | 0
AG-34 | Cross-System Propagation Control | Structurally Absent | 0
AG-35 | Autonomy Level Governance | Structurally Absent | 0

H: Alignment & Reasoning Governance (AG-36 – AG-40)
AG-36 | Value Alignment Verification | Not Documented | 0
AG-37 | Reasoning Transparency | Not Documented | 0
AG-38 | Human Control Responsiveness | Evidenced | 1
AG-39 | Deception Detection | Structurally Absent | 0
AG-40 | Goal Stability Verification | Structurally Absent | 0

I: Emergence & Evolution Governance (AG-41 – AG-45)
AG-41 | Emergent Capability Detection | Structurally Absent | 0
AG-42 | Collective Intelligence Governance | Structurally Absent | 0
AG-43 | Self-Modification Prevention | Structurally Absent | 0
AG-44 | Long-Horizon Attack Detection | Structurally Absent | 0
AG-45 | Evolutionary Pressure Monitoring | Structurally Absent | 0

J: Infrastructure & Operational Governance (AG-46 – AG-50)
AG-46 | Infrastructure Dependency Mapping | Structurally Absent | 0
AG-47 | Cross-Jurisdiction Compliance | Evidenced | 1
AG-48 | Model Provenance Tracking | Evidenced | 1
AG-49 | Operational Continuity | Evidenced | 1
AG-50 | Physical Impact Governance | Structurally Absent | 0

Sources

Sources: Microsoft Copilot Studio documentation, Microsoft Purview documentation, Azure Policy documentation, Power Platform admin centre documentation, Agent 365 product announcements, Microsoft Ignite 2025 sessions. Documentation reviewed April 2026.
Methodology: Scores estimated from publicly available documentation only. No proprietary or non-public information was used. Platforms are invited to submit for independent verification to receive a verified score.