High-Risk Technical Documentation Governance requires that providers of AI agent systems classified as high-risk maintain comprehensive, accurate, and current technical documentation that enables regulators, auditors, and deployers to understand the system's design, development process, capabilities, limitations, and conformity with applicable requirements. The documentation must cover the system's intended purpose, architecture, training methodology, data governance, performance metrics, risk management measures, and post-market monitoring arrangements. This dimension governs the documentation itself — ensuring it exists, is complete, is accurate relative to the deployed system, is maintained in sync with system changes, and is producible to authorities upon request. Technical documentation is the primary mechanism through which a provider demonstrates that its AI agent system was designed and developed in accordance with regulatory requirements.
Scenario A — Documentation Does Not Match Deployed System: A provider develops an AI agent for automated insurance claims assessment and produces technical documentation describing the system's architecture, training data, and performance characteristics. Over 14 months, the development team retrains the model 6 times, modifies the feature pipeline to include 3 new data sources, and adjusts the decision threshold from 0.72 to 0.58 to increase throughput. None of these changes are reflected in the technical documentation. When the national supervisory authority requests the documentation during a conformity assessment, the documented system description bears little resemblance to the system in production. The authority finds that the documented performance metrics (computed on the original model) overstate the current system's accuracy by 11 percentage points and omit the 3 new data sources entirely.
What went wrong: Technical documentation was treated as a point-in-time artefact rather than a living document maintained in sync with the system. No process linked system changes to documentation updates. The provider could not demonstrate conformity of the actual deployed system because the documentation described a different system. Consequence: Conformity assessment failure, requirement to withdraw the system from the market until documentation is corrected and reassessed, 8-month remediation period, loss of the insurance client relationship worth £2.1 million annually, and a supervisory finding that damages the provider's market reputation.
Scenario B — Incomplete Documentation Prevents Deployer Due Diligence: A deployer in the healthcare sector selects an AI agent for patient risk stratification. The deployer requests technical documentation to conduct its own risk assessment and integration planning. The provider supplies a 12-page marketing document describing the system's capabilities in general terms. The documentation does not include: the training data composition, the model architecture, the performance metrics broken down by relevant subpopulations (age, sex, ethnicity, comorbidity profile), the known limitations, or the conditions under which the system's performance degrades. The deployer proceeds with deployment based on the available information. Six months later, the system's risk scores are found to systematically underestimate risk for patients over 75 years of age — a limitation that would have been apparent from subpopulation performance metrics, had they been documented.
What went wrong: The provider's documentation was a marketing document, not technical documentation. It lacked the detail required for a deployer to conduct informed risk assessment or for a regulator to evaluate conformity. The deployer could not perform adequate due diligence because the necessary information was not available. Consequence: Patient safety incidents in the over-75 cohort, clinical negligence investigation, shared liability between provider (inadequate documentation) and deployer (inadequate due diligence with available information), CQC enforcement action, and an NHS Digital review recommending against further procurement from the provider.
Scenario C — Documentation Produced Retrospectively Under Regulatory Pressure: A provider receives a request from a market surveillance authority for technical documentation of its AI agent deployed in a regulated sector. The provider does not have technical documentation — the system was developed iteratively without structured documentation. Under time pressure, the provider assembles documentation retrospectively by interviewing developers, extracting configuration files, and running performance tests on the current system. The resulting documentation contains inaccuracies because developers' recollections of design decisions made 2 years ago are imprecise, and the retrospectively computed performance metrics do not reflect the system's performance at the time of key design decisions.
What went wrong: Technical documentation was not maintained as part of the development process. Retrospective documentation is inherently less accurate than contemporaneous documentation because it relies on memory rather than records. The provider's inability to produce documentation promptly also raises questions about the adequacy of its quality management system (AG-052). Consequence: Regulatory finding for non-compliance with Article 11 of the EU AI Act, fine of EUR 7.5 million, requirement to produce compliant documentation within 90 days or withdraw the system, and ongoing enhanced supervisory scrutiny.
Scope: This dimension applies to all providers of AI agent systems that are classified as high-risk under applicable regulation or that are deployed in contexts where technical documentation is required by contract, sector regulation, or organisational policy. Even where regulatory classification as high-risk does not apply, providers deploying agents in financial services, healthcare, critical infrastructure, public administration, law enforcement, or employment contexts should treat this dimension as applicable. The documentation requirements extend to the full system, including all components that materially affect the system's behaviour: the AI model(s), training and evaluation data, pre-processing pipelines, post-processing logic, integration architecture, and monitoring arrangements. Where the system incorporates third-party components, the documentation must describe those components to the extent necessary for a reader to understand the overall system's behaviour, capabilities, and limitations.
4.1. A conforming provider MUST produce and maintain technical documentation for each AI agent system that covers: the system's intended purpose and intended use conditions, the system architecture and design rationale, the training methodology and data governance, the evaluation methodology and performance metrics, the risk management measures, the known limitations and conditions under which performance degrades, and the post-market monitoring arrangements.
4.2. A conforming provider MUST ensure that the technical documentation is prepared before the AI agent system is placed on the market or put into service, and that it is kept up to date throughout the system's lifecycle.
4.3. A conforming provider MUST ensure that the technical documentation accurately reflects the system as deployed — not an earlier version, a planned version, or an idealised version.
4.4. A conforming provider MUST link each version of the technical documentation to the specific system version it describes, so that any deployed version of the system can be matched to its corresponding documentation.
4.5. A conforming provider MUST ensure that updates to the AI agent system trigger a review of the technical documentation and, where the update materially affects documented characteristics, an update to the documentation before the updated system is deployed.
4.6. A conforming provider MUST include in the technical documentation performance metrics disaggregated by relevant subpopulations and operating conditions, including identification of conditions under which performance degrades below acceptable thresholds.
4.7. A conforming provider MUST structure the technical documentation to be comprehensible to a technically competent reader who is not a specialist in the specific AI technique used — the documentation should enable a regulator or auditor to understand the system without requiring access to the development team.
4.8. A conforming provider MUST include in the technical documentation a description of the hardware and software environment required for the system to operate as documented, including computational requirements, dependency versions, and infrastructure assumptions.
4.9. A conforming provider SHOULD produce the technical documentation in a structured, machine-readable format that supports automated comparison between documentation versions and between documentation and deployed system configuration.
4.10. A conforming provider MAY adopt the documentation structure specified in Annex IV of the EU AI Act as a baseline, extending it where additional detail is required for the specific system.
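Requirements 4.3, 4.4, and 4.9 can be illustrated with a minimal sketch. The manifest fields, values, and function names below are invented for the example and are not drawn from any regulation or standard:

```python
import hashlib
import json

# Hypothetical machine-readable documentation manifest (requirements 4.4 and 4.9).
# All field names and values are illustrative.
manifest = {
    "doc_version": "3.2.0",
    "system_version": "2024.06.1",   # requirement 4.4: explicit version linkage
    "decision_threshold": 0.58,
    "data_sources": ["claims_core", "payments", "fraud_flags"],
}

def manifest_digest(m: dict) -> str:
    """Stable digest so two documentation versions can be diffed byte-for-byte."""
    return hashlib.sha256(json.dumps(m, sort_keys=True).encode()).hexdigest()

def documentation_mismatches(m: dict, deployed: dict) -> list:
    """Documented fields that disagree with the deployed configuration (4.3)."""
    return [k for k in m if k in deployed and m[k] != deployed[k]]
```

A release pipeline could run `documentation_mismatches` against the configuration actually running in production, so that any drift between the documented and deployed system surfaces automatically rather than during a conformity assessment.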
Technical documentation is the foundational artefact through which a provider demonstrates that its AI agent system is designed, developed, and maintained in accordance with applicable requirements. Without comprehensive, accurate, and current documentation, no external party — regulator, auditor, deployer, or affected person — can independently evaluate whether the system meets its obligations.
The critical governance challenge is not producing documentation once but maintaining it in sync with the system throughout its lifecycle. AI agent systems change frequently: models are retrained, data sources are added or modified, decision thresholds are adjusted, and integration architectures evolve. Each change potentially invalidates some aspect of the existing documentation. Without a governance process that links system changes to documentation reviews, the documentation inevitably drifts from the deployed reality.
This drift is not a theoretical concern. In traditional software, documentation drift is a common and often tolerated problem. For high-risk AI systems, it is an unacceptable risk because the documentation is the basis for conformity assessment, deployer risk assessment, and regulatory oversight. A regulator who reviews documentation that does not match the deployed system cannot make a valid determination of conformity. A deployer who relies on inaccurate documentation cannot make informed decisions about deployment conditions, monitoring requirements, or risk mitigation.
The subpopulation performance disaggregation requirement reflects a distinctive characteristic of AI systems: aggregate performance metrics can mask significant performance disparities across subpopulations. A system with 95% overall accuracy may have 99% accuracy for one demographic group and 82% for another. Without disaggregated metrics, deployers and regulators cannot identify these disparities and affected persons cannot understand why the system performs differently for them. This requirement ensures transparency about differential performance — not as an accusation of bias but as a factual reporting obligation.
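The disaggregation point can be sketched as follows; the group labels, evaluation records, and the acceptability floor are invented for illustration:

```python
from collections import defaultdict

# Hypothetical evaluation records: (subpopulation, predicted, actual).
records = [
    ("under_75", 1, 1), ("under_75", 0, 0), ("under_75", 1, 1),
    ("under_75", 0, 0), ("under_75", 1, 1), ("under_75", 0, 0),
    ("over_75", 0, 1), ("over_75", 0, 1), ("over_75", 1, 1), ("over_75", 0, 0),
]

def disaggregated_accuracy(rows):
    """Per-subpopulation accuracy alongside the aggregate figure."""
    hits, totals = defaultdict(int), defaultdict(int)
    for group, predicted, actual in rows:
        totals[group] += 1
        hits[group] += int(predicted == actual)
    per_group = {g: hits[g] / totals[g] for g in totals}
    overall = sum(hits.values()) / sum(totals.values())
    return overall, per_group

overall, per_group = disaggregated_accuracy(records)
ACCEPTABLE_FLOOR = 0.80  # illustrative threshold, not a regulatory value
degraded = sorted(g for g, acc in per_group.items() if acc < ACCEPTABLE_FLOOR)
# The aggregate (0.80) clears the floor while the over_75 group sits at 0.50.
```

On this toy data, an aggregate-only report would show 80% accuracy and pass the floor, while the disaggregated view flags `over_75` as a degraded subpopulation, which is exactly the failure mode in Scenario B.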
Technical documentation governance should be embedded in the development and release process — not treated as a separate documentation exercise. The most effective approach is to generate documentation artefacts as outputs of the development process itself, ensuring that documentation is contemporaneous and accurate by construction.
Recommended patterns:
Anti-patterns to avoid:
Financial Services. Technical documentation for AI agents in financial services should align with model risk management documentation requirements (e.g., SS1/23 model documentation expectations). This includes: model development rationale, data quality assessment, validation methodology and results, ongoing monitoring arrangements, and model limitations. Regulators expect documentation to be sufficient for an independent model validator to assess the model's fitness for purpose.
Healthcare. For AI agents classified as medical devices, technical documentation must meet the requirements of the applicable medical device regulation (e.g., MDR Annex II in the EU, FDA premarket submission requirements in the US). These requirements are highly prescriptive and typically require clinical evaluation data, usability testing results, and detailed risk management documentation per ISO 14971.
Critical Infrastructure. Technical documentation for agents in critical infrastructure must include safety analysis, failure mode identification, and documentation of safety-critical design decisions. Alignment with IEC 62443 documentation requirements for industrial cybersecurity is recommended.
Basic Implementation — The provider produces technical documentation for each AI agent system covering the required topics. Documentation is produced as a standalone document, updated periodically (e.g., annually or on major releases). Documentation and system versions are not tightly linked — there may be periods when the documentation does not fully reflect the deployed system. Documentation is written manually based on developer knowledge.
Intermediate Implementation — Technical documentation is maintained in version control alongside the system code. Documentation versions are linked to system versions through shared identifiers. Documentation review is a defined stage gate in the release process. Quantitative elements (performance metrics, dependency lists) are extracted automatically from the system. Documentation is structured in layers allowing targeted updates. A defined process ensures that system changes trigger documentation review.
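The release stage gate described above can be sketched as a check that blocks deployment when a documented characteristic changed without a corresponding documentation update. The field names and the gate's interface are assumptions for the example:

```python
# Illustrative set of fields whose change "materially affects documented
# characteristics" in the sense of requirement 4.5. The choice is hypothetical.
MATERIAL_FIELDS = {"decision_threshold", "data_sources", "model_id"}

def release_gate(prev_config: dict, next_config: dict, doc_updated: bool):
    """Allow release only if no material field changed, or the docs were updated.

    Returns (allowed, changed_fields) so the pipeline can report what drifted.
    """
    changed = {k for k in MATERIAL_FIELDS
               if prev_config.get(k) != next_config.get(k)}
    return (not changed) or doc_updated, changed

ok, changed = release_gate(
    {"decision_threshold": 0.72, "data_sources": ["claims_core"]},
    {"decision_threshold": 0.58, "data_sources": ["claims_core", "payments"]},
    doc_updated=False,
)
# ok is False: the threshold and data sources changed but the documentation
# did not, so the release is held back (mirroring Scenario A).
```

Wiring such a check into CI makes requirement 4.5 enforceable by construction rather than dependent on reviewers remembering to check the documentation.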
Advanced Implementation — All intermediate capabilities plus: documentation is substantially auto-generated from the system, with human-authored sections limited to rationale, interpretation, and contextual information. Automated validation checks ensure consistency between documentation claims and system configuration. Documentation is produced in a machine-readable format supporting regulatory reporting and automated conformity checking. The provider can produce documentation for any historical system version within hours, enabling retrospective analysis. Documentation completeness and accuracy metrics are tracked as QMS quality indicators per AG-052.
Required artefacts:
Retention requirements:
Access requirements:
Test 8.1: Documentation Existence and Completeness
Test 8.2: Documentation-System Correspondence
Test 8.3: Documentation Update Trigger Enforcement
Test 8.4: Subpopulation Performance Disaggregation
Test 8.5: Version Linkage Integrity
Test 8.6: Regulatory Producibility
| Regulation | Provision | Relationship Type |
|---|---|---|
| EU AI Act | Article 11 (Technical Documentation) | Direct requirement |
| EU AI Act | Annex IV (Technical Documentation Content) | Direct requirement |
| EU AI Act | Article 43 (Conformity Assessment) | Supports compliance |
| MDR (EU) 2017/745 | Annex II (Technical Documentation) | Direct requirement (medical devices) |
| FDA 21 CFR Part 820 | Design History File (§820.30) | Supports compliance (medical devices) |
| ISO 42001 | Clause 7.5 (Documented Information) | Supports compliance |
| NIST AI RMF | GOVERN 1.2, MAP 1.1, MAP 1.5 | Supports compliance |
| PRA SS1/23 | Model Documentation Expectations | Supports compliance |
Article 11 is the primary regulatory driver for AG-053. It requires that technical documentation of a high-risk AI system be drawn up before that system is placed on the market or put into service and be kept up to date. The documentation must be drawn up in such a way as to demonstrate that the system complies with the requirements of the regulation and provide national competent authorities and notified bodies with all the necessary information to assess compliance. Article 11 explicitly requires that the documentation be kept up to date — not merely produced at the time of initial market placement. AG-053 implements the governance framework that ensures ongoing maintenance and accuracy.
Annex IV specifies the content requirements for technical documentation in detail: general description of the AI system, detailed description of the elements and development process, monitoring and testing information, and information on the risk management system. AG-053's content requirements align with Annex IV while extending to AI agent-specific concerns such as action scope documentation, multi-agent interaction documentation, and deployment instruction linkage per AG-054.
Article 43 establishes the conformity assessment procedures for high-risk AI systems. Technical documentation is the primary input to conformity assessment — without complete and accurate documentation, conformity assessment cannot be conducted. AG-053 ensures that documentation is available and adequate for this purpose.
For AI agents classified as medical devices, the Medical Device Regulation specifies its own technical documentation requirements in Annex II. These requirements are more prescriptive than the EU AI Act requirements and include: device description and specification, manufacturing information, design verification and validation, clinical evaluation, and post-market surveillance plan. Providers of medical AI agents must satisfy both MDR Annex II and EU AI Act Annex IV — AG-053 supports this by establishing a documentation governance framework that can accommodate both sets of requirements.
The PRA's supervisory statement SS1/23 on model risk management sets expectations for model documentation in financial services. These include documentation of model purpose, methodology, assumptions, limitations, data, performance, and validation results. For AI agents in financial services, AG-053's documentation requirements support compliance with these expectations by ensuring documentation is comprehensive, current, and producible.
GOVERN 1.2 addresses the documentation of AI system characteristics. MAP 1.1 addresses the intended purpose and context of the AI system. MAP 1.5 addresses the documentation of the AI system's impact on individuals and groups. AG-053 supports these by ensuring that documentation covers intended purpose, design decisions, performance characteristics, and impact considerations.
| Field | Value |
|---|---|
| Severity Rating | High |
| Blast Radius | Regulatory and market access — affecting the provider's ability to place systems on the market and deployers' ability to conduct due diligence |
Consequence chain: Without governed technical documentation, the provider cannot demonstrate conformity of its AI agent system with applicable requirements. The immediate consequence is inability to pass conformity assessment — under the EU AI Act, this means the system cannot be legally placed on the market or put into service. The deployer consequence is inability to conduct adequate due diligence — deployers cannot assess whether the system is appropriate for their use case, what risks it presents, or what monitoring is required. The operational consequence is documentation drift: as the system evolves, undocumented changes accumulate until the documentation describes a substantially different system from the one deployed. This creates both regulatory risk (the documented system passed conformity assessment, but the deployed system is different) and safety risk (deployers are operating based on inaccurate information about the system's capabilities and limitations). The financial consequence is significant: non-compliance with Article 11 of the EU AI Act can result in fines up to EUR 15 million or 3% of worldwide annual turnover, and the inability to place systems on the market directly impacts revenue. The remediation cost for retrospective documentation is typically 3-5 times the cost of contemporaneous documentation due to the difficulty of reconstructing design rationale and historical performance data.
Cross-reference note: Technical documentation should incorporate model provenance information per AG-048. Documentation of risk management measures should reference rights impact assessments per AG-051. Documentation governance should be integrated into the QMS per AG-052. Technical documentation provides the basis for deployer instructions per AG-054. Documentation versioning should follow configuration control processes per AG-007.