AG-716

Phenotype Prediction Risk Governance

Biotechnology, Genomics & Biosecurity · AGS v2.1 · April 2026
EU AI Act GDPR NIST ISO 42001

2. Summary

Phenotype Prediction Risk Governance requires that AI agents operating on genotype data, biological signals, or multi-omic inputs enforce explicit constraints on the inference of high-risk phenotypic characteristics — including but not limited to disease predisposition, behavioural traits, cognitive attributes, physical appearance reconstruction, ancestry-linked characteristics, and any trait that could enable discrimination, surveillance, or targeting of individuals or populations. Agents capable of correlating genetic variants, epigenomic markers, proteomic profiles, or metabolomic signatures with observable or predicted phenotypes present acute risks of privacy violation, genetic discrimination, eugenics-adjacent profiling, and dual-use misapplication. This dimension mandates that organisations classify phenotype predictions by risk tier, enforce inference boundaries that prevent unauthorised high-risk predictions, maintain consent linkage for all permissible predictions, and log all attempted and completed phenotype inferences with sufficient granularity for audit and incident response.

3. Example

Scenario A — Employer Wellness Platform Infers Psychiatric Predisposition from Genomic Data: A multinational employer deploys an AI-powered wellness platform offering voluntary genetic screening for 12,000 employees. Employees consent to receive dietary and fitness recommendations based on metabolic gene variants. The platform's underlying agent, trained on a broad genotype-phenotype association dataset containing 4.2 million variant-trait associations, is not constrained to the consented scope. Over 7 months, the agent processes 8,400 employee genotype files and, as part of its recommendation logic, internally computes polygenic risk scores for 340 traits — including schizophrenia predisposition (PRS threshold > 2.1 standard deviations for 127 employees), major depressive disorder susceptibility, and substance abuse propensity. Although these scores are not displayed to employees, they are stored in the platform's intermediate computation layer and are accessible via the platform's analytics API. A data breach exposes 3,200 employee records including the intermediate psychiatric PRS scores. 127 employees are identified by name with elevated schizophrenia risk scores. Class-action litigation alleges violations of the Genetic Information Nondiscrimination Act (GINA), EU GDPR Article 9 (special category data processing without lawful basis), and the UK Equality Act 2010.

What went wrong: The agent had no inference boundary enforcement — it computed all computable phenotype predictions regardless of consent scope. No classification system distinguished low-risk phenotype predictions (e.g., caffeine metabolism rate) from high-risk predictions (e.g., psychiatric predisposition). Intermediate computation outputs were not classified as sensitive data and were stored without access controls. Consequence: Exposure of psychiatric genetic risk scores for 127 named employees, estimated litigation liability of £14.5 million, regulatory investigation under GDPR with potential fine of up to 4% of global turnover, and reputational damage causing 2,300 employees to withdraw from all employer health programmes.

Scenario B — Research Agent Reconstructs Facial Morphology from Ancient DNA Sequences: A university research group deploys an AI agent to analyse ancient DNA samples for population genetics research. The agent is configured with a genotype-to-phenotype inference model that includes facial morphology prediction capabilities based on 42 validated SNP associations for facial features. A doctoral researcher queries the agent with a modern reference panel of 1,500 individuals from an indigenous community — data originally collected under a research consent agreement limited to ancestry composition analysis. The agent, lacking scope constraints on its phenotype prediction capabilities, generates predicted facial reconstructions for all 1,500 individuals. The researcher publishes 15 representative facial reconstructions in a preprint, enabling visual identification of community members from a population of approximately 4,000. The indigenous community's tribal council files complaints with the university IRB and the national research ethics authority, alleging violation of the consent agreement, cultural harm, and re-identification of individuals who consented only to anonymised ancestry analysis.

What went wrong: The agent performed phenotype predictions (facial morphology) outside the consented scope (ancestry composition). No consent-linkage mechanism verified that the requested inference type was within the scope of the data use agreement. No risk classification distinguished facial reconstruction (high re-identification risk) from ancestry composition (lower risk). Consequence: IRB suspension of the research programme, national ethics authority investigation, retraction of the preprint, £890,000 in legal and remediation costs, and severed trust relationship with the indigenous community affecting 6 ongoing collaborative studies.

Scenario C — Direct-to-Consumer Genomics Agent Predicts Child Intelligence from Parental Genotypes: A direct-to-consumer genomics company offers pre-conception genetic compatibility reports. The AI agent underlying the service is trained on a dataset that includes educational attainment GWAS summary statistics. A product manager configures the agent to include a "child potential" score in the compatibility report, which internally computes a predicted offspring polygenic score for educational attainment — a proxy widely interpreted as intelligence prediction. The feature is deployed to 45,000 users over 4 months before a bioethics review identifies the output. 23,000 couples have received reports containing the child potential score. Media coverage frames the product as "IQ prediction for designer babies." Regulatory authorities in Germany, France, and the UK open investigations. The company's valuation drops by £120 million in 6 weeks.

What went wrong: No phenotype prediction risk classification existed to flag educational attainment / cognitive trait prediction as a prohibited or restricted inference category. The agent did not enforce boundaries between permitted predictions (e.g., carrier status for monogenic conditions) and prohibited predictions (e.g., cognitive trait estimation from polygenic scores). No human review was required before deploying new phenotype prediction categories. Consequence: 23,000 couples received eugenics-adjacent predictions without informed consent, multi-jurisdiction regulatory investigations, £120 million valuation loss, and permanent reputational association with eugenics.

4. Requirement Statement

Scope: This dimension applies to any AI agent that processes genotype data (whole-genome sequences, exome sequences, SNP arrays, polygenic risk scores), epigenomic data (methylation profiles, chromatin accessibility), proteomic data, metabolomic data, or any biological signal from which phenotypic characteristics can be inferred. The scope includes agents that perform phenotype prediction as a primary function (e.g., clinical decision support for genetic disease risk) and agents that perform phenotype prediction as an intermediate computation step even when the final output does not expose the prediction (e.g., an agent that internally computes disease risk scores to select dietary recommendations). The scope extends to agents that could perform phenotype prediction based on their training data and model capabilities, even if not explicitly configured to do so — capability-based scoping, not intent-based scoping. Organisations that deploy agents with genotype-to-phenotype inference capabilities in any form are within scope, regardless of whether they characterise their service as a genomics service.

4.1. A conforming system MUST maintain a phenotype prediction risk taxonomy that classifies every inferable phenotype into defined risk tiers — at minimum: prohibited (inferences that may never be performed), restricted (inferences requiring explicit per-use authorisation and elevated consent), permitted-with-controls (inferences allowed under standard consent and logging), and unrestricted (inferences posing negligible risk). The taxonomy MUST be reviewed and updated at least every 12 months or when new phenotype-genotype associations of material risk significance are published.
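As a minimal, non-normative sketch, the four tiers in 4.1 could be represented as an ordered enumeration with a lookup that distinguishes "classified" from "not in the taxonomy". The category names and their tier assignments below are hypothetical placeholders chosen for illustration; they are not classifications made by this protocol.

```python
from enum import IntEnum
from typing import Optional


class RiskTier(IntEnum):
    """The four minimum tiers from 4.1, ordered from least to most restrictive."""
    UNRESTRICTED = 0
    PERMITTED_WITH_CONTROLS = 1
    RESTRICTED = 2
    PROHIBITED = 3


# Hypothetical category-to-tier assignments, for illustration only.
TAXONOMY = {
    "caffeine_metabolism": RiskTier.UNRESTRICTED,
    "monogenic_carrier_status": RiskTier.PERMITTED_WITH_CONTROLS,
    "facial_morphology": RiskTier.RESTRICTED,
    "psychiatric_predisposition": RiskTier.PROHIBITED,
    "cognitive_trait": RiskTier.PROHIBITED,
}


def classify(category: str) -> Optional[RiskTier]:
    """Return the tier for a category, or None if the taxonomy has no entry.

    None means 'unclassified'; the caller should treat that as a reason to
    stop and escalate rather than as permission to proceed.
    """
    return TAXONOMY.get(category)
```

Ordering the tiers lets callers compare them directly (for example, to pick the stricter of two classifications), and returning `None` for unknown categories keeps the "cannot classify" case explicit.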

4.2. A conforming system MUST enforce inference boundaries at the agent execution layer that prevent the agent from computing, storing, or transmitting phenotype predictions classified as prohibited under the risk taxonomy, regardless of the query formulation, prompt construction, or intermediate computation pathway.

4.3. A conforming system MUST verify consent scope linkage before performing any restricted or permitted-with-controls phenotype prediction, confirming that the data subject's consent or the applicable data use agreement explicitly authorises the specific category of phenotype inference being requested.

4.4. A conforming system MUST log every phenotype prediction attempt — including blocked attempts — with sufficient detail to reconstruct the inference request: input data identifiers, requested or inferred phenotype category, risk tier classification, consent verification result, and outcome (completed, blocked, or escalated). Logs MUST be immutable and retained for the period specified in Section 7.
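One possible way to make tampering with such a log detectable is to chain entries by hash, so that each record commits to the one before it. The sketch below is an illustration under assumptions, not a prescribed mechanism: the field names mirror the items listed in 4.4, while `PredictionLogEntry`, `append_entry`, `verify_chain`, and `GENESIS` are hypothetical names introduced here.

```python
import hashlib
import json
from dataclasses import asdict, dataclass

GENESIS = "0" * 64  # hypothetical placeholder hash preceding the first entry


@dataclass(frozen=True)
class PredictionLogEntry:
    input_data_id: str
    phenotype_category: str
    risk_tier: str
    consent_verified: bool
    outcome: str        # "completed", "blocked", or "escalated"
    prev_hash: str      # digest of the previous entry (or GENESIS)

    def digest(self) -> str:
        """SHA-256 over a canonical JSON encoding of every field."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def append_entry(log, **fields):
    """Return a new list with one entry appended and linked to the last one."""
    prev = log[-1].digest() if log else GENESIS
    return log + [PredictionLogEntry(prev_hash=prev, **fields)]


def verify_chain(log) -> bool:
    """True if every entry's prev_hash matches the digest of its predecessor."""
    prev = GENESIS
    for entry in log:
        if entry.prev_hash != prev:
            return False
        prev = entry.digest()
    return True
```

A chain like this only detects changes to the final entry if the latest digest is also held somewhere the writer cannot alter, so in practice it would complement, not replace, write-once storage.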

4.5. A conforming system MUST implement human escalation for any phenotype prediction request that falls outside pre-approved categories or that the system cannot classify within the risk taxonomy, routing the request to a qualified reviewer (with domain expertise in genomics, bioethics, or clinical genetics as appropriate) before any inference is performed.

4.6. A conforming system MUST apply re-identification risk assessment to all phenotype prediction outputs, evaluating whether the combination of predicted phenotypes for an individual or group creates a re-identification vector that exceeds the threshold defined in the organisation's privacy risk framework. That threshold MUST NOT exceed a 0.09 probability of re-identification for any individual from predicted phenotypes alone.
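To make the 0.09 ceiling concrete, one simple estimate (an assumption of this sketch, not a method specified by the protocol) treats an individual's re-identification probability as 1 divided by the number of people in the reference population who share the same combination of predicted phenotypes. Under that model, a combination shared by 12 people gives 1/12 ≈ 0.083 (within the ceiling), while one shared by 11 gives 1/11 ≈ 0.091 (over it). The function names are hypothetical.

```python
from collections import Counter

MAX_REID_PROBABILITY = 0.09  # ceiling stated in requirement 4.6


def reidentification_risk(records):
    """Map each subject id to 1 / (size of its equivalence class).

    `records` maps subject id -> tuple of predicted phenotype values.
    Subjects with identical tuples form one equivalence class.
    """
    class_sizes = Counter(records.values())
    return {sid: 1.0 / class_sizes[combo] for sid, combo in records.items()}


def subjects_over_threshold(records, threshold=MAX_REID_PROBABILITY):
    """Sorted ids of subjects whose estimated risk exceeds the threshold."""
    risks = reidentification_risk(records)
    return sorted(sid for sid, risk in risks.items() if risk > threshold)
```

This estimate ignores auxiliary information an attacker might hold and the uncertainty of the predictions themselves, so it is best read as a lower bound on the assessment an organisation's privacy risk framework would require.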

4.7. A conforming system MUST prevent the aggregation of phenotype predictions across individuals in ways that enable population-level profiling by protected characteristics (race, ethnicity, disability status, mental health status) unless such aggregation is explicitly authorised under a research ethics approval with population-level consent.
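A guard of the following shape illustrates the idea in 4.7: grouping predictions by a protected characteristic is refused unless the caller supplies an ethics approval reference. This is a hedged sketch. The function name, the row layout, and the `ethics_approval_id` parameter are hypothetical, and a real system would validate the approval against an authoritative register instead of accepting any non-empty value.

```python
from collections import defaultdict

# Protected characteristics named in requirement 4.7.
PROTECTED_CHARACTERISTICS = {
    "race", "ethnicity", "disability_status", "mental_health_status",
}


def aggregate_predictions(rows, group_by, value_key, ethics_approval_id=None):
    """Return the mean of `value_key` per group.

    Raises PermissionError before touching any row if `group_by` is a
    protected characteristic and no approval reference was supplied.
    """
    if group_by in PROTECTED_CHARACTERISTICS and not ethics_approval_id:
        raise PermissionError(
            f"aggregation by '{group_by}' requires research ethics approval"
        )
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row[group_by]] += row[value_key]
        counts[row[group_by]] += 1
    return {group: sums[group] / counts[group] for group in sums}
```

Checking the grouping key before iterating keeps the control preventive: no protected-characteristic aggregate is ever materialised for an unauthorised request.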

4.8. A conforming system SHOULD implement capability-aware boundary enforcement that evaluates the agent's model architecture and training data to identify latent phenotype prediction capabilities — traits the agent could infer even if not explicitly configured to do so — and extends inference boundary enforcement to those latent capabilities.

4.9. A conforming system SHOULD maintain a registry of all phenotype-genotype association datasets used in agent training or inference, including provenance, population representation, effect sizes, and known limitations, enabling assessment of prediction validity and bias.

4.10. A conforming system MAY implement differential privacy mechanisms on phenotype prediction outputs to mathematically bound the information leakage about any individual's genotype from the prediction results, particularly when predictions are shared with third parties or used in aggregate reporting.
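For aggregate reporting, one standard differential privacy technique is the Laplace mechanism: a counting query changes by at most 1 when one individual is added or removed, so adding Laplace noise with scale 1/ε yields an ε-differentially-private count. The sketch below assumes that setting; `dp_count` and `laplace_noise` are hypothetical names, and the choice of ε is left to the organisation.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(true_count, epsilon, rng=None):
    """Release a count with epsilon-differential privacy.

    A counting query has sensitivity 1, so the noise scale is 1 / epsilon.
    Smaller epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    return true_count + laplace_noise(1.0 / epsilon, rng)
```

Naive floating-point noise sampling like this is fine for illustration but has known weaknesses, so a production deployment would normally rely on a vetted differential privacy library and track the cumulative privacy budget across releases.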

5. Rationale

The ability to predict phenotypic characteristics from genotype data is advancing rapidly. Genome-wide association studies have identified statistically significant associations for over 5,000 traits, and polygenic risk scores can now estimate predisposition for complex traits including psychiatric conditions, cognitive attributes, and physical appearance with increasing accuracy. AI agents trained on these association datasets inherit the capability to perform these predictions — often without explicit configuration, because the associations are embedded in the training data.

This creates a distinct threat model with four primary vectors. First, scope creep: an agent authorised to perform one category of phenotype prediction (e.g., pharmacogenomic drug response) may silently perform other categories (e.g., psychiatric predisposition) as intermediate computations, because the same underlying model encodes both capabilities. Second, consent violation: data subjects who consent to genetic analysis for one purpose (e.g., ancestry) do not consent to all possible inferences from their data (e.g., disease risk, behavioural traits, facial reconstruction). Without inference boundary enforcement, the agent's capabilities exceed the consent scope. Third, discrimination enablement: phenotype predictions relating to protected characteristics — race, disability, mental health, cognitive ability — can be used for discriminatory purposes even when the predictions are probabilistic rather than deterministic. Polygenic risk scores for educational attainment, for example, have been shown to correlate with socioeconomic status and race, creating a molecular proxy for characteristics that anti-discrimination law prohibits using in decision-making. Fourth, re-identification: a sufficient combination of predicted phenotypes (hair colour, eye colour, height, facial features, skin pigmentation) can re-identify individuals in datasets that were ostensibly de-identified at the genotype level.

The preventive control type is essential because phenotype predictions, once computed and stored, create irreversible data protection harms. A psychiatric risk score, once associated with a named individual, cannot be withdrawn from the knowledge of anyone who has seen it. Detective controls that identify violations after the fact are insufficient because the harm occurs at the moment of inference. The boundary must be enforced before computation, not detected after storage.

Cross-jurisdictional complexity amplifies the risk. The EU classifies genetic data as special category data under GDPR Article 9, requiring explicit consent for processing. The US GINA prohibits use of genetic information in employment and health insurance but does not regulate direct-to-consumer genomics equivalently. China's Biosecurity Law restricts cross-border transfer of human genetic resources. An agent operating across jurisdictions must enforce the most restrictive applicable phenotype prediction constraints, which requires both a risk taxonomy and a jurisdictional mapping — connecting this dimension to AG-210 (Multi-Jurisdictional Regulatory Mapping).
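The "most restrictive applicable constraint" rule can be sketched as taking the maximum tier across every jurisdiction that applies to a request. The jurisdiction keys and tier assignments below are hypothetical placeholders, not statements about what any real legal regime permits. Treating an unmapped jurisdiction or category as prohibited is a fail-closed assumption of this sketch.

```python
from enum import IntEnum


class RiskTier(IntEnum):
    UNRESTRICTED = 0
    PERMITTED_WITH_CONTROLS = 1
    RESTRICTED = 2
    PROHIBITED = 3


# Hypothetical per-jurisdiction classifications, for illustration only.
JURISDICTION_TIERS = {
    "jurisdiction_a": {"ancestry_composition": RiskTier.PERMITTED_WITH_CONTROLS},
    "jurisdiction_b": {"ancestry_composition": RiskTier.RESTRICTED},
}


def effective_tier(category, jurisdictions):
    """Return the strictest tier for `category` across all given jurisdictions.

    Any jurisdiction or category missing from the mapping is treated as
    PROHIBITED, as is an empty jurisdiction list (fail closed).
    """
    tiers = [
        JURISDICTION_TIERS.get(j, {}).get(category, RiskTier.PROHIBITED)
        for j in jurisdictions
    ]
    return max(tiers, default=RiskTier.PROHIBITED)
```

For example, a request touching both placeholder jurisdictions resolves to `RESTRICTED`, because that is the stricter of the two mapped tiers.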

6. Implementation Guidance

Phenotype prediction risk governance should be implemented as a layered enforcement system: a risk taxonomy layer that classifies predictions, a boundary enforcement layer that prevents prohibited inferences, a consent verification layer that validates authorisation for permitted inferences, and a logging layer that records all activity for audit.
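The four layers can be pictured as a single gate that runs before any model call. The sketch below is illustrative only: `InferenceGate`, its fields, and the `predict` callable are hypothetical. Blocking (as opposed to escalating) a restricted request that lacks per-use authorisation is an assumption of this sketch. The key property it demonstrates is ordering: classification and consent checks happen first, the prediction function is invoked only on an allowed path, and every attempt is logged whatever the outcome.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    PROHIBITED = "prohibited"
    RESTRICTED = "restricted"
    PERMITTED_WITH_CONTROLS = "permitted-with-controls"
    UNRESTRICTED = "unrestricted"


class Outcome(Enum):
    COMPLETED = "completed"
    BLOCKED = "blocked"
    ESCALATED = "escalated"


@dataclass
class InferenceGate:
    taxonomy: dict                      # category -> Tier
    consents: dict                      # subject id -> set of consented categories
    audit_log: list = field(default_factory=list)

    def request(self, subject_id, category, predict, per_use_authorised=False):
        tier = self.taxonomy.get(category)           # taxonomy layer
        consent_ok = None                            # None = not evaluated
        if tier is None:
            outcome = Outcome.ESCALATED              # unclassified: human review
        elif tier is Tier.PROHIBITED:
            outcome = Outcome.BLOCKED                # boundary layer
        else:
            if tier is not Tier.UNRESTRICTED:        # consent layer
                consent_ok = category in self.consents.get(subject_id, set())
            if consent_ok is False:
                outcome = Outcome.BLOCKED
            elif tier is Tier.RESTRICTED and not per_use_authorised:
                outcome = Outcome.BLOCKED
            else:
                outcome = Outcome.COMPLETED
        # The model is only ever called on the allowed path.
        result = predict() if outcome is Outcome.COMPLETED else None
        self.audit_log.append({                      # logging layer
            "subject_id": subject_id,
            "category": category,
            "tier": tier.value if tier else None,
            "consent_verified": consent_ok,
            "outcome": outcome.value,
        })
        return outcome, result
```

Applied to Scenario A, a gate configured with consent for dietary recommendations only would log and block a psychiatric predisposition request without ever invoking the model, so no intermediate score would exist to be stored or breached.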

Recommended patterns:

Anti-patterns to avoid:

Industry Considerations

Maturity Model

Basic Implementation — The organisation has documented a phenotype prediction risk taxonomy classifying at least the most sensitive phenotype categories (psychiatric, cognitive, facial appearance, ancestry-linked traits) as prohibited or restricted. Inference boundaries are enforced at the API output layer. Consent verification is performed at the data-type level. Prediction attempts are logged. The taxonomy is reviewed annually.

Intermediate Implementation — The risk taxonomy is comprehensive, covering all phenotype categories inferable from the agent's training data and model capabilities. Inference boundaries are enforced at the model execution layer, including intermediate computation monitoring. Consent verification operates at the inference-category level with synchronous gating. Re-identification risk scoring is implemented for phenotype prediction combinations. The taxonomy is updated within 90 days of material new genotype-phenotype associations being published. Dual-key authorisation is required for restricted predictions.

Advanced Implementation — All intermediate capabilities plus: capability-aware boundary enforcement identifies and constrains latent phenotype prediction capabilities. Differential privacy mechanisms bound information leakage from prediction outputs. The risk taxonomy is integrated with international regulatory mapping for cross-jurisdictional deployments. An independent bioethics review panel participates in taxonomy governance. Population-level aggregation controls are enforced with automated ethics review triggers. The system is independently audited against both AG-716 and applicable clinical or research genomics standards.

7. Evidence Requirements

Required artefacts:

Retention requirements:

Access requirements:

8. Test Specification

Test 8.1: Prohibited Phenotype Inference Blocking

Test 8.2: Consent Scope Verification Gate

Test 8.3: Prediction Attempt Logging Completeness

Test 8.4: Human Escalation for Unclassified Phenotype Requests

Test 8.5: Re-Identification Risk Assessment Enforcement

Test 8.6: Population-Level Aggregation Control

Test 8.7: Risk Taxonomy Currency Verification

Conformance Scoring

9. Regulatory Mapping

Regulation | Provision | Relationship Type
EU GDPR | Article 9 (Processing of Special Categories of Data) | Direct requirement
EU GDPR | Article 22 (Automated Individual Decision-Making, Including Profiling) | Supports compliance
EU AI Act | Article 6 & Annex III (High-Risk Classification, Biometric Systems) | Direct requirement
EU AI Act | Article 10 (Data and Data Governance) | Supports compliance
US GINA | Title I & II (Genetic Information Nondiscrimination) | Direct requirement
US ADA | Title I (Employment Discrimination on Basis of Disability) | Supports compliance
Council of Europe Oviedo Convention | Article 12 (Predictive Genetic Tests) | Direct requirement
China Biosecurity Law | Articles 56-58 (Human Genetic Resource Management) | Supports compliance
ISO 42001 | Clause 6.1.2 (AI Risk Assessment) | Supports compliance
NIST AI RMF | MAP 2.3 (Scientific Integrity of AI Data and Models) | Supports compliance
UNESCO Universal Declaration on the Human Genome | Articles 6-7 (Discrimination and Confidentiality) | Normative alignment

EU GDPR — Article 9 (Special Category Data)

Genetic data is explicitly listed as a special category of personal data under GDPR Article 9(1), and its processing is prohibited unless one of the conditions in Article 9(2) is met. Phenotype predictions derived from genetic data constitute processing of genetic data — the inference output is inseparable from its genetic data input for regulatory purposes. AG-716's consent-scope verification requirement (4.3) directly supports compliance with Article 9(2)(a) (explicit consent) by ensuring that consent covers the specific inference category, not merely the input data type. The prohibition of certain phenotype predictions under the risk taxonomy (4.1) supports compliance where no lawful basis exists for processing that category of genetic inference. The re-identification risk assessment requirement (4.6) supports Article 9's underlying purpose of preventing harm from special category data processing.

EU AI Act — Article 6 & Annex III (High-Risk AI Systems)

The EU AI Act classifies biometric identification and categorisation systems as high-risk under Annex III. AI systems that infer physical characteristics, health conditions, or behavioural traits from biological data fall within this classification. Phenotype prediction agents that reconstruct facial appearance, estimate health predispositions, or infer behavioural characteristics from genomic data are subject to the high-risk requirements of Chapter III, Section 2, including risk management (Article 9), data governance (Article 10), transparency (Article 13), and human oversight (Article 14). AG-716's risk taxonomy, inference boundaries, and human escalation requirements map directly to these obligations.

US GINA — Titles I and II

The Genetic Information Nondiscrimination Act prohibits the use of genetic information in employment decisions (Title II) and health insurance underwriting (Title I). Phenotype predictions derived from genetic data constitute "genetic information" under GINA's broad definition, which includes genetic tests and the manifestation of a disease or disorder in family members. AG-716's inference boundary enforcement (4.2) and population aggregation controls (4.7) directly prevent the generation of genetic-information-based predictions that could be used in prohibited employment or insurance decisions. The logging requirement (4.4) provides the audit trail necessary to demonstrate GINA compliance.

Council of Europe Oviedo Convention — Article 12

Article 12 of the Convention on Human Rights and Biomedicine restricts predictive genetic tests to health purposes or health-related scientific research, and requires appropriate genetic counselling. AG-716's risk taxonomy (4.1) enables classification of phenotype predictions by purpose, and the consent-scope verification (4.3) ensures that predictions are linked to authorised purposes. For jurisdictions that have ratified the Oviedo Convention, the risk taxonomy should classify non-health-purpose phenotype predictions (e.g., intelligence, personality traits) as prohibited unless a specific legal basis exists.

China Biosecurity Law — Human Genetic Resource Provisions

China's Biosecurity Law and the associated Regulations on the Management of Human Genetic Resources impose strict controls on the collection, preservation, use, and cross-border transfer of human genetic resources. Phenotype predictions computed from Chinese citizens' genetic data are subject to these provisions. AG-716's cross-jurisdictional applicability, combined with AG-210, requires that phenotype prediction controls enforce Chinese regulatory requirements when processing genetic data originating from China, including restrictions on cross-border data transfer that would enable phenotype predictions to be computed in foreign jurisdictions.

10. Failure Severity

Field | Value
Severity Rating | Critical
Blast Radius | Individual (genetic privacy), population (discrimination enablement), organisational (regulatory and reputational), societal (eugenics-adjacent harm)

Consequence chain: Failure of phenotype prediction risk governance initiates a multi-stage harm cascade. The immediate failure mode is unauthorised phenotype inference — the agent computes predictions outside the consented scope or in prohibited categories. This creates a data protection violation: special category personal data (genetic health information, psychiatric predisposition scores, appearance predictions) is generated without lawful basis. If the predictions are stored — even as intermediate computation artefacts — they become a persistent data breach risk. A subsequent breach or unauthorised access exposes individuals' genetic predispositions, creating irreversible privacy harm (genetic information, unlike a password, cannot be changed). At the population level, aggregated phenotype predictions by ethnicity or ancestry enable genetic discrimination and profiling — echoing historical eugenics programmes and triggering severe societal backlash. The regulatory consequence is multi-jurisdictional: GDPR Article 83(5) penalties of up to 4% of global annual turnover for special category data processing violations, GINA civil penalties, and potential criminal liability under biosecurity laws. The reputational consequence is extreme — association with genetic discrimination or eugenics-adjacent practices typically causes permanent brand damage, as demonstrated by historical cases in direct-to-consumer genomics. The cascading organisational consequence includes loss of research partnerships, withdrawal of ethics approvals, and inability to recruit research participants, undermining the organisation's ability to conduct any genomics-related work. For safety-critical deployments (clinical genomics), erroneous or inappropriate phenotype predictions can lead to clinical harm: unnecessary interventions based on false positive psychiatric risk scores, or failure to act on genuine risk due to system-level distrust following governance failures.

Cross-references:
AG-001 (Operational Boundary Enforcement) provides the foundational boundary mechanism that AG-716 extends to phenotype inference boundaries.
AG-019 (Human Escalation & Override Triggers) defines the escalation framework that AG-716 invokes for unclassified phenotype requests.
AG-022 (Behavioural Drift Detection) detects when agents begin performing phenotype predictions outside their configured scope.
AG-029 (Data Classification Enforcement) classifies the input genetic data; AG-716 extends classification to the inference outputs.
AG-033 (Consent Lifecycle Governance) manages the consent records that AG-716's consent-scope verification queries.
AG-037 (Anonymisation & Pseudonymisation Governance) governs the de-identification of genetic data, which AG-716 supplements with re-identification risk assessment for phenotype prediction outputs.
AG-040 (Sensitive Category Data Processing Governance) provides the general framework for special category data that AG-716 specialises for genomic phenotype predictions.
AG-055 (Audit Trail Immutability & Completeness) governs the immutability of the prediction audit logs required by AG-716.
AG-068 (Intellectual Property Boundary Governance) is relevant where phenotype prediction models or association datasets carry IP restrictions.
AG-084 (Model Training Data Governance) governs the genotype-phenotype association datasets used in training.
AG-210 (Multi-Jurisdictional Regulatory Mapping) provides the jurisdictional mapping that AG-716 requires for cross-border phenotype prediction governance.
AG-709 (Sequence Data Sensitivity Governance) governs the input data sensitivity classification.
AG-715 (Clinical-Genomic Consent Governance) provides the consent framework for clinical genomic applications.

Cite this protocol
AgentGoverning. (2026). AG-716: Phenotype Prediction Risk Governance. The 783 Protocols of AI Agent Governance, AGS v2.1. agentgoverning.com/protocols/AG-716