PT - JOURNAL ARTICLE
AU - Alexander S. Hatoum
AU - Frank R. Wendt
AU - Marco Galimberti
AU - Renato Polimanti
AU - Benjamin Neale
AU - Henry R. Kranzler
AU - Joel Gelernter
AU - Howard J. Edenberg
AU - Arpana Agrawal
TI - Genetic Data Can Lead to Medical Discrimination: Opioid Use Disorder as a Cautionary Tale
AID - 10.1101/2020.09.12.20193342
DP - 2020 Jan 01
TA - medRxiv
PG - 2020.09.12.20193342
4099 - http://medrxiv.org/content/early/2020/12/16/2020.09.12.20193342.short
4100 - http://medrxiv.org/content/early/2020/12/16/2020.09.12.20193342.full
AB - Using genetics to predict the likelihood of future psychiatric disorders, such as opioid use disorder (OUD), poses scientific and ethical challenges. Machine learning (ML) models are beginning to proliferate in psychiatry; however, most ML models in psychiatric genetics to date have not accounted for ancestry. Using an empirical example of a proposed genetic test for OUD and a simulated random binary phenotype, we show that ML genetic prediction is completely confounded by ancestry, potentially discriminatory, and of no benefit for clinical practice. In the empirical example, we examine results from five ML algorithms trained with brain reward-derived “candidate” SNPs proposed for commercial use and demonstrate that the algorithms do not predict OUD better than chance when ancestry is balanced but are highly confounded with ancestry in an out-of-sample test set. We show how such a test could also predict subpopulations in admixed samples. Random sets of variants matched to the candidate SNPs by allele frequency produced similarly flawed predictions, further questioning the plausibility of selecting candidate variants. Finally, using random SNPs that predict a random simulated phenotype, we show that the bias attributable to ancestral confounding would affect any such ML-based genetic prediction algorithm.
Given the small and distributed single-variant genetic effect sizes associated with most psychiatric disorders, researchers and clinicians are encouraged to be skeptical of claims of high prediction accuracy from the growing number of ML-derived genetic algorithms, particularly when models are naive to polygenicity and ancestral confounding.

Competing Interest Statement
HRK is an advisory board member for Dicerna and a member of the American Society of Clinical Psychopharmacology's Alcohol Clinical Trials Initiative, which was supported in the last three years by AbbVie, Alkermes, Dicerna, Ethypharm, Indivior, Lilly, Lundbeck, Otsuka, Pfizer, Arbor, and Amygdala Neurosciences. HRK and JG are named as inventors on PCT patent application #15/878,640 entitled: "Genotype-guided dosing of opioid agonists," filed January 24, 2018.

Funding Statement
This research is supported by MH109532. ASH acknowledges support from DA007261; AA acknowledges support from K02DA032573. Yale-Penn (phs000425.v1.p1; phs000952.v1.p1) was supported by National Institutes of Health Grants RC2 DA028909, R01 DA12690, R01 DA12849, R01 DA18432, R01 AA11330, and R01 AA017535 and the Veterans Affairs Connecticut and Philadelphia Veterans Affairs Mental Illness Research, Educational, and Clinical Centers.

Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Yale and UPenn (site-specific) IRB
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes
Results from learning curves and ML scripts will be made available upon publication.
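
The ancestral confounding described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' pipeline or data: the two ancestry groups, the allele-frequency drift model, the prevalence values, and the nearest-centroid classifier are all assumptions chosen for illustration. The phenotype is random within each group, and only its prevalence differs between groups in the training sample.

```python
import numpy as np


def make_allele_freqs(n_snps, rng, drift_sd=0.1):
    """Allele frequencies for two hypothetical ancestry groups.

    Assumption: each group's frequency is a shared base frequency plus
    independent Gaussian drift. This is an illustrative model only.
    """
    base = rng.uniform(0.1, 0.9, size=n_snps)
    freqs = base + rng.normal(0.0, drift_sd, size=(2, n_snps))
    return np.clip(freqs, 0.01, 0.99)


def simulate(freqs, n_per_group, prevalence, rng):
    """Draw genotypes (0/1/2) and a phenotype that is random within each group."""
    ancestry = np.repeat([0, 1], n_per_group)
    genotypes = rng.binomial(2, freqs[ancestry])
    # The SNPs have no effect on the phenotype; only group prevalence differs.
    phenotype = rng.binomial(1, np.asarray(prevalence)[ancestry])
    return genotypes, phenotype, ancestry


def fit_centroids(X, y):
    """Mean genotype vector of controls (row 0) and cases (row 1)."""
    return np.stack([X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)])


def predict(centroids, X):
    """Assign each sample to the nearest centroid (squared Euclidean distance)."""
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)


def run_demo(seed=0, n_snps=200, n_per_group=1000):
    rng = np.random.default_rng(seed)
    freqs = make_allele_freqs(n_snps, rng)

    # Training sample: case status is unevenly distributed across ancestry.
    X_train, y_train, _ = simulate(freqs, n_per_group, (0.2, 0.8), rng)
    centroids = fit_centroids(X_train, y_train)

    # Test set with the same ancestry/phenotype imbalance as training.
    X_conf, y_conf, _ = simulate(freqs, n_per_group, (0.2, 0.8), rng)
    # Test set in which cases and controls are balanced within each group.
    X_bal, y_bal, a_bal = simulate(freqs, n_per_group, (0.5, 0.5), rng)

    pred_conf = predict(centroids, X_conf)
    pred_bal = predict(centroids, X_bal)
    return {
        "acc_confounded": float(np.mean(pred_conf == y_conf)),
        "acc_balanced": float(np.mean(pred_bal == y_bal)),
        # How often the "phenotype" prediction simply equals the ancestry label.
        "ancestry_agreement": float(np.mean(pred_bal == a_bal)),
    }


results = run_demo()
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, accuracy on the confounded test set should sit well above 0.5 even though the SNPs carry no phenotypic signal, accuracy on the ancestry-balanced test set should fall near chance, and the predicted "phenotype" should largely coincide with the ancestry label. A nearest-centroid rule was chosen only to keep the sketch dependency-free; the same pattern would be expected from other classifiers trained on ancestry-informative variants.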