TY  - JOUR
T1  - A scalable EHR-based approach for phenotype discovery and variant interpretation for hereditary cancer genes
JF  - medRxiv
DO  - 10.1101/2021.03.18.21253763
SP  - 2021.03.18.21253763
AU  - Chenjie Zeng
AU  - Lisa A. Bastarache
AU  - Ran Tao
AU  - Eric Venner
AU  - Scott Hebbring
AU  - Justin D. Andujar
AU  - Sarah T. Bland
AU  - David R. Crosslin
AU  - Siddharth Pratap
AU  - Ayorinde Cooley
AU  - Jennifer A. Pacheco
AU  - Kurt D. Christensen
AU  - Emma Perez
AU  - Carrie L. Blout Zawatsky
AU  - Leora Witkowski
AU  - Hana Zouk
AU  - Chunhua Weng
AU  - Kathleen A. Leppig
AU  - Patrick M. A. Sleiman
AU  - Hakon Hakonarson
AU  - Marc S. Williams
AU  - Yuan Luo
AU  - Gail P. Jarvik
AU  - Robert C. Green
AU  - Wendy K. Chung
AU  - Ali G. Gharavi
AU  - Niall J. Lennon
AU  - Heidi L. Rehm
AU  - Richard A. Gibbs
AU  - Josh F. Peterson
AU  - Dan M. Roden
AU  - Georgia L. Wiesner
AU  - Joshua C. Denny
Y1  - 2021/01/01
UR  - http://medrxiv.org/content/early/2021/03/24/2021.03.18.21253763.abstract
N2  - Knowledge of the clinical spectrum of rare genetic disorders helps in disease management and variant pathogenicity interpretation. Leveraging electronic health record (EHR)-linked genetic testing data from the eMERGE network, we determined the associations between a set of 23 hereditary cancer genes and 3017 phenotypes in 23544 individuals. This phenome-wide association study replicated 45% (184/406) of known gene-phenotype associations (P = 5.1×10⁻¹²⁵). Meta-analysis with an independent EHR-derived cohort of 3242 patients confirmed 14 novel associations with phenotypes in the neoplastic, genitourinary, digestive, congenital, metabolic, mental and neurologic categories. Phenotype risk scores (PheRS) based on weighted aggregations of EHR phenotypes accurately predicted variant pathogenicity for at least 50% of pathogenic variants for 8/23 genes. We generated a catalog of PheRS for 7800 variants, including 5217 variants of uncertain significance, to provide empirical evidence of potential pathogenicity. 
This study highlights the potential of EHR data in genomic medicine. Competing Interest Statement: Ali G. Gharavi serves as a consultant to Goldfinch Bio and receives research funding from Renal Research Institute. Clinical Trial: N/A. Funding Statement: Support for the research and personnel was provided by the R01LM010685 grant from the National Library of Medicine and the eMERGE grants. The eMERGE sites were funded through several series of grants from the National Human Genome Research Institute: U01HG8657, U01HG006375, U01HG004610 (Kaiser Permanente Washington/University of Washington); U01HG8685 (Brigham and Women's Hospital); U01HG8672, U01HG006378, U01HG004608 (Vanderbilt University Medical Center); U01HG8666, U01HG006828 (Cincinnati Children's Hospital Medical Center); U01HG6379, U01HG04599 (Mayo Clinic); U01HG8679, U01HG006382 (Geisinger Clinic); U01HG008680 (Columbia University Health Sciences); U01HG8684, U01HG006830 (Children's Hospital of Philadelphia); U01HG8673, U01HG006388, U01HG004609 (Northwestern University); U54MD007593, U54MD007586 (Meharry Medical College); U01HG8676 (Partners Healthcare/Broad Institute); U01HG8664 (Baylor College of Medicine); U01HG006389 (Essentia Institute of Rural Health, Marshfield Clinic Research Foundation and Pennsylvania State University); U01HG006380 (Icahn School of Medicine at Mount Sinai); U01HG8701, U01HG006385, U01HG04603 (Vanderbilt University Medical Center serving as the Coordinating Center). The eMERGE Genotyping Centers were also funded through U01HG004438 (CIDR) and U01HG004424 (the Broad Institute). Vanderbilt University Medical Center's Synthetic Derivative, Research Derivative and BioVU are supported by institutional funding and by the CTSA grant ULTR000445 from NCATS/NIH. 
The majority of CJZ's work on this project was supported by T32 CA160056 (NCI). Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Ethics committees and their decisions for each site under the Electronic Medical Record and Genomics (eMERGE) Network are as follows. CHOP was approved by the Committees for the Protection of Human Subjects of the Children's Hospital of Philadelphia; Cincinnati was approved by the Institutional Review Board of the Cincinnati Children's Hospital Medical Center; Columbia was approved by the Human Research Protection Office and Institutional Review Boards of Columbia University; Geisinger was approved by the Geisinger Institutional Review Board; Harvard was approved by the Partners Human Research Committee; Mayo was approved by the Mayo Clinic Institutional Review Board; Northwestern was approved by the Northwestern University Institutional Review Board; UWKP was approved by the Kaiser Permanente Washington Research and Human Subjects Review Office; Vanderbilt was approved by the Vanderbilt University Institutional Review Board. The hereditary cancer registry at Vanderbilt was approved by the Vanderbilt University Institutional Review Board. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. 
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. Genetic and phenotype data of the eMERGEseq cohort are publicly available in the dbGaP repository under phs001616.v1.p1. All summary statistics for significant gene-phenotype associations from the eMERGEseq and the HCR cohorts are provided in Supplemental Tables S3-6. All summary statistics for associations of PheRS with genetic variants are provided in Supplemental Tables S9-10.
ER  - 