TY - JOUR
T1 - Leveraging Health Systems Data to Characterize a Large Effect Variant Conferring Risk for Liver Disease in Puerto Ricans
JF - medRxiv
DO - 10.1101/2021.03.31.21254662
SP - 2021.03.31.21254662
AU - Gillian M. Belbin
AU - Stephanie Rutledge
AU - Tetyana Dodatko
AU - Sinead Cullina
AU - Michael C. Turchin
AU - Sumita Kohli
AU - Denis Torre
AU - Muh-Ching Yee
AU - Christopher R. Gignoux
AU - Noura S. Abul-Husn
AU - Sander M. Houten
AU - Eimear E. Kenny
Y1 - 2021/01/01
UR - http://medrxiv.org/content/early/2021/04/06/2021.03.31.21254662.abstract
N2 - Broad-scale adoption of genomic data in health systems offers opportunities for extending methods for the discovery of variation linked to underlying genomic disease risk. We applied a population-scale linkage mapping approach in a large multi-ethnic biobank to a spectrum of disease outcomes derived from Electronic Health Records (EHRs) and uncovered a risk locus for liver disease. We used genome sequencing and in silico approaches to fine-map the signal to a non-coding variant (c.2784-12T>C) in the gene ABCB4. In vitro analysis confirmed the variant disrupted splicing of the ABCB4 pre-mRNA. Four of five homozygotes had evidence of advanced liver disease, and there was a significant association with liver disease among heterozygotes, suggesting the variant is linked to increased risk of liver disease in an allele dose-dependent manner. Population-level screening revealed the variant to be at a carrier rate of 1.95% in Puerto Rican individuals, likely as the result of a Puerto Rican founder effect.
This work demonstrates that integrating EHR and genomic data at a population-scale can facilitate novel strategies for understanding the continuum of genomic risk for common diseases, particularly in populations underrepresented in genomic medicine. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: Research reported in this paper was supported by the Office of Research Infrastructure of the National Institutes of Health under award numbers S10OD018522 and S10OD026880. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: This study was approved by the Icahn School of Medicine at Mount Sinai Institutional Review Board (Institutional Review Board 07 0529). All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. The exome sequencing datasets generated and/or analyzed during the current study are not publicly available, but summary statistics are available from the corresponding author on reasonable request.
ER -