medRxiv

Multi-ancestry gene-trait connection landscape using electronic health record (EHR) linked biobank data

Binglan Li, Yogasudha Veturi, Anastasia Lucas, Yuki Bradford, Shefali S. Verma, Anurag Verma, Joseph Park, Wei-Qi Wei, Qiping Feng, Bahram Namjou, Krzysztof Kiryluk, Iftikhar Kullo, Yuan Luo, Milton Pividori, Hae Kyung Im, Casey S. Greene, Marylyn D. Ritchie
doi: https://doi.org/10.1101/2021.10.21.21265225
Binglan Li
1Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
Yogasudha Veturi
2Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
3Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
Anastasia Lucas
2Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
3Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
Yuki Bradford
2Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
3Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
Shefali S. Verma
4Department of Pathology and Lab Medicine, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
Anurag Verma
5Department of Medicine, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
Joseph Park
2Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
Wei-Qi Wei
6Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203
Qiping Feng
7Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37203
Bahram Namjou
8Center for Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital Medical Center (CCHMC), Cincinnati, OH, USA
Krzysztof Kiryluk
9Department of Medicine, Division of Nephrology, College of Physicians and Surgeons, Columbia University, New York, New York, USA
Iftikhar Kullo
10Division of Cardiovascular Diseases and the Gonda Vascular Center, Mayo Clinic, Rochester, MN, USA
Yuan Luo
11Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
Milton Pividori
2Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
Hae Kyung Im
12Departments of Medicine and Human Genetics, University of Chicago, Chicago, IL 60637, USA
Casey S. Greene
13University of Colorado School of Medicine, Anschutz Medical Campus, Aurora, CO, USA
Marylyn D. Ritchie
2Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
3Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
  • For correspondence: marylyn@pennmedicine.upenn.edu

Abstract

Understanding the genetic factors of complex traits across ancestry groups is key to improving the overall quality of health care for diverse populations in the United States. In recent years, multiple electronic health record-linked (EHR-linked) biobanks have recruited participants of diverse ancestry backgrounds; these biobanks make it possible to obtain phenome-wide association study (PheWAS) summary statistics on a genome-wide scale for different ancestry groups. Moreover, advances in bioinformatics methods, such as transcriptome-wide association studies (TWAS) and colocalization analyses, provide novel means to accelerate the translation of basic discoveries to clinical utility by integrating genome-wide association study (GWAS) summary statistics and expression quantitative trait locus (eQTL) data to identify complex trait-related genes. Here, we combined the advantages of multi-ancestry biobanks and data integrative approaches to investigate the multi-ancestry, gene-disease connection landscape. We first performed a phenome-wide TWAS separately on Electronic Medical Records and Genomics (eMERGE) III network participants of European ancestry (N = 68,813) and of African ancestry (N = 12,658). For each ancestry group, the phenome-wide TWAS tested gene-disease associations between 22,535 genes and 309 curated disease phenotypes in 49 primary human tissues, as well as cross-tissue associations. Next, we identified gene-disease associations that were shared across the two ancestry groups by combining the ancestry-specific results via meta-analyses. We further applied a Bayesian colocalization method, fastENLOC, to prioritize likely functional gene-disease associations with supportive colocalized eQTL and GWAS signals. We replicated the phenome-wide gene-disease analysis in the analogous Penn Medicine BioBank (PMBB) cohorts and sought additional validation in the PhenomeXcan UK Biobank (UKBB) database, the PheWAS catalog, and a systematic literature review.
Phenome-wide TWAS identified many proof-of-concept gene-disease associations, e.g., the FTO-obesity association (p = 7.29e-15), and numerous novel disease-associated genes, e.g., the association of GATA6-AS1 with pulmonary heart disease (p = 4.60e-10). In short, the multi-ancestry, gene-disease connection landscape provides a rich resource for future multi-ancestry complex disease research. We also highlight the importance of expanding the size of non-European ancestry datasets and the potential of ancestry-specific genetic analyses, as both will be critical to improving our understanding of the genetic architecture of complex disease.
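
To make the step of combining ancestry-specific results concrete, the following is a minimal sketch of a fixed-effect, inverse-variance-weighted meta-analysis of one gene-disease association across two ancestry groups. The abstract does not state which meta-analysis model was used, so the choice of fixed-effect weighting is an assumption; the function name `ivw_meta` and the input effect sizes and standard errors are hypothetical and do not come from the study.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis (illustrative).

    betas: per-ancestry effect estimates for one gene-disease pair
    ses:   matching standard errors (all must be > 0)
    Returns (pooled beta, pooled SE, z-score, two-sided p-value).
    """
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical inputs: a larger cohort (smaller SE) and a smaller cohort.
beta, se, z, p = ivw_meta(betas=[0.10, 0.08], ses=[0.02, 0.05])
print(f"pooled beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

Under this weighting the cohort with the smaller standard error dominates the pooled estimate, which is one reason the size of non-European ancestry datasets matters for detecting shared associations.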

Competing Interest Statement

MDR is on the scientific advisory board for Cipherome.

Funding Statement

eMERGE Network (Phase III). This phase of the eMERGE Network was initiated and funded by the NHGRI through the following grants: U01HG8657 (Group Health Cooperative/University of Washington); U01HG8685 (Brigham and Women's Hospital); U01HG8672 (Vanderbilt University Medical Center); U01HG8666 (Cincinnati Children's Hospital Medical Center); U01HG6379 (Mayo Clinic); U01HG8679 (Geisinger Clinic); U01HG8680 (Columbia University Health Sciences); U01HG8684 (Children's Hospital of Philadelphia); U01HG8673 (Northwestern University); U01HG8701 (Vanderbilt University Medical Center serving as the Coordinating Center); U01HG8676 (Partners Healthcare/Broad Institute); and U01HG8664 (Baylor College of Medicine). Penn Medicine BioBank (PMBB). The PMBB is funded by the Perelman School of Medicine at the University of Pennsylvania, a gift from the Smilow family, and the National Center for Advancing Translational Sciences of the National Institutes of Health under CTSA Award Number UL1TR001878. We thank D. Birtwell, H. Williams, P. Baumann and M. Risman for informatics support regarding the PMBB. We thank the staff of the Regeneron Genetics Center for whole-exome sequencing of DNA from PMBB participants.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The IRB of the University of Pennsylvania gave ethical approval for this work.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

All data produced in the present work are contained in the manuscript

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.
Posted October 26, 2021.
Subject Area

  • Genetic and Genomic Medicine