Burden Testing of Rare Variants Identified through Exome Sequencing via Publicly Available Control Data

Am J Hum Genet. 2018 Oct 4;103(4):522-534. doi: 10.1016/j.ajhg.2018.08.016. Epub 2018 Sep 27.

Abstract

The genetic causes of many Mendelian disorders remain undefined. Factors such as a lack of large multiplex families, locus heterogeneity, and incomplete penetrance hamper gene discovery for many disorders. Previous work suggests that gene-based burden testing, in which the aggregate burden of rare, protein-altering variants in each gene is compared between case and control subjects, might overcome some of these limitations. The increasing availability of large-scale public sequencing databases such as the Genome Aggregation Database (gnomAD) can enable burden testing using these databases as controls, obviating the need for additional control sequencing for each study. However, using public databases as controls poses several challenges, including lack of individual-level data, differences in ancestry, and differences in sequencing platforms and data processing. To illustrate the approach of using public data as controls, we analyzed whole-exome sequencing data from 393 individuals with idiopathic hypogonadotropic hypogonadism (IHH), a rare disorder with significant locus heterogeneity and incomplete penetrance, against control subjects from gnomAD (n = 123,136). We leveraged presumably benign synonymous variants to calibrate our approach. Through iterative analyses, we systematically addressed and overcame various sources of artifact that can arise when using public control data. In particular, we introduce an approach for highly adaptable variant quality filtering that leads to well-calibrated results. Our approach "re-discovered" genes previously implicated in IHH (FGFR1, TACR3, GNRHR). Furthermore, we identified a significant burden in TYRO3, a gene implicated in hypogonadotropic hypogonadism in mice. Finally, we developed a user-friendly software package, TRAPD (Test Rare vAriants with Public Data), for performing gene-based burden testing against public databases.
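As a rough illustration of what a gene-based burden comparison can look like, the sketch below computes a one-sided Fisher's exact test p-value for a single gene from carrier counts in cases and controls. The abstract does not state which test statistic is used, so the choice of Fisher's exact test is an assumption, as are the function name `fisher_one_sided` and the carrier counts in the usage line. It is not taken from the TRAPD package.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher's exact p-value for enrichment of carriers in cases.

    Hypothetical helper: with the table margins fixed, sums the
    hypergeometric probabilities of observing at least `case_carriers`
    carriers among the cases.
    """
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    log_denom = log_comb(total, n_cases)
    p = 0.0
    # Walk from the observed table to the most extreme one.
    for k in range(case_carriers, min(carriers, n_cases) + 1):
        log_p = (log_comb(carriers, k)
                 + log_comb(total - carriers, n_cases - k)
                 - log_denom)
        p += exp(log_p)
    return min(p, 1.0)


# Cohort sizes come from the abstract; the carrier counts are made up.
p_value = fisher_one_sided(case_carriers=5, n_cases=393,
                           control_carriers=100, n_controls=123136)
print(f"one-sided p = {p_value:.3g}")
```

Running the same function on counts of synonymous variants, which are presumed benign, gives one way to check calibration: those p-values should look roughly uniform if the case and control call sets are comparable.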

Keywords: TRAPD; gene-based burden analysis; hypogonadotropic hypogonadism.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Animals
  • Databases, Genetic
  • Exome / genetics*
  • Exome Sequencing / methods
  • Female
  • Genome-Wide Association Study / methods
  • Humans
  • Hypogonadism / genetics
  • Male
  • Mice
  • Polymorphism, Single Nucleotide / genetics*
  • Sequence Analysis, DNA / methods
  • Software

Supplementary concepts

  • Idiopathic Hypogonadotropic Hypogonadism