ABSTRACT
Comprehensively characterizing genotype-phenotype correlations (GPCs) in Mendelian disease would create new opportunities for improving clinical management and understanding disease biology. However, heterogeneous approaches to data sharing, reuse, and analysis have hindered progress in the field. We developed Genotype Phenotype Evaluation of Statistical Association (GPSEA), a software package that leverages the Global Alliance for Genomics and Health (GA4GH) Phenopacket Schema to represent case-level clinical and genetic data about individuals. GPSEA applies an independent filtering strategy to boost statistical power to detect categorical GPCs represented by Human Phenotype Ontology terms. GPSEA additionally enables visualization and analysis of continuous phenotypes, clinical severity scores, and survival data such as age of onset of disease or clinical manifestations. We applied GPSEA to 85 cohorts with 6613 previously published individuals with variants in one of 80 genes associated with 122 Mendelian diseases and identified 225 significant GPCs, with 48 cohorts having at least one statistically significant GPC. These results highlight the power of standardized representations of clinical data for scalable discovery of GPCs in Mendelian disease.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work was supported by grants from the National Human Genome Research Institute (A phenomics-first resource for interpretation of variants, 5RM1HG010860; and The Human Phenotype Ontology: Accelerating Computational Integration of Clinical Data for Genomics, 5U24HG011449). J.X.C. and A.J.M. were supported by 1R35HG011297. P.N.R. was supported by a Professorship of the Alexander von Humboldt Foundation. A.K. and O.V. were supported by grant NU23-05-00097 issued by the Czech Health Research Council, Ministry of Health of the Czech Republic.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Genotypic and phenotypic data about individuals with rare Mendelian disease were derived from the Phenopacket Store repository [23]. Version 0.1.24 of Phenopacket Store includes 8182 phenopackets representing 491 Mendelian and chromosomal diseases associated with 457 genes and 4469 unique pathogenic alleles curated from 1227 different publications. The data are available from https://github.com/monarch-initiative/phenopacket-store
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes