Abstract
Polygenic Scores (PGSs) summarize an individual’s genetic propensity for a given trait in a single value, based on SNP effect sizes derived from Genome-Wide Association Study (GWAS) results. Methods have been developed that apply Bayesian approaches to improve the prediction accuracy of PGSs through optimization of estimated effect sizes. While these methods are generally well-calibrated for continuous traits (implying the predicted values are on average equal to the true trait values), they are not well-calibrated for binary disorder traits in ascertained samples. This is a problem because well-calibrated PGSs are needed to reliably compute the absolute disorder probability for an individual to facilitate future clinical implementation. Here we introduce the Bayesian polygenic score Probability Conversion (BPC) approach, which computes an individual’s predicted disorder probability using GWAS summary statistics, an existing Bayesian PGS method (e.g. PRScs, SBayesR), the individual’s genotype data, and a prior disorder probability. The BPC approach transforms the PGS to its underlying liability scale, computes the variances of the PGS in cases and controls, and applies Bayes’ Theorem to compute the absolute disorder probability; it is practical in its application as it does not require a tuning dataset with both genotype and phenotype data. We applied the BPC approach to extensive simulated data and empirical data of nine disorders. The BPC approach yielded well-calibrated results that were consistently better than the results of another recently published approach.
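The final step described above, applying Bayes' Theorem to PGS distributions in cases and controls, can be illustrated with a minimal sketch. It assumes a liability threshold model in which the PGS is already on the liability scale, is approximately normally distributed within cases and within controls, and explains a known fraction of the liability variance. The function name `disorder_probability`, the parameter names, and the use of standard truncated-normal formulas for the case and control means and variances are illustrative assumptions, not the authors' implementation.

```python
from statistics import NormalDist


def disorder_probability(pgs, r2_liability, prior):
    """Sketch: posterior disorder probability from a liability-scale PGS.

    pgs          -- individual's PGS, assumed centred on the population mean
                    and scaled so that var(PGS) = r2_liability (assumption)
    r2_liability -- liability-scale variance explained by the PGS (assumption:
                    taken as known here rather than estimated)
    prior        -- prior disorder probability, e.g. population prevalence
    """
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - prior)   # liability threshold
    height = std_normal.pdf(threshold)            # normal density at threshold

    # Mean liability in cases and controls (truncated standard normal).
    mean_liab_case = height / prior
    mean_liab_ctrl = -height / (1.0 - prior)

    # PGS mean and variance within cases and controls.
    mean_case = r2_liability * mean_liab_case
    mean_ctrl = r2_liability * mean_liab_ctrl
    var_case = r2_liability * (
        1.0 - r2_liability * mean_liab_case * (mean_liab_case - threshold)
    )
    var_ctrl = r2_liability * (
        1.0 - r2_liability * mean_liab_ctrl * (mean_liab_ctrl - threshold)
    )

    # Bayes' Theorem with normal likelihoods for cases and controls.
    dens_case = NormalDist(mean_case, var_case ** 0.5).pdf(pgs)
    dens_ctrl = NormalDist(mean_ctrl, var_ctrl ** 0.5).pdf(pgs)
    numerator = prior * dens_case
    return numerator / (numerator + (1.0 - prior) * dens_ctrl)
```

Under these assumptions, a PGS above the population mean yields a probability above the prior and a PGS below it yields a probability below the prior, for example when comparing `disorder_probability(0.5, 0.1, 0.01)` with `disorder_probability(-0.5, 0.1, 0.01)`.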
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
D.P. is supported by the Netherlands Organization for Scientific Research - Gravitation project 'BRAINSCAPES: A Roadmap from Neurogenetics to Neurobiology' (024.004.012) and the European Research Council advanced grant 'From GWAS to Function' (ERC-2018-ADG 834057). The PGC has received major funding from the US National Institute of Mental Health (PGC4: R01MH124839; PGC3: U01 MH109528; PGC2: U01 MH094421; PGC1: U01 MH085520). A.L.P. has received an R01 grant from the US National Institutes of Health (HG006399).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The study used openly available, secondary human data originally located at https://pgc.unc.edu/for-researchers/download-results/. Therefore, the study did not require approval by an Ethics Committee or Institutional Review Board (IRB).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
Individual-level data from the Psychiatric Genomics Consortium (https://pgc.unc.edu/) and the UK Biobank (https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access) cannot be shared freely; access requires an approved application. The GWAS summary statistics used in the UK Biobank analyses can be requested or downloaded from the following web pages: Breast Cancer (https://bcac.ccge.medschl.cam.ac.uk/bcacdata/oncoarray/oncoarray-and-combined-summary-result/gwas-summary-associations-breast-cancer-risk-2020/); BMI (https://portals.broadinstitute.org/collaboration/giant/index.php/GIANT_consortium_data_files); Coronary Artery Disease (http://www.cardiogramplusc4d.org/data-downloads/#); Inflammatory Bowel Disease (https://www.ibdgenetics.org/); Multiple Sclerosis (https://imsgc.net/?page_id=31); Prostate Cancer (http://practical.icr.ac.uk/blog/?page_id=8164); Rheumatoid Arthritis (https://data.cyverse.org/dav-anon/iplant/home/kazuyoshiishigaki/ra_gwas/ra_gwas-10-28-2021.tar); Type 2 Diabetes (https://diagram-consortium.org/downloads.html). GWAS summary statistics for Major Depression and Schizophrenia can be downloaded from the PGC website (https://pgc.unc.edu/for-researchers/download-results/). 1000 Genomes reference files can be downloaded from https://ctg.cncr.nl/software/magma.