Structured Abstract
Motivation As precision medicine advances, polygenic scores (PGS) have become increasingly important for clinical risk assessment. Many methods have been developed to create polygenic models with increased accuracy for risk prediction. Our select and shrink with summary statistics (S4) PGS method extends an earlier method, polygenic risk score – continuous shrinkage (PRS-CS), by combining a continuous shrinkage prior on effect sizes with a strategy for selecting which SNPs to include, in order to create the best-performing model.
Results The S4 method provides overall improved PGS accuracy for UK Biobank participants compared with LDpred2 and PRS-CS across a variety of phenotypes with differing genetic architectures. Additionally, the S4 method has higher estimated PGS accuracy than LDpred2 in Finnish and Japanese populations. Thus, the S4 method improves overall PGS accuracy across multiple phenotypes and increases the transferability of PGS across ancestries.
Availability and Implementation The S4 program is freely available at https://github.com/jpt34/S4_programs.
Supplementary information Supplementary data will be available at Bioinformatics online.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
Pei-Chen Peng was supported by NCI K99CA256519. Simon Gayther was supported by NCI R01CA244569 and NCI R01CA211707. Amber DeVries and Michelle Jones were supported by the TEAL Louisa M. McGregor Ovarian Cancer Foundation. The University of Cambridge has received salary support for PDPP from the NHS in the East of England through the Clinical Academic Reserve. This research was supported by the Cancer Research UK Cambridge Centre and the NIHR Cambridge Biomedical Research Centre (BRC‐1215‐20014).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Only publicly available datasets from biobanks were used in this work. We included UK Biobank (https://www.ukbiobank.ac.uk/enable-your-research/about-our-data), FinnGen (https://www.finngen.fi/en/access_results) and Biobank Japan (https://biobankjp.org/en/).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
All data produced in the present work are contained in the manuscript.