Abstract
INTRODUCTION Predicting the early onset of dementia due to Alzheimer’s Disease (AD) has major implications for timely clinical management and outcomes. Current diagnostic methods, reliant on invasive and costly procedures, underscore the need for scalable and innovative approaches. To date, considerable effort has been dedicated to developing machine learning (ML) based approaches using different combinations of medical, demographic, cognitive, and clinical data, achieving varying levels of accuracy. However, they often lack the scalability required for large-scale screening and fail to identify underlying risk factors for AD progression. Polygenic risk scores (PRS) have shown promise in predicting disease risk from genetic data. Here, we aim to leverage ML techniques to develop a multi-PRS model that captures both genetic and non-genetic risk factors to diagnose and predict the progression of AD in different stages in older adults.
METHODS We trained and tested ML-based multi-PRS models, integrating genetically predicted clinical, behavioral, psychiatric, and lifestyle risk factors to predict the diagnosis of AD as well as the progression between different cognitive stages. We developed an automatic feature selection pipeline that identifies the relevant traits that predict AD. We also analyzed the interpretability of our proposed ML models and the selected features. Leveraging data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), Religious Orders Study and Memory and Aging Project (ROSMAP), and the IEU OpenGWAS Project, our study presents the first known end-to-end ML-based multi-PRS model for AD.
RESULTS Relevant features were selected from an initial set of 53 polygenic risk scores computed for 1567 patients in the ADNI dataset and 1642 patients in the ROSMAP dataset. The proposed multi-PRS ML method produced AUROC scores of 77% on ADNI and 72% on ROSMAP for predicting the diagnosis of AD, substantially surpassing the performance of the univariate PRS models. Our models also showed promise in predicting transitions between various cognitive stages (AUROC scores of 65% to 75%). Moreover, the features identified by our automated feature selection pipeline are closely aligned with the widely recognized, potentially modifiable risk factors for AD.
DISCUSSION Multi-PRS-based machine learning models can identify risk factors and construct predictive models for early AD diagnosis. This approach offers an automated mechanism to harness genetic data for AD diagnosis and prognosis, enhancing our understanding of the role of various traits in AD development and progression. It could facilitate the implementation of preventive measures at an early stage, thereby contributing to more effective interventions and improved patient outcomes.
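The building blocks named in the abstract (a polygenic risk score per trait, AUROC as the evaluation metric, and selection of relevant traits) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy data, the trait names, and the `min_margin` threshold are all hypothetical, and the univariate AUROC filter is a simple stand-in for the automated feature selection and ML models described in METHODS.

```python
from typing import Dict, List, Sequence


def polygenic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of effect-allele dosages (0, 1, or 2) using per-variant effect sizes."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the probability that a random case outscores a random control.

    Ties count as one half (the Mann-Whitney U formulation).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def select_traits(
    prs_by_trait: Dict[str, List[float]],
    labels: Sequence[int],
    min_margin: float = 0.1,  # hypothetical threshold, not taken from the study
) -> List[str]:
    """Keep traits whose univariate AUROC differs from chance (0.5) by at least min_margin."""
    return [
        trait
        for trait, scores in prs_by_trait.items()
        if abs(auroc(scores, labels) - 0.5) >= min_margin
    ]


if __name__ == "__main__":
    # Toy data for four individuals (1 = AD case, 0 = control); all values are made up.
    labels = [1, 1, 0, 0]
    prs_by_trait = {
        "trait_a": [0.9, 0.8, 0.3, 0.1],  # separates cases from controls
        "trait_b": [0.1, 0.9, 0.1, 0.9],  # uninformative
    }
    print(select_traits(prs_by_trait, labels))
```

In a multi-PRS setting, the retained per-trait scores would then serve as input features to a classifier. That step is omitted here because the abstract does not specify which model was used.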
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study was supported by the Interstellar Early Career Investigator Award (jointly presented by the New York Academy of Sciences (NYAS) and the Japan Agency for Medical Research and Development (AMED)). It was also partially supported by the Research and Innovation Centre for Science and Engineering at BUET (RISE-BUET) Internal Research Grant (ID: 2021-01-016). Data collection and sharing for this project was funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer's Association; Alzheimer's Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Therapeutic Research Institute at the University of Southern California.
ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California. The ROSMAP studies were funded by the National Institute on Aging: P30AG010161 ADCC, P30AG072975 ADRC, R01AG015819 RISK, R01AG017917 MAP, U01AG46152 AMP-AD Pipeline I, U01AG61356 AMP-AD Pipeline II.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Data used in the preparation of this article are available from the Alzheimer's Disease Neuroimaging Initiative (adni.loni.usc.edu) and Religious Orders Study and Memory and Aging Project (https://www.radc.rush.edu) databases. Accessing the datasets required special access requests to be approved by the corresponding authority.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
† Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf
An additional literature review has been conducted, resulting in an updated introduction. Furthermore, grammatical errors have been corrected and the text has been appropriately rephrased.
Data Availability
Data used in the preparation of this article are available from the Alzheimer's Disease Neuroimaging Initiative (adni.loni.usc.edu) and Religious Orders Study and Memory and Aging Project (https://www.radc.rush.edu) databases.