Abstract
Cardiovascular diseases (CVDs) are the leading cause of death worldwide. Timely and accurate identification of people at risk of developing an atherosclerotic CVD and its sequelae, via risk prediction models, is a central pillar of preventive cardiology. However, currently available models consider only a limited set of risk factors and outcomes, do not focus on providing actionable advice to individuals based on their holistic medical state and lifestyle, are often not interpretable, and were built with small cohort sizes or on lifestyle data from the 1960s (e.g. the Framingham model). The risk of developing atherosclerotic CVDs is heavily lifestyle dependent, potentially making a high percentage of occurrences preventable. Providing actionable and accurate risk prediction tools to the public could assist in atherosclerotic CVD prevention. We developed a benchmarking pipeline to identify the best combination of data preprocessing steps and algorithms for predicting absolute 10-year atherosclerotic CVD risk. Based on the data of 464,547 UK Biobank participants without atherosclerotic CVD at baseline, we used a comprehensive set of 203 consolidated risk factors associated with atherosclerosis and its sequelae (e.g. heart failure).
Our two best-performing absolute atherosclerotic CVD risk prediction models achieved higher performance than Framingham and QRisk3. Using a subset of 25 risk factors identified through feature selection, our reduced model achieves similar performance while being less complex. Further, it is interpretable, actionable and highly generalizable. The model could be incorporated into clinical practice and could allow continuous personalized predictions with automated intervention suggestions.
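The benchmarking idea described above (comparing combinations of preprocessing and model configuration, then checking whether a reduced feature set performs similarly to the full set) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the scorer is a deliberately simple toy, and all names (`make_data`, `top_k_features`, `sum_score_model`, `standardize`) are hypothetical. Discrimination is measured here with a plain AUC, which is an assumption; the abstract does not state which metric was used.

```python
import random
from itertools import product

def auc(scores, labels):
    """Probability that a random case outscores a random non-case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def column(X, j):
    return [row[j] for row in X]

def identity(train, test):
    """Preprocessing option 1: leave the data unchanged."""
    return train, test

def standardize(train, test):
    """Preprocessing option 2: z-score each column using training statistics only."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    sds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
           for c, m in zip(cols, means)]
    scale = lambda X: [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in X]
    return scale(train), scale(test)

def top_k_features(X, y, k):
    """Toy feature selection: rank features by |univariate AUC - 0.5|, keep the top k."""
    strength = [abs(auc(column(X, j), y) - 0.5) for j in range(len(X[0]))]
    return sorted(range(len(strength)), key=strength.__getitem__, reverse=True)[:k]

def sum_score_model(X_train, y_train, features):
    """Toy scorer: sum of the selected features, each oriented towards higher risk."""
    signs = {j: 1 if auc(column(X_train, j), y_train) >= 0.5 else -1 for j in features}
    return lambda X: [sum(signs[j] * row[j] for j in features) for row in X]

def make_data(n, seed):
    """Synthetic cohort: features 0 and 1 carry signal, features 2-5 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = int(rng.random() < 0.3)
        row = [rng.gauss(1.5 * label, 1.0), 10 * rng.gauss(-1.0 * label, 1.0)]
        row += [rng.gauss(0.0, 1.0) for _ in range(4)]
        X.append(row)
        y.append(label)
    return X, y

X_train, y_train = make_data(300, seed=0)
X_test, y_test = make_data(300, seed=1)

# Benchmark every (preprocessing, number of features) combination on held-out data.
preprocessors = {"identity": identity, "standardize": standardize}
results = {}
for (prep_name, prep), k in product(preprocessors.items(), [2, 6]):
    Xtr, Xte = prep(X_train, X_test)
    features = top_k_features(Xtr, y_train, k)
    model = sum_score_model(Xtr, y_train, features)
    results[(prep_name, k)] = auc(model(Xte), y_test)

best = max(results, key=results.get)
for config, score in sorted(results.items()):
    print(config, round(score, 3))
print("best configuration:", best)
```

Fitting the preprocessing and the feature ranking on the training split only, and scoring on a separate split, keeps the comparison between configurations free of information leakage. A real benchmark would replace the toy scorer with actual learning algorithms and use a time-to-event or calibration-aware metric suited to 10-year risk.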
Competing Interest Statement
All of the authors are or were employees of, contractors for, or hold equity in Ada Health GmbH. AK, AB, OB, HH, MJ, DN, BLS and SG are employees or company directors of Ada Health GmbH and some of the listed authors hold stock options in the company. Ada Health GmbH has received research grant funding from the Bill & Melinda Gates Foundation, Fondation Botnar, the Federal Ministry of Education and Research Germany, the Federal Ministry for Economic Affairs and Energy Germany and the European Union. PW is employed by Wicks Digital Health Ltd, which has received funding from Ada Health, AstraZeneca, Baillie Gifford, Biogen, Bold Health, Camoni, Compass Pathways, Coronna, EIT, Endava, Happify, HealthUnlocked, Inbeeo, Kheiron Medical, Lindus Health, Sano Genetics, Self Care Catalysts, The Learning Corp, The Wellcome Trust, THREAD Research, VeraSci, and Woebot. HH is the topic driver of the AI-based symptom assessment group of the WHO/ITU Focus Group on AI4H (Artificial Intelligence for Health) and SG is a member of the clinical evaluation topic group of the WHO/ITU Focus Group on AI4H. A related patent application is currently pending with the title 'System and method for predicting the risk of a patient to develop an atherosclerotic cardiovascular disease' and application number EP21191089.8.
Funding Statement
This research was funded by Ada Health GmbH and has been conducted using the UK Biobank under application id 34802.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The UK Biobank data is accessible through a request process (http://www.ukbiobank.ac.uk/register-apply/). The authors had no special access to the data that other researchers would not have. All utilized risk factors and outcomes are provided in the supporting information.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
The UK Biobank data is accessible through a request process (http://www.ukbiobank.ac.uk/register-apply/). The authors had no special access to the data that other researchers would not have. All utilized risk factors and outcomes are provided in the supporting information.