Abstract
Background Guidelines for the prevention of cardiovascular disease (CVD) recommend assessing total CVD risk with risk scores. Current risk algorithms have low sensitivity and specificity and do not incorporate emerging risk markers for CVD. We suggest that CVD risk assessment can still be improved. We have developed a long-term risk prediction model of cardiovascular mortality in patients with stable coronary artery disease (CAD) based on recently available machine learning techniques and an extended dataset of new biomarkers.
Methods We included 2953 participants of the Ludwigshafen Risk and Cardiovascular Health (LURIC) study. 184 laboratory and 21 demographic markers were ranked according to their contribution to the risk of cardiovascular (CV) mortality using different data mining approaches. A self-learning bioinformatics workflow comprising seven different machine learning algorithms was developed for CV risk prediction. The study population was stratified into patients with and without significant CAD. Significant CAD was defined as lumen narrowing of 50% or more in at least one coronary segment or a history of definite myocardial infarction. The machine learning models in both subpopulations were compared with established CV risk assessment tools.
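The stratification and marker-ranking steps described in the Methods can be illustrated with a minimal sketch. It is not the authors' workflow: the `Patient` record, its field names, and the toy values are hypothetical, and the ranking uses a simple univariate AUC as a stand-in for the unspecified "data mining approaches". Only the CAD definition (narrowing of 50% or more, or a history of definite myocardial infarction) is taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class Patient:
    # Hypothetical record layout; field names are illustrative assumptions.
    max_stenosis_pct: float    # maximal luminal narrowing in any coronary segment (%)
    history_of_mi: bool        # history of definite myocardial infarction
    cv_death: bool             # died of cardiovascular causes during follow-up
    markers: Dict[str, float]  # marker name -> measured value


def has_significant_cad(p: Patient) -> bool:
    """CAD definition from the Methods: >= 50% narrowing or definite prior MI."""
    return p.max_stenosis_pct >= 50.0 or p.history_of_mi


def auc(cases: Sequence[float], controls: Sequence[float]) -> float:
    """Probability that a random case has a higher value than a random
    control, counting ties as one half (the Mann-Whitney estimate of AUC)."""
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in cases
        for b in controls
    )
    return wins / (len(cases) * len(controls))


def rank_markers_by_auc(patients: Sequence[Patient]) -> List[Tuple[str, float]]:
    """Rank markers by univariate discrimination of CV death.

    Assumption: this swaps in a plain univariate AUC for the ranking
    methods used in the study, which the abstract does not specify.
    """
    scores = []
    for name in sorted(patients[0].markers):
        cases = [p.markers[name] for p in patients if p.cv_death]
        controls = [p.markers[name] for p in patients if not p.cv_death]
        a = auc(cases, controls)
        # A marker that is lower in cases is as informative as one that is higher.
        scores.append((name, max(a, 1.0 - a)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Synthetic toy data, not study data.
toy = [
    Patient(70.0, False, True, {"marker_a": 5.0, "marker_b": 1.0}),
    Patient(20.0, True, False, {"marker_a": 1.0, "marker_b": 2.0}),
    Patient(10.0, False, True, {"marker_a": 4.0, "marker_b": 1.5}),
    Patient(49.9, False, False, {"marker_a": 2.0, "marker_b": 1.2}),
]
with_cad = [p for p in toy if has_significant_cad(p)]
without_cad = [p for p in toy if not has_significant_cad(p)]
ranking = rank_markers_by_auc(toy)
```

In a fuller pipeline, the ranking would be computed separately within `with_cad` and `without_cad`, and the top-ranked markers would then be passed to the classifiers.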
Results After a follow-up of 10 years, 603 (20.4%) patients had died of cardiovascular causes. 95 (%) patients without CAD died within ten years and 247 (13.2%) patients with CAD within 5 years. Overall and in patients without CAD, NT-proBNP (N-terminal pro B-type natriuretic peptide), TnT (troponin T), estimated cystatin C-based GFR (glomerular filtration rate) and age were the highest-ranked predictors, while in patients with CAD, NT-proBNP, GFR, CT-proAVP (C-terminal pro arginine vasopressin) and TnT were the most predictive. Compared with the FRS, PROCAM and ESC risk scores, the machine learning workflow produced more accurate and robust CV mortality prediction in patients without CAD. In the CAD subpopulation, CV risk prediction was equivalent to that of the Marschner risk score. Overall, the existing algorithms tend to assign more patients to the medium-risk groups, whereas the machine learning algorithms tend to produce a clearer risk/no-risk assignment. The framework is available upon request.
Conclusion We have developed a fully automated and self-validating computational framework of machine learning techniques using an extensive database of clinical data and routinely and non-routinely measured laboratory data. Our framework predicts long-term CV mortality at least as accurately as existing CVD risk scores. A combination of four highly ranked biomarkers and the random forest approach showed the best predictive results. Moreover, a dynamic computational model has several advantages over static CVD risk prediction tools: it is freeware, transparent, variable, transferable and expandable to any population, type of event and time frame.
Competing Interest Statement
Dr Maerz reports employment with Synlab Holding Deutschland GmbH during the conduct of the study; grants from Abbott Diagnostics, grants and personal fees from Aegerion Pharmaceuticals, grants and personal fees from AMGEN, grants and personal fees from AstraZeneca, grants and personal fees from BASF, grants and personal fees from Danone Research, personal fees from MSD, grants and personal fees from Sanofi, grants and personal fees from Siemens Diagnostics, personal fees from Synageva, all outside the submitted work. Dr Kraemer reports receiving grants and/or personal fees from Alexion, Astellas, Astra-Zeneca, Boehringer Ingelheim, Chiesi, Bayer, Pfizer, all outside the submitted work.
Funding Statement
7th Framework Program of the European Union, integrated projects Atheroremo [grant Agreement number 201668], and RiskyCAD [grant agreement number 305739]; e:AtheroSysMed (Systems medicine of coronary heart disease and stroke, German Ministry of Education and Research [grant number 01ZX1313A-K]).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
We studied participants of the Ludwigshafen Risk and Cardiovascular Health (LURIC) study. The study protocol has been published (13). In brief, 3316 participants of German ancestry were enrolled between July 1997 and January 2000. Only patients with a coronary angiogram were included. Coronary artery disease (CAD) was assessed by coronary angiography based on maximal luminal narrowing upon visual assessment. All participants were followed over a median observation period of 9.9 years. Written informed consent was obtained from each participant prior to inclusion. The study was conducted in accordance with the Declaration of Helsinki and approved by the ethics committee at the Medical Association of Rheinland-Pfalz (Aerztekammer Rheinland-Pfalz).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
All data produced in the present study are available upon reasonable request to the authors.