ABSTRACT
Background Percutaneous coronary intervention (PCI) is an effective treatment for coronary artery disease. Pre-procedural prediction of the risk of 30-day mortality after PCI aids clinical decision-making and the benchmarking of hospital performance. This study aimed to identify pre-procedural factors that predict the risk of 30-day mortality following PCI using machine learning (ML) approaches.
Methods The study analysed 93,055 consecutive PCI procedures from the Victorian Cardiac Outcomes Registry (VCOR) in Australia to develop a pre-procedural 30-day mortality prediction model. Five ML approaches, Adaptive Booster (AdB), Decision Tree (DT), Gradient Booster (GB), Random Forest (RF), and Extreme Gradient Booster (XGB), were employed, with Logistic Regression (LR) used for comparison. Model performance was evaluated using k-fold cross-validation, with metrics including sensitivity, specificity, accuracy, ROC curve, Brier score, and calibration curve.
Results The study showed that the RF model outperformed other ML models in predicting 30-day mortality, achieving accuracy of 98.4% and a ROC of 94.3%. Utilizing the SHapley Additive exPlanations method, the RF model identified cardiogenic shock, ejection fraction, acute coronary syndrome, estimated GFR, cardiac arrest, age, mechanical ventricular support, complex lesion, lesion location, BMI, sex, and diabetes as the variables that were associated with 30-day mortality post-PCI. In comparison, the traditional LR model exhibited an accuracy of 98.2% and a ROC of 92.9%.
Conclusion A 30-day mortality post-PCI risk prediction model was developed with high accuracy using an ML method. Further validation with external data is needed to ensure the applicability of the model to other populations.
WHAT IS ALREADY KNOWN ON THIS TOPIC
A risk-adjustment model for an Australian PCI patient population was previously developed to predict 30-day mortality using a traditional regression model.
Medical knowledge, patient characteristics, and clinical practices evolve over time, requiring frequent model updates to reflect new evidence, guidelines, and interventions.
WHAT THIS STUDY ADDS
A machine learning (ML)-based preprocedural risk prediction model for 30-day mortality following percutaneous coronary intervention (PCI) was developed.
The ML-based model was compared with the traditional regression model.
HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY
Risk prediction models aid clinical decision-making, enhance patient counselling, improve care quality, inform healthcare policies, and advance research.
Background
Percutaneous coronary intervention (PCI) is one of the most widely performed medical procedures,1 and is highly effective in treating coronary artery disease.2 Whilst PCI is extremely safe, patients remain at risk of mortality.3 In 2020-2021, approximately 48,000 PCIs were performed in Australia, with 75% being performed in males.4 Similar to other cardiac procedures, approximately two-thirds of deaths after PCI occur within the first 30 days following the procedure.5 In Australia, the prevalence of 30-day mortality post-PCI was around 2%.6 Thirty-day mortality is influenced by various factors, including patient demographics and pre-procedural clinical status.7-10 Therefore, to assess the pre-procedural risk of 30-day mortality, it is essential to identify and understand these factors.
A risk-adjusted model for predicting 30-day mortality post-PCI can aid physicians in selecting optimal interventions tailored to individual demographics, lifestyles, patient comorbidities, and clinical presentations. Such models are also essential when comparing outcomes across individual institutions and regions. Thus, this approach may enhance intervention quality by improving risk adjustment and subsequently may reduce post-PCI mortality as well as costs through cost-effective care strategies.11 12 While existing risk-adjusted models developed by clinical quality registries have been valuable, their applicability to contemporary PCI populations may be limited due to variations in study populations, healthcare systems, registry uniformity, risk factors, and methodological approaches.7-10 Additionally, many models in the literature traditionally rely on multivariable logistic and Cox’s regression methods for outcome prediction.7-10
In recent years, there has been a growing interest in the use of machine learning (ML) methods in developing risk prediction models.13 14 Traditional methods (e.g. logistic regression or Cox regression) require more structured data and greater human input for the verification of distributional assumptions and the incorporation of domain knowledge in choosing the input parameters.15 Conversely, ML approaches can deal with high-level unstructured big data from patient databases. They are often able to detect sophisticated data patterns across a multitude of variables, testing numerous interactions and nonlinear relationships with the outcome that traditional statistical methods sometimes struggle to explain.16 Further, these approaches have been successfully applied to predict patient prognoses in many public health areas, such as the risk of readmission after hospital discharge, cancer progression, and diabetic complications.17-19 Current evidence indicates that ML methods outperformed traditional regression models in population-specific mortality studies.20 Although the application of ML has expanded in medicine and health care in Australia, its potential application to 30-day mortality post-PCI has not been extensively explored in a contemporary Australian population. Therefore, the current study aims to identify pre-procedural factors associated with 30-day mortality post-PCI in an Australian population, find the best ML approach, and compare the performance metrics of ML with those of a traditional logistic regression method.
Methods
Study Population
Data used in this study were collected by the Victorian Cardiac Outcomes Registry (VCOR), comprising 93,055 consecutive PCI cases from 33 (15 public and 18 private) participating Victorian hospitals between 1 January 2013 and 31 December 2021. Patient-level demographics, comorbidities, procedural details, and in-hospital and 30-day mortality were captured by the registry, where each PCI was treated as a separate observation. PCI procedures were excluded if they were not the index admission or had missing outcome measures.21
Outcome Variable
The outcome variable for this study was 30-day all-cause mortality post-PCI.
Selection Of Potential Factors Of 30-Day Mortality
An extensive systematic review selected 17 potential factors associated with 30-day mortality post-PCI.22 However, five of these 17 factors (hypertension, single or multivessel disease, heart failure, Thrombolysis in Myocardial Infarction (TIMI) flow, and urgency of PCI) are not collected in the VCOR registry. The following factors were not among the proposed 17 variables but were included in the list of potential factors based on consultation with expert interventional cardiologists: body mass index (BMI), chronic total occlusion (CTO), lesion complexity (ACC/AHA Lesion Classification B2/C), previous coronary artery bypass grafting (CABG), and lesion location. Multicollinearity was checked by assessing the standard error of variables (>5 indicates multicollinearity) using logistic regression (LR) analysis.23 Further, first-order interaction effects between clinically relevant risk factors were also investigated using LR analysis. Operational definitions for all of these factors are presented in supplemental table 1.
Management of missing data
Imputation of missing values may improve model performance. Missing values in this study were imputed using Multiple Imputations by Chained Equations (MICE) with fully conditional specification.24
ML model development
The current literature indicates that among the various ML algorithms, Adaptive Booster (AdB), Decision Tree (DT), Gradient Booster (GB), Random Forest (RF) and Extreme Gradient Booster (XGB) outperform others in predicting short-term mortality post-PCI (Supplementary Table 2).25-28 Each of these ML algorithms was used to select variables related to 30-day mortality post-PCI, and the most relevant variables were selected and ranked using the SHapley Additive exPlanations (SHAP) technique.29 SHAP beeswarm plots are an information-rich display of SHAP values that reveal not just the relative importance of variables but also their relationships with the predicted outcome. They provide a general sense of the direction of each variable's effect based on the distribution of the red and blue dots. Red dots to the right of 0 (positive SHAP values) indicate that higher values of the variable increase the predicted mortality risk.
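To make the idea behind SHAP values concrete, the following is a minimal sketch that computes exact Shapley values for a single patient under a toy risk function. The function `toy_risk`, its coefficients, the feature names, and the reference patient are all hypothetical and chosen only for illustration; they are not the study's RF model, and practical SHAP implementations for tree ensembles use faster algorithms and a background dataset rather than a single reference point.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance of a small model.

    Features not yet "revealed" are held at their baseline value, which is
    one common (assumed) way to define the value of a coalition.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)            # start with every feature at baseline
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # reveal this feature's actual value
            updated = predict(current)
            totals[name] += updated - previous  # marginal contribution
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_risk(x):
    # Hypothetical risk score for illustration only, NOT the study's model.
    risk = 0.02 + 0.30 * x["shock"] + 0.004 * max(0, x["age"] - 65)
    if x["shock"] and x["lvef"] < 30:
        risk += 0.10  # interaction between shock and low ejection fraction
    return risk


patient = {"shock": 1, "age": 80, "lvef": 25}
reference = {"shock": 0, "age": 65, "lvef": 55}
phi = shapley_values(toy_risk, patient, reference)
```

Each value in `phi` is that variable's contribution to the gap between the patient's predicted risk and the reference prediction, and the contributions add up to that gap. A beeswarm plot draws one such value per patient and variable, coloured by the variable's value.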
In all algorithms, hyperparameters were tuned using 10-fold cross-validation, in which the entire dataset was randomly divided into 10 equal parts for 10 iterations. In each iteration, the algorithm was trained on 90% of the data and tested on the remaining 10%, and the performance metrics were aggregated over the 10 iterations to yield an overall performance score. The averaged performance metrics (accuracy, sensitivity, specificity, receiver operating characteristic (ROC) curve, and Brier score) were then reported for each algorithm based on the 10% testing data. For each algorithm, the change in performance, based on accuracy and ROC, was assessed with sequential backward exclusion of the least weighted variables. The final set of variables for each algorithm was the smallest set retained before a significant drop in these metrics occurred during sequential elimination. Finally, the best ML model was the one that provided better performance and parsimony than the other models (supplemental table 3). A schematic presentation of machine learning model development is presented in figure 1.
Schematic presentation of machine learning model development
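The 10-fold procedure described above can be sketched as follows. This is an illustrative outline only: the function names, the fixed random seed, and the `score_fold` callback are assumptions introduced here, and the source does not state whether the folds were stratified by outcome (this sketch is not).

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k roughly equal random folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # random assignment of rows to folds
    folds = [indices[i::k] for i in range(k)]   # fold sizes differ by at most one
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


def cross_validated_mean(score_fold, n_samples, k=10, seed=0):
    """Average a per-fold metric over all k iterations.

    `score_fold(train_idx, test_idx)` is a hypothetical callback that fits a
    model on the training rows and returns one metric on the held-out rows.
    """
    scores = [score_fold(train, test)
              for train, test in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)
```

With 93,055 procedures, each iteration holds out about 9,305 rows for testing and trains on the remaining 90%; every row is used for testing exactly once.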
Furthermore, the best selected ML method was then employed for each of the STEMI, octogenarian and obese patient cohorts to identify the variables associated with 30-day mortality post-PCI.
Logistic Regression model development
A multivariable logistic regression with backward elimination was used to select the significant (p<0.05) variables from the list of 17 variables previously discussed.22 As outlined above, the entire data set was used for the selection of factors in the model, and the model was then validated using 10-fold cross-validation. The model was then evaluated using the same set of performance metrics.
Model performance metrics
The prediction performance of the ML and traditional LR models was assessed and compared using the metrics listed above (accuracy, sensitivity, specificity, receiver operating characteristic (ROC) curve, and Brier score). Sensitivity, specificity, and accuracy were computed from the confusion matrix, considering true positives, true negatives, false positives, and false negatives (supplemental table 2).
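As a small worked illustration of the confusion-matrix metrics, the sketch below uses the standard definitions of sensitivity, specificity, and accuracy. The counts in the example are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),               # deaths correctly predicted
        "specificity": tn / (tn + fp),               # survivors correctly predicted
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical counts for 10,000 procedures with 200 deaths (illustration only).
example = classification_metrics(tp=120, tn=9700, fp=100, fn=80)
```

Because deaths are rare, accuracy is driven mostly by the survivors, so sensitivity is the more informative figure for how many deaths a model identifies.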
Model calibration was evaluated by examining the Brier score, which falls within the range of 0.0 to 1.0. A Brier score of 0 signifies perfect accuracy, while a Brier score of 1 indicates perfect inaccuracy. A calibration curve was also created to evaluate the calibration performance of the model.
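The Brier score is the mean squared difference between the predicted probability and the observed binary outcome. A minimal sketch follows; the function name and the example values are illustrative only.

```python
def brier_score(outcomes, probabilities):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if not outcomes or len(outcomes) != len(probabilities):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for y, p in zip(outcomes, probabilities)]
    return sum(squared_errors) / len(squared_errors)


perfect = brier_score([0, 1], [0.0, 1.0])   # every prediction exactly right
worst = brier_score([0, 1], [1.0, 0.0])     # every prediction exactly wrong
typical = brier_score([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
```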
Health services performance
The performance of individual hospitals regarding patients’ adjusted risk of 30-day mortality is depicted in funnel plots. Funnel plots were utilized as a visual tool for comparing health services by plotting estimates of risk-adjusted 30-day mortality rates against the total number of procedures performed.30 These plots are a valuable addition to performance monitoring systems. To generate the funnel plot, adjusted risk was computed for each patient using regression coefficients derived from LR analysis. Health services’ performance in predicting 30-day mortality was assessed and compared between current (LR and ML) and existing VCOR models.
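One common way to draw funnel-plot control limits is a normal approximation to the binomial distribution around the overall mortality rate. The source does not state how its limits were computed, so the sketch below is an assumption for illustration; the function name and the example figures are hypothetical.

```python
import math
from statistics import NormalDist


def funnel_limits(overall_rate, n_procedures, coverage=0.95):
    """Approximate control limits for a hospital performing n_procedures.

    Uses a normal approximation to the binomial (an assumption; exact
    binomial limits are an alternative for small volumes).
    """
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    half_width = z * math.sqrt(overall_rate * (1 - overall_rate) / n_procedures)
    lower = max(0.0, overall_rate - half_width)
    upper = min(1.0, overall_rate + half_width)
    return lower, upper


# Illustrative values: overall 30-day mortality of 2.1%, 2,000 procedures.
lower_95, upper_95 = funnel_limits(0.021, 2000, coverage=0.95)
lower_998, upper_998 = funnel_limits(0.021, 2000, coverage=0.998)
```

A hospital whose risk-adjusted rate falls outside the 95% limits for its volume would be flagged for review. The limits narrow as the number of procedures grows, which gives the plot its funnel shape.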
Data analysis package: All data analyses were undertaken using Stata (version 17), RStudio (version 2024.03.0) and Python (version 3.12.2) statistical software packages.
Results
Baseline patient characteristics
An overview of the baseline characteristics of the 93,055 participants is presented in Supplementary Table 4. The average age was 66.5 (±11.9) years and 76% were male with one-third of patients having had a prior PCI. Around 50.5% of patients presented with acute coronary syndrome (ACS) and 22.7% had a prior history of diabetes. Prior cerebrovascular event was present in 3.6% and peripheral vascular disease prevalence was 3.5%. The rate of intubated out-of-hospital cardiac arrest (OHCA) was 1.1%, 2.2% presented with cardiogenic shock, and 2.6% of patients had severe renal impairment (estimated glomerular filtration rate (eGFR) <30 mL/min/1.73 m2).
The overall 30-day all-cause mortality was 2.1%. Patients presenting with cardiac arrest had the highest mortality rate (47.5%), followed by those with cardiogenic shock (44.1%). Further, patients with severely reduced left ventricular ejection fraction (LVEF) (<30%) had a mortality rate of 15.2%, while those with severe loss of kidney function (eGFR <30 mL/min/1.73 m2) had a mortality rate of 10.5%. The mortality rate in elderly patients (80 years and over) was 4.3%, while those with STEMI had a mortality rate of 6.8% (table 1).
ML model comparison with LR model
The RF algorithm outperformed other ML algorithms (supplemental table 3). The top 12 variables (based on their impact on model performance) included in the RF model were cardiogenic shock, LVEF, ACS, eGFR, intubated OHCA, age, mechanical ventricular support, complex lesion, lesion location, BMI, sex, and diabetes (figure 2 and Supplementary Table 3). In contrast, the LR model identified cardiogenic shock, intubated OHCA, mechanical ventricular support, LVEF, age, eGFR, ACS, peripheral vascular disease (PVD), cerebrovascular disease (CVD), and complex lesion (figure 3). For the RF model, the ROC was 94.3% and accuracy was 98.4%. In contrast, the LR model had a ROC of 92.3% and accuracy of 98.2% (Figure 4).
Beeswarm SHapley Additive exPlanations (SHAP) plot presenting factors related to 30-day mortality post-PCI
Factors identified by the Logistic Regression model
Discrimination (left graph) and calibration (right graph) of the Random Forest and Logistic Regression models
The RF model had higher sensitivity than the LR model (60.1% vs 49.7%), with marginally lower specificity (98.6% vs 99%) (Supplementary figure 1). The Brier score and the calibration curve were used to assess the calibration performance of the models. The Brier score of the RF model (1.4%) demonstrated better calibration than that of the LR model (1.8%) (figure 4 and supplemental figure 1).
Moreover, when validating the previous VCOR risk adjustment model, the ROC was 83.8%, indicating a significant difference compared with the ROC of the RF model (94.3%) and the traditional LR model (92.3%) (supplemental figure 2).
In the sensitivity analysis, both the RF and LR models were trained using 70% of the entire dataset, while the remaining 30% served as the validation (test) set. Hyperparameter tuning was performed for each model using a 10-fold cross-validation approach. The trained models were then assessed on the validation dataset, and their performance was compared using metrics such as accuracy, sensitivity, specificity, precision, recall, ROC, and Brier score. Both models showed similar performance across these metrics (supplemental table 5).
Interpretability of variables in the RF model
The SHAP plot in figure 2 provides insights into the influence and direction of various variables on predicting 30-day mortality post-PCI, as depicted by the distribution of red and blue dots. The variables are ranked in order of importance, with those at the top having the greatest impact on the predictions. Cardiogenic shock is identified as the most influential predictor, followed by LVEF and ACS. The horizontal axis displays SHAP values, which represent the contribution of each variable to the model's output. Negative SHAP values are associated with a lower likelihood of mortality, while positive values indicate a higher risk. The dots' colours reflect the variable's value for individual instances: blue indicates lower values, while red indicates higher values. The vertical bar on the right side provides the coding for categorical variables or the scale for numerical ones. For example, cardiogenic shock is coded as 0 for normal and 1 for cardiogenic shock. Thus, blue dots represent normal values (code 0), while red dots correspond to cardiogenic shock (code 1). The plot shows that cardiogenic shock is the most critical factor influencing 30-day mortality, as patients with this condition (red dots) exhibit positive SHAP values, indicating an increased mortality risk. Other significant predictors of 30-day mortality include severely reduced LVEF, severe kidney impairment, OHCA, older age, the need for mechanical ventricular support, the presence of left main or graft lesions, female sex, obesity, and diabetes.
Health service performance
In the funnel plot, four out of 33 health services were outside the 95% limit when utilizing the current RF model, while five health services were outside that limit for the LR model (Figure 4). When the previous VCOR model was used, nine health services were identified as lying outside the 95% confidence limit, warranting further scrutiny. None of the health services exceeded the 99.8% limit, indicating consistent and reliable assessment of health service performance across all models (figure 5).
Funnel plots to assess institutional performance
Results of subgroup analyses
In patients with STEMI, the RF model identified cardiogenic shock, LVEF, eGFR, intubated OHCA, and age as the top five risk factors associated with 30-day mortality post-PCI. In a separate analysis of octogenarian patients, the RF model identified ACS, LVEF, cardiogenic shock, eGFR, and intubated OHCA as the top five factors influencing 30-day mortality. In obese patients, the RF model identified cardiogenic shock, LVEF, intubated OHCA, ACS, and eGFR as the top five influential factors of 30-day mortality post-PCI (figure 6). The performance of the RF algorithm in predicting 30-day mortality post-PCI for the STEMI, octogenarian and obese subgroups is presented in table 2 and supplemental figure 3.
Beeswarm plots presenting top 10 factors associated with 30-day mortality for ST-elevation myocardial infarction (STEMI), octogenarian and obese patients
Discussion
This study observed an all-cause 30-day mortality post-PCI rate of 2.1%, underscoring the critical need for accurate risk assessment to guide clinical decisions and performance evaluations. This study primarily aimed to develop an optimized risk-adjustment model for predicting 30-day mortality post-PCI using ML algorithms. Additionally, it sought to compare the predictive efficacy of these ML models with both a traditional LR model developed in this study and an existing risk-adjusted model based on the VCOR dataset. In alignment with these objectives, this study identified the RF model as a better predictor of 30-day mortality, exhibiting better overall performance metrics (accuracy, discrimination (ROC), sensitivity, specificity, and Brier score). Moreover, ML methods demonstrated robust predictive capabilities in mortality risk assessment when contrasted with the traditional LR model. Further, the promising performance of the ML model in predicting 30-day mortality following PCI indicates its potential for prospective use in risk adjustment and in enhancing hospital performance evaluation.
In this study, five well-established ML algorithms (AdB, DT, GB, RF, and XGB) were employed to identify pre-procedural factors associated with 30-day mortality post-PCI. Among the algorithms assessed, the RF model performed slightly better than the other ML models. The performance of the RF model also surpassed that of the traditional LR model. Further, we observed that the RF model correctly predicted approximately three out of five deaths, compared with the traditional LR model, where just half of the deaths were correctly predicted. Among other current findings, in a Taiwanese study, the DT model exhibited excellent predictive capabilities for 30-day mortality post-PCI.27 Meanwhile, research in the United States favoured the GB-based classifier as the optimal predictive algorithm for various cardiac interventions.25 26 Consistent with the current study, an Italian population study identified the RF model as superior in predicting 30-day mortality post-PCI, achieving a ROC score of 81.6% during external validation.28 These findings suggest that ML-based approaches, particularly the RF model, hold significant promise in developing predictive models for 30-day mortality post-PCI.
The RF model emphasizes the pivotal role of cardiogenic shock as a primary factor significantly associated with 30-day mortality post-PCI, a finding corroborated by the traditional LR model. This aligns with previous research across diverse studies, underscoring cardiogenic shock as a critical factor linked to 30-day post-PCI mortality and affirming the robust performance of the model.31 32 In addition to cardiogenic shock, the RF model identified several other key factors, including LVEF, ACS, eGFR, intubated OHCA, age, mechanical ventricular support, complex lesion, lesion location, BMI, sex, and diabetes as the top factors associated with 30-day all-cause mortality post-PCI. A previous study in Taiwan, utilizing the DT model, determined hyperlipidemia, hypertension, diabetes, heart failure, stroke, and chronic kidney disease as factors linked to 30-day mortality post-PCI.27 It is noteworthy that studies utilizing ML methods to identify factors associated with 30-day mortality post-PCI have been limited in the context of the Australian population. However, an Australian study employing traditional LR modelling revealed nearly identical factors to those identified by the ML-based RF model.6 Moreover, consistent findings across various studies using traditional LR models highlighted the significance of factors such as age, BMI, LVEF, eGFR, cardiac arrest, ACS, mechanical ventricular support, PVD, CVD, and complex lesions as significant predictors of 30-day mortality post-PCI.6 10
Both the RF and the traditional LR model identified common risk factors for 30-day all-cause mortality, including cardiogenic shock, LVEF, eGFR, intubated OHCA, ACS, age, mechanical ventricular support, PVD, and complex lesions. The slight variation in factors’ selection between the RF model and the traditional LR model may contribute to a significant difference in the models’ discrimination. Furthermore, most of the variables selected by the RF model exhibited higher model performance in different studies using traditional LR models, underscoring the importance of the chosen set of pre-procedural variables for model development.6 32 33
Risk-adjusted 30-day mortality prediction is disseminated to participating hospitals in the VCOR network, allowing for health service performance benchmarking against other health services via risk-adjusted funnel plots. Both the ML model and the current LR model displayed slightly enhanced performance compared to the previous risk-adjusted model in assessing health service performance. Additionally, the RF model’s predictive ability in evaluating 30-day all-cause mortality post-PCI demonstrated encouraging results in assessing hospital performance. The ML model’s identification of critical risk factors for 30-day mortality post-PCI suggests the possibility of integrating or substituting traditional LR models within the VCOR risk-adjustment process and public reporting mechanisms. It may be beneficial to repeat these analyses when data over a longer period of time is available to reassess the presence of consistent differences in prediction.
Strengths and limitations
The current study has several strengths and limitations. Its key strength is the inclusion of a large volume of data, which substantially contributed to the accuracy of predicting 30-day all-cause mortality post-PCI. The proposed set of pre-procedural variables holds promise for boosting the model's overall performance. However, the study is not without its limitations. Firstly, the data source encompassed patients from a specific geographic region in Australia, constraining the generalizability of the findings and necessitating validation in diverse populations. Secondly, the study only explored a subset of ML approaches, leaving unaddressed the performance of methods that were not evaluated herein. Thirdly, unlike traditional LR models, ML algorithms do not produce p-values (significance levels) or beta-coefficients when selecting influential variables. Finally, not all 17 factors proposed in the earlier study were available,25 so there may be missed opportunities for improving overall performance metrics.
Conclusion
In this study, a robust model for 30-day all-cause mortality has been developed for all PCI procedures using an ML approach. The ML-based RF model significantly enhanced the accuracy of predicting the risk of mortality within 30 days post-PCI and identified the top 12 influential factors: cardiogenic shock, LVEF, ACS, eGFR, intubated OHCA, age, mechanical ventricular support, complex lesion, lesion location, BMI, sex, and diabetes. Notably, the ML approach outperformed the LR method when predicting 30-day mortality post-PCI. This ML-based risk-adjusted model has the potential to assist clinicians in the early identification of patients at risk and to be useful for benchmarking institutional performance. However, further validation utilising external data is needed to ensure the applicability of these findings in clinical practice.
Data availability statement
Data are available upon reasonable request. Anonymized personal data were obtained from the Victorian Cardiac Outcomes Registry (VCOR) after ethical approval and a confidentiality assessment. In accordance with Australian laws and regulations, access to personal sensitive data is restricted to researchers who meet the legal requirements for such access. For inquiries regarding data access, please contact Dr. Diem Dinh.
Contributors
Conceptualisation: MRKC, BB, DD, DS. Methodology: BB, MRKC, MNK. Analysis: MRKC, MAM, MSI. Manuscript drafting: MY. Manuscript review and critical revision: BB, DS, DD, AB, CMR, SN, JL, DPC. Visualisation: MRKC, MSI. Supervision: BB, DS, DD, MNK. Project administration: DS, DD. MRKC and BB are the guarantors.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Competing interests
None declared.
Patient and public involvement
This research did not involve patients or the public in its design, conduct, reporting, or dissemination.
Patient consent for publication
Not applicable.
Ethics approval
The primary ethics approval was granted by the ethics committee at The Alfred Hospital (approval number 47/12), and also approved by each participating hospital, including the use of opt-out consent.
Abbreviations
- PCI: Percutaneous coronary intervention
- AdB: Adaptive Booster
- DT: Decision Tree
- GB: Gradient Booster
- RF: Random Forest
- XGB: Extreme Gradient Booster
- LR: Logistic Regression
- eGFR: Estimated glomerular filtration rate
- ACS: Acute coronary syndrome
- OHCA: Out-of-hospital cardiac arrest
- PVD: Peripheral vascular disease
- CVD: Cerebrovascular disease
- CTO: Chronic total occlusion
- CABG: Coronary artery bypass grafting
- ROC: Receiver operating characteristic
- STEMI: ST-elevation myocardial infarction
- SHAP: SHapley Additive exPlanations
- BMI: Body mass index
- ML: Machine learning
- VCOR: Victorian Cardiac Outcomes Registry
- MICE: Multiple Imputations by Chained Equations
- TIMI: Thrombolysis in Myocardial Infarction