medRxiv

Machine learning approaches to predict 30-day mortality following percutaneous coronary intervention in an Australian population

Mohammad Rocky Khan Chowdhury, Dion Stub, Md Nazmul Karim, Angela Brennan, Christopher M. Reid, Shane Nanayakkara, Jeffrey Lefkovits, Mohammad Ali Moni, Md Shofiqul Islam, Derek P. Chew, Diem Dinh, Baki Billah
doi: https://doi.org/10.1101/2025.03.01.25323134
Mohammad Rocky Khan Chowdhury
1School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
Dion Stub
2Department of Cardiology, Alfred Hospital, Melbourne, VIC, Australia
3School of Population Health, Curtin University, Perth, WA, Australia
Md Nazmul Karim
1School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
Angela Brennan
1School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
Christopher M. Reid
3School of Population Health, Curtin University, Perth, WA, Australia
Shane Nanayakkara
2Department of Cardiology, Alfred Hospital, Melbourne, VIC, Australia
4Monash-Alfred-Baker Centre for Cardiovascular Research, Monash University, Melbourne, VIC, Australia
Jeffrey Lefkovits
5Department of Cardiology, Royal Melbourne Hospital, Melbourne, VIC, Australia
Mohammad Ali Moni
6Artificial Intelligence and Digital Health, School of Health and Rehabilitation Sciences, Faculty of Health and Behavioral Sciences, The University of Queensland, St Lucia, QLD, Australia
Md Shofiqul Islam
7Institute for Intelligent Systems Research and Innovation, Deakin University, Geelong, VIC, Australia
Derek P. Chew
8Cardiac Informatics Research, Victorian Heart Institute, Monash University, VIC, Australia
Diem Dinh
1School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
Baki Billah
1School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
  • For correspondence: baki.billah{at}monash.edu

ABSTRACT

Background Percutaneous coronary intervention (PCI) is an effective treatment for coronary artery disease. Pre-procedural prediction of the risk of 30-day mortality after PCI aids clinical decision-making and the benchmarking of hospital performance. This study aimed to identify pre-procedural factors that predict the risk of 30-day mortality following PCI using machine learning (ML) approaches.

Methods The study analysed 93,055 consecutive PCI procedures from the Victorian Cardiac Outcomes Registry (VCOR) in Australia to develop a pre-procedural 30-day mortality prediction model. Five ML approaches, namely Adaptive Booster (AdB), Decision Tree (DT), Gradient Booster (GB), Random Forest (RF), and Extreme Gradient Booster (XGB), were employed, with Logistic Regression (LR) used for comparison. Model performance was evaluated using k-fold cross-validation, with metrics including sensitivity, specificity, accuracy, ROC curve, Brier score, and calibration curve.

Results The study showed that the RF model outperformed other ML models in predicting 30-day mortality, achieving accuracy of 98.4% and a ROC of 94.3%. Utilizing the SHapley Additive exPlanations method, the RF model identified cardiogenic shock, ejection fraction, acute coronary syndrome, estimated GFR, cardiac arrest, age, mechanical ventricular support, complex lesion, lesion location, BMI, sex, and diabetes as the variables that were associated with 30-day mortality post-PCI. In comparison, the traditional LR model exhibited an accuracy of 98.2% and a ROC of 92.9%.

Conclusion A 30-day mortality post-PCI risk prediction model was developed with high accuracy using an ML method. Further validation with external data is needed to ensure the applicability of the model to other populations.

WHAT IS ALREADY KNOWN ON THIS TOPIC

  • A risk-adjustment model for an Australian PCI patient population was previously developed to predict 30-day mortality using a traditional regression model.

  • Medical knowledge, patient characteristics, and clinical practices evolve over time, requiring frequent model updates to reflect new evidence, guidelines, and interventions.

WHAT THIS STUDY ADDS

  • A machine learning (ML)-based preprocedural risk prediction model for 30-day mortality following percutaneous coronary intervention (PCI) was developed.

  • The ML-based model was compared with the traditional regression model.

HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY

  • Risk prediction models aid clinical decision-making, enhance patient counselling, improve care quality, inform healthcare policies, and advance research.

Background

Percutaneous coronary intervention (PCI) is one of the most widely performed medical procedures,1 and is highly effective in treating coronary artery disease.2 Whilst PCI is extremely safe, patients remain at risk of mortality.3 In 2020-2021, approximately 48,000 PCIs were performed in Australia, with 75% being performed in males.4 As with other cardiac procedures, approximately two-thirds of deaths after PCI occur within the first 30 days following the procedure.5 In Australia, the prevalence of 30-day mortality post-PCI was around 2%.6 Thirty-day mortality is influenced by various factors, including patient demographics and pre-procedural clinical status.7-10 Therefore, to assess the pre-procedural risk of 30-day mortality, it is essential to identify and understand these factors.

A risk-adjusted model for predicting 30-day mortality post-PCI can aid physicians in selecting optimal interventions tailored to individual demographics, lifestyles, patient comorbidities, and clinical presentations. Such models are also essential when comparing outcomes across individual institutions and regions. Thus, this approach may enhance intervention quality by improving risk adjustment and subsequently may reduce post-PCI mortality as well as costs through cost-effective care strategies.11 12 While existing risk-adjusted models developed by clinical quality registries have been valuable, their applicability to contemporary PCI populations may be limited due to variations in study populations, healthcare systems, registry uniformity, risk factors, and methodological approaches.7-10 Additionally, many models in the literature traditionally rely on multivariable logistic and Cox’s regression methods for outcome prediction.7-10

In recent years, there has been growing interest in the use of machine learning (ML) methods for developing risk prediction models.13 14 Traditional methods (e.g. logistic regression or Cox regression) require more structured data and greater human input to verify distributional assumptions and to incorporate domain knowledge when choosing the input parameters.15 Conversely, ML approaches can deal with high-level unstructured big data from patient databases. They are often able to detect sophisticated patterns across a multitude of variables, testing numerous interactions and nonlinear relationships with the outcome that traditional statistical methods sometimes struggle to explain.16 Further, these approaches have been successfully applied to predict patient prognoses in many public health areas, such as the risk of readmission after hospital discharge, cancer progression, and diabetic complications.17-19 Current evidence indicates that ML methods outperform traditional regression models in population-specific mortality studies.20 Although the application of ML in medicine and health care has expanded in Australia, its potential application to 30-day mortality post-PCI has not been extensively explored in a contemporary Australian population. Therefore, this study aims to identify pre-procedural factors associated with 30-day mortality post-PCI for an Australian population, find the best ML approach, and compare its performance metrics with those of a traditional logistic regression method.

Methods

Study Population

Data used in this study were collected by the Victorian Cardiac Outcomes Registry (VCOR) and comprised 93,055 consecutive PCI cases from 33 (15 public and 18 private) participating Victorian hospitals between 1 January 2013 and 31 December 2021. Patient-level demographics, comorbidities, procedural details, and in-hospital and 30-day mortality were captured by the registry, where each PCI was treated as a separate observation. PCI procedures were excluded if they were not the index admission or had missing outcome measures.21

Outcome Variable

The outcome variable for this study was 30-day all-cause mortality post-PCI.

Selection Of Potential Factors Of 30-Day Mortality

An extensive systematic review selected 17 potential factors associated with 30-day mortality post-PCI.22 However, five of these 17 factors (hypertension, single or multivessel disease, heart failure, Thrombolysis in Myocardial Infarction (TIMI) flow and urgency of PCI) are not collected in the VCOR registry. The following factors were not among the proposed 17 variables but were included in the list of potential factors based on consultation with expert interventional cardiologists: body mass index (BMI), chronic total occlusion (CTO), lesion complexity (ACC/AHA Lesion Classification B2/C), previous coronary artery bypass grafting (CABG) and lesion location. Multicollinearity was checked by assessing the standard errors of variables (a value >5 indicates multicollinearity) using logistic regression (LR) analysis.23 Further, first-order interaction effects between clinically relevant risk factors were also investigated using LR analysis. Operational definitions for all of these factors are presented in supplemental table 1.

Management of missing data

Imputation of missing values may improve model performance. Missing values in this study were imputed using Multiple Imputations by Chained Equations (MICE) with fully conditional specification.24

ML model development

The current literature indicates that, among the various ML algorithms, Adaptive Booster (AdB), Decision Tree (DT), Gradient Booster (GB), Random Forest (RF) and Extreme Gradient Booster (XGB) outperform others in predicting short-term mortality post-PCI (Supplementary Table 2).25-28 Each of these ML algorithms was used to select variables related to 30-day mortality post-PCI, with the most relevant variables selected and ranked using the SHapley Additive exPlanations (SHAP) technique.29 SHAP beeswarm plots are an information-rich display of SHAP values that reveal not just the relative importance of variables but also their relationships with the predicted outcome. They give a general sense of the direction of each variable’s impact based on the distribution of the red and blue dots. Red dots lying to the right of 0 indicate that higher values of the variable are associated with higher predicted mortality.
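To make the idea behind the SHAP ranking concrete, the sketch below computes exact Shapley values for a toy three-variable risk score by enumerating every coalition of variables. The scoring function `toy_risk`, its weights, and the example patient are invented for illustration and are not taken from the study's models; an analysis of this size would more plausibly rely on a dedicated SHAP library than on brute-force enumeration.

```python
from itertools import combinations
from math import factorial

def toy_risk(x):
    # Hypothetical risk score with one interaction term (illustrative weights only).
    score = 0.30 * x["shock"] + 0.10 * x["low_lvef"] + 0.05 * x["diabetes"]
    score += 0.20 * x["shock"] * x["low_lvef"]
    return score

def shapley_values(model, instance, baseline):
    """Exact Shapley values; variables outside a coalition take their baseline value."""
    names = list(instance)
    n = len(names)
    phi = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for size in range(n):
            for coalition in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                without = {f: (instance[f] if f in coalition else baseline[f]) for f in names}
                with_f = dict(without)
                with_f[name] = instance[name]
                # Marginal contribution of `name` when added to this coalition.
                total += weight * (model(with_f) - model(without))
        phi[name] = total
    return phi

patient = {"shock": 1, "low_lvef": 1, "diabetes": 0}
baseline = {"shock": 0, "low_lvef": 0, "diabetes": 0}
phi = shapley_values(toy_risk, patient, baseline)
# Rank variables by the absolute size of their contribution, as a beeswarm plot does.
ranking = sorted(phi, key=lambda f: abs(phi[f]), reverse=True)
```

The contributions sum to the difference between the patient's score and the baseline score, and the interaction term is shared equally between the two variables involved.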

In all algorithms, hyperparameters were tuned using 10-fold cross-validation, in which the entire dataset was randomly divided into 10 equal parts for 10 iterations. In each iteration, the algorithm was trained on 90% of the data and tested on the remaining 10%, and the performance metrics were aggregated over the 10 iterations to yield an overall performance score. The averaged performance metrics (accuracy, sensitivity, specificity, receiver operating characteristic (ROC) curve, and Brier score) were then reported for each algorithm based on the 10% testing data. For each algorithm, the change in performance, based on accuracy and ROC, was assessed with sequential backward exclusion of the least weighted variables. The final set of variables for each algorithm was the one retained before a significant drop in these metrics occurred during sequential elimination. Finally, the best ML model was the one that provided the best performance and parsimony compared with the other models (supplemental table 3). A schematic presentation of machine learning model development is shown in figure 1.
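The cross-validation loop described above can be sketched in a few lines of standard-library Python. The helper names (`k_fold_indices`, `cross_validate`), the toy outcome vector, and the placeholder majority-class "model" are all assumptions made for illustration; they stand in for the tuned algorithms and the registry data, which are not reproduced here.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices once and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(n_samples, fit, score, k=10, seed=0):
    """Train on k-1 folds, score on the held-out fold, and average the k scores."""
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        model = fit(train_idx)
        scores.append(score(model, test_idx))
    return sum(scores) / k

# Toy outcome: 1 for 2% of records (illustrative, not registry data).
outcomes = [1 if i % 50 == 0 else 0 for i in range(1000)]

def fit_majority(train_idx):
    # Placeholder "model": always predicts the majority class seen in training.
    positives = sum(outcomes[j] for j in train_idx)
    return 1 if positives * 2 > len(train_idx) else 0

def accuracy(model, test_idx):
    return sum(outcomes[j] == model for j in test_idx) / len(test_idx)

folds = k_fold_indices(len(outcomes))
mean_accuracy = cross_validate(len(outcomes), fit_majority, accuracy)
```

On this toy data the placeholder classifier, which never predicts a death, still averages about 98% accuracy, which illustrates why accuracy is usually read alongside sensitivity and discrimination when the outcome is rare.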

Figure 1 Schematic presentation of machine learning model development

Furthermore, the best selected ML method was then employed for each of the STEMI, octogenarian and obese patient cohorts to identify the variables associated with 30-day mortality post-PCI.

Logistic Regression model development

Multiple logistic regression with backward elimination was used to select the significant (p<0.05) variables from the list of 17 variables previously discussed.22 As outlined above, the entire dataset was used to select the factors in the model, and the model was then validated using 10-fold cross-validation. It was then evaluated using the same set of performance metrics.

Model performance metrics

The prediction performance of the ML and traditional LR models was assessed and compared using the metrics listed above (accuracy, sensitivity, specificity, receiver operating characteristic (ROC) curve, and Brier score). These performance evaluation criteria were computed from the confusion matrix, considering true positives, true negatives, false positives, and false negatives (supplemental table 2).
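As a minimal sketch of how these metrics follow from the confusion matrix, the code below derives accuracy, sensitivity and specificity from the four cell counts, and computes the area under the ROC curve as the proportion of death/survivor pairs in which the death received the higher score. The function names, the 0.5 classification threshold, and the ten toy predictions are assumptions for illustration, and the functions assume that both outcome classes are present.

```python
def confusion_counts(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def classification_metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # share of deaths correctly predicted
        "specificity": tn / (tn + fp),  # share of survivors correctly predicted
    }

def roc_auc(y_true, y_score):
    """Probability that a random death is scored above a random survivor (ties count half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy predictions: 2 deaths among 10 procedures (illustrative values only).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.3, 0.6, 0.2, 0.1, 0.1, 0.05, 0.05, 0.02, 0.01]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed threshold of 0.5
metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, y_score)
```

Sensitivity and specificity depend on the chosen threshold, whereas the ROC area summarises ranking quality across all thresholds.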

Model calibration was evaluated by examining the Brier score, which falls within the range of 0.0 to 1.0. A Brier score of 0 signifies perfect accuracy, while a Brier score of 1 indicates perfect inaccuracy. A calibration curve was also created to evaluate the calibration performance of the model.
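The two calibration measures can be sketched as follows. The Brier score is the mean squared difference between the predicted probability and the observed 0/1 outcome, and a calibration curve compares the mean predicted probability with the observed event rate within probability bins. The equal-width binning, the number of bins, and the toy data are assumptions for illustration rather than the study's settings.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

def calibration_bins(y_true, y_prob, n_bins=5):
    """Return (mean predicted, observed rate) for each non-empty equal-width bin."""
    bins = [[] for _ in range(n_bins)]
    for t, p in zip(y_true, y_prob):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[index].append((t, p))
    curve = []
    for members in bins:
        if members:  # skip empty bins
            mean_pred = sum(p for _, p in members) / len(members)
            observed = sum(t for t, _ in members) / len(members)
            curve.append((mean_pred, observed))
    return curve

# Toy outcomes and predicted probabilities (illustrative values only).
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_prob = [0.1, 0.1, 0.3, 0.3, 0.5, 0.7, 0.9, 0.9]
score = brier_score(y_true, y_prob)
curve = calibration_bins(y_true, y_prob)
```

A well-calibrated model yields points close to the diagonal, where the mean predicted probability equals the observed rate in each bin.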

Health services performance

The performance of individual hospitals regarding patients’ adjusted risk of 30-day mortality is depicted in funnel plots. Funnel plots were utilized as a visual tool for comparing health services by plotting estimates of risk-adjusted 30-day mortality rates against the total number of procedures performed.30 These plots are a valuable addition to performance monitoring systems. To generate the funnel plot, adjusted risk was computed for each patient using regression coefficients derived from LR analysis. Health services’ performance in predicting 30-day mortality was assessed and compared between current (LR and ML) and existing VCOR models.
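A funnel plot of this kind can be sketched by computing, for each hospital, a risk-adjusted rate and the control limits that apply at its procedure volume. The text does not state how the limits were derived, so the sketch assumes indirect standardisation (observed over expected deaths, scaled by the overall rate) and normal-approximation binomial limits; the three hospitals and their counts are hypothetical.

```python
from math import sqrt

Z_95, Z_998 = 1.96, 3.09  # normal quantiles for two-sided 95% and 99.8% limits

def risk_adjusted_rate(observed_deaths, expected_deaths, overall_rate):
    """Observed/expected ratio scaled by the overall rate (assumed standardisation)."""
    return observed_deaths / expected_deaths * overall_rate

def funnel_limits(overall_rate, n_procedures, z):
    """Normal-approximation control limits around the overall rate for a given volume."""
    half_width = z * sqrt(overall_rate * (1 - overall_rate) / n_procedures)
    return max(0.0, overall_rate - half_width), min(1.0, overall_rate + half_width)

def flag(rate, overall_rate, n_procedures):
    lower95, upper95 = funnel_limits(overall_rate, n_procedures, Z_95)
    lower998, upper998 = funnel_limits(overall_rate, n_procedures, Z_998)
    if rate < lower998 or rate > upper998:
        return "outside 99.8%"
    if rate < lower95 or rate > upper95:
        return "outside 95%"
    return "within limits"

overall = 0.021  # overall 30-day mortality reported in the study
# Hypothetical hospitals: (procedures, observed deaths, model-expected deaths)
hospitals = {"A": (2000, 40, 42.0), "B": (1500, 45, 30.0), "C": (400, 20, 8.0)}
results = {}
for name, (n, obs, exp) in hospitals.items():
    rate = risk_adjusted_rate(obs, exp, overall)
    results[name] = flag(rate, overall, n)
```

Because the limits widen as volume falls, a low-volume hospital needs a larger departure from the overall rate before it is flagged.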

Data analysis package: All data analyses were undertaken using Stata (version 17), RStudio (version 2024.03.0) and Python (version 3.12.2).

Results

Baseline patient characteristics

An overview of the baseline characteristics of the 93,055 participants is presented in Supplementary Table 4. The average age was 66.5 (±11.9) years and 76% were male with one-third of patients having had a prior PCI. Around 50.5% of patients presented with acute coronary syndrome (ACS) and 22.7% had a prior history of diabetes. Prior cerebrovascular event was present in 3.6% and peripheral vascular disease prevalence was 3.5%. The rate of intubated out-of-hospital cardiac arrest (OHCA) was 1.1%, 2.2% presented with cardiogenic shock, and 2.6% of patients had severe renal impairment (estimated glomerular filtration rate (eGFR) <30 mL/min/1.73 m2).

The overall 30-day all-cause mortality was 2.1%. Patients presenting with cardiac arrest had the highest mortality rate (47.5%), followed by those with cardiogenic shock (44.1%). Further, patients with severely reduced left ventricular ejection fraction (LVEF) (<30%) had a mortality rate of 15.2%, while those with severe loss of kidney function (eGFR<30 mL/min/1.73 m2) had a mortality rate of 10.5%. The mortality rate in elderly patients (80 years and over) was 4.3%, while those with STEMI had a mortality rate of 6.8% (table 1).

Table 1 Mortality rate after 30 days of PCI

ML model comparison with LR model

The RF algorithm outperformed other ML algorithms (supplemental table 3). The top 12 variables (based on their impact on model performance) included in the RF model were cardiogenic shock, LVEF, ACS, eGFR, intubated OHCA, age, mechanical ventricular support, complex lesion, lesion location, BMI, sex, and diabetes (figure 2 and Supplementary Table 3). In contrast, the LR model identified cardiogenic shock, intubated OHCA, mechanical ventricular support, LVEF, age, eGFR, ACS, peripheral vascular disease (PVD), cerebrovascular disease (CVD), and complex lesion (figure 3). For the RF model, the ROC was 94.3% and accuracy was 98.4%. In contrast, the LR model had a ROC of 92.3% and accuracy of 98.2% (figure 4).

Figure 2 Beeswarm SHapley Additive exPlanations (SHAP) plot presenting factors related to 30-day mortality post-PCI

Figure 3 Factors identified by the Logistic Regression model

Figure 4 Discrimination (left graph) and calibration (right graph) of the Random Forest and Logistic Regression models

The RF model had a higher sensitivity than the LR model (60.1% vs 49.7%), while its specificity was marginally lower (98.6% vs 99%) (Supplementary figure 1). The Brier score and the calibration curve were used to assess the calibration performance of the models. The Brier score of 1.4% for the RF model demonstrated better calibration than the 1.8% for the LR model (figure 4 and supplemental figure 1).

Moreover, when validating the previous VCOR risk adjustment model, the ROC was 83.8%, indicating a significant difference compared with the ROC of the RF model (94.3%) and the traditional LR model (92.3%) (supplemental figure 2).

In the sensitivity analysis, both the RF and LR models were trained using 70% of the entire dataset, while the remaining 30% served as the validation (test) set. Hyperparameter tuning was performed for each model using a 10-fold cross-validation approach. The trained models were then assessed on the validation dataset, and their performance was compared using metrics such as accuracy, sensitivity, specificity, precision, recall, ROC, and Brier score. Both models showed similar performance across these metrics (supplemental table 5).
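A 70/30 hold-out split of this kind can be sketched as below. The text does not say whether the split was stratified by outcome; the sketch assumes stratification because, with an event rate near 2%, it keeps the share of deaths similar in both parts. The function name `stratified_split` and the toy labels are illustrative.

```python
import random

def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split indices so each class keeps roughly the same share in train and test."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for value in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == value]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)

labels = [1] * 20 + [0] * 980  # toy outcome with 2% events (not registry data)
train_idx, test_idx = stratified_split(labels)
train_events = sum(labels[i] for i in train_idx)
test_events = sum(labels[i] for i in test_idx)
```

Hyperparameter tuning by cross-validation would then be run on the training indices only, leaving the held-out 30% untouched until the final assessment.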

Interpretability of variables in the RF model

The SHAP plot in figure 2 provides insights into the influence and direction of various variables on predicting 30-day mortality post-PCI, as depicted by the distribution of red and blue dots. The variables are ranked in order of importance, with those at the top having the greatest impact on the predictions. Cardiogenic shock is identified as the most influential predictor, followed by LVEF and ACS. The horizontal axis displays SHAP values, which represent the contribution of each variable to the model’s output. Negative SHAP values are associated with a lower likelihood of mortality, while positive values indicate a higher risk. The dots’ colours reflect the variable’s value for individual instances: blue indicates lower values, while red indicates higher values. The vertical bar on the right side provides the coding for categorical variables or the scale for numerical ones. For example, cardiogenic shock is coded as 0 for normal and 1 for cardiogenic shock. Thus, blue dots represent normal values (code 0), while red dots correspond to cardiogenic shock (code 1). The plot shows that cardiogenic shock is the most critical factor influencing 30-day mortality, as patients with this condition (red dots) exhibit positive SHAP values, indicating an increased mortality risk. Other significant predictors of 30-day mortality include severely reduced LVEF, severe kidney impairment, OHCA, older age, requirement for mechanical ventricular support, the presence of left main or graft lesions, female sex, obesity, and diabetes.

Health service performance

In the funnel plot, four out of 33 health services were outside the 95% limit when utilizing the current RF model, while five health services were outside that limit for the LR model. When the previous VCOR model was used, nine health services were identified as lying outside the 95% confidence limit, warranting further scrutiny. No health service exceeded the 99.8% limit, indicating consistent and reliable assessment of health service performance across all models (figure 5).

Figure 5 Funnel plots to assess institutional performance

Results of subgroup analyses

Among patients with STEMI, the RF model identified cardiogenic shock, LVEF, eGFR, intubated OHCA, and age as the top five risk factors associated with 30-day mortality post-PCI. In a separate analysis of octogenarian patients, the RF model identified ACS, LVEF, cardiogenic shock, eGFR, and intubated OHCA as the top five factors influencing 30-day mortality. In obese patients, the RF model identified cardiogenic shock, LVEF, intubated OHCA, ACS, and eGFR as the top five influential factors of 30-day mortality post-PCI (figure 6). The performance of the RF algorithm in predicting 30-day mortality post-PCI for the STEMI, octogenarian and obese subgroups is presented in table 2 and supplemental figure 3.

Figure 6 Beeswarm plots presenting top 10 factors associated with 30-day mortality for ST-elevation myocardial infarction (STEMI), octogenarian and obese patients

Table 2 Random Forest model’s performance in predicting 30-day mortality post-PCI for different subgroups

Discussion

This study observed an all-cause 30-day mortality post-PCI rate of 2.1%, underscoring the critical need for accurate risk assessment to guide clinical decisions and performance evaluations. This study primarily aimed to develop an optimized risk-adjustment model for predicting 30-day mortality post-PCI using ML algorithms. Additionally, it sought to compare the predictive efficacy of these ML models with both a traditional LR model developed in this study and an existing risk-adjusted model based on the VCOR dataset. In alignment with these objectives, this study identified the RF model as a better predictor of 30-day mortality, exhibiting better overall performance metrics (accuracy, discrimination (ROC), sensitivity, specificity, and Brier score). Moreover, ML methods demonstrated robust predictive capabilities in mortality risk assessment when contrasted with the traditional LR model. Further, the promising performance of the ML model in predicting 30-day mortality following PCI indicates its potential for prospective use in risk adjustment and in enhancing hospital performance evaluation.

In this study, five well-established ML algorithms (AdB, DT, GB, RF, and XGB) were employed to identify pre-procedural factors associated with 30-day mortality post-PCI. Among the algorithms assessed, the RF model performed slightly better than the other ML models. The performance of the RF model also surpassed that of the traditional LR model. Further, we observed that the RF model correctly predicted approximately three out of five deaths, whereas the traditional LR model correctly predicted just half of the deaths. By comparison, in a Taiwanese study, the DT model exhibited excellent predictive capability for 30-day mortality post-PCI.27 Meanwhile, research in the United States favoured the GB-based classifier as the optimal predictive algorithm for various cardiac interventions.25 26 Consistent with the current study, an Italian population study identified the RF model as superior in predicting 30-day mortality post-PCI, achieving a ROC score of 81.6% during external validation.28 These findings suggest that ML-based approaches, particularly the RF model, hold significant promise in developing predictive models for 30-day mortality post-PCI.

The RF model emphasizes the pivotal role of cardiogenic shock as a primary factor significantly associated with 30-day mortality post-PCI, a finding corroborated by the traditional LR model. This aligns with previous research across diverse studies, underscoring cardiogenic shock as a critical factor linked to 30-day post-PCI mortality and affirming the robust performance of the model.31 32 In addition to cardiogenic shock, the RF model identified several other key factors, including LVEF, ACS, eGFR, intubated OHCA, age, mechanical ventricular support, complex lesion, lesion location, BMI, sex, and diabetes as the top factors associated with 30-day all-cause mortality post-PCI. A previous study in Taiwan, utilizing the DT model, determined hyperlipidemia, hypertension, diabetes, heart failure, stroke, and chronic kidney disease as factors linked to 30-day mortality post-PCI.27 It is noteworthy that studies utilizing ML methods to identify factors associated with 30-day mortality post-PCI have been limited in the context of the Australian population. However, an Australian study employing traditional LR modelling revealed nearly identical factors to those identified by the ML-based RF model.6 Moreover, consistent findings across various studies using traditional LR models highlighted the significance of factors such as age, BMI, LVEF, eGFR, cardiac arrest, ACS, mechanical ventricular support, PVD, CVD, and complex lesions as significant predictors of 30-day mortality post-PCI.6 10

Both the RF and the traditional LR model identified common risk factors for 30-day all-cause mortality, including cardiogenic shock, LVEF, eGFR, intubated OHCA, ACS, age, mechanical ventricular support, PVD, and complex lesions. The slight variation in factors’ selection between the RF model and the traditional LR model may contribute to a significant difference in the models’ discrimination. Furthermore, most of the variables selected by the RF model exhibited higher model performance in different studies using traditional LR models, underscoring the importance of the chosen set of pre-procedural variables for model development.6 32 33

Risk-adjusted 30-day mortality prediction is disseminated to participating hospitals in the VCOR network, allowing for health service performance benchmarking against other health services via risk-adjusted funnel plots. Both the ML model and the current LR model displayed slightly enhanced performance compared to the previous risk-adjusted model in assessing health service performance. Additionally, the RF model’s predictive ability in evaluating 30-day all-cause mortality post-PCI demonstrated encouraging results in assessing hospital performance. The ML model’s identification of critical risk factors for 30-day mortality post-PCI suggests the possibility of integrating or substituting traditional LR models within the VCOR risk-adjustment process and public reporting mechanisms. It may be beneficial to repeat these analyses when data over a longer period of time is available to reassess the presence of consistent differences in prediction.

Strengths and limitations

The current study has several strengths and limitations. The key strength is the inclusion of a large volume of data, which contributed substantially to the accuracy of predicting 30-day all-cause mortality post-PCI. The proposed set of pre-procedural variables holds promise for boosting the model’s overall performance. However, the study is not without limitations. Firstly, the data source encompassed patients from a specific geographic region in Australia, constraining the generalizability of the findings and necessitating validation in diverse populations. Secondly, the study explored only a subset of ML approaches, leaving the performance of other methods unaddressed. Thirdly, unlike traditional LR models, ML algorithms do not produce p-values (significance levels) or beta-coefficients when selecting influential variables. Finally, as not all 17 factors proposed in the earlier study were available,25 there may be missed opportunities for improving overall performance metrics.

Conclusion

In this study, a robust model for 30-day all-cause mortality has been developed for all PCI procedures using an ML approach. The ML-based RF model significantly enhanced the accuracy of predicting the risk of mortality within 30 days post-PCI and identified the top 12 influential factors: cardiogenic shock, LVEF, ACS, eGFR, intubated OHCA, age, mechanical ventricular support, complex lesion, lesion location, BMI, sex, and diabetes. Notably, the ML approach outperformed the LR method when predicting 30-day mortality post-PCI. This ML-based risk-adjusted model has the potential to assist clinicians in the early identification of patients at risk and to support the benchmarking of institutional performance. However, there is a need for further validation using external data to ensure the applicability of these findings in clinical practice.

Data availability statement

Data are available upon reasonable request. Anonymized personal data were obtained from the Victorian Cardiac Outcome Registry (VCOR) after ethical approval and a confidentiality assessment. In accordance with Australian laws and regulations, access to personal sensitive data is restricted to researchers who meet the legal requirements for such access. For inquiries regarding data access, please contact Dr. Diem Dinh.

Contributors

Conceptualisation: MRKC, BB, DD, DS. Methodology: BB, MRKC, MNK. Analysis: MRKC, MAM, MSI. Manuscript drafting: MY. Manuscript review and critical revision: BB, DS, DD, AB, CMR, SN, JL, DPC. Visualisation: MRKC, MSI. Supervision: BB, DS, DD, MNK. Project administration: DS, DD. MRKC and BB are the guarantors.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Competing interests

None declared.

Patient and public involvement

This research did not involve patients or the public in its design, conduct, reporting, or dissemination.

Patient consent for publication

Not applicable.

Ethics approval

The primary ethics approval was granted by the ethics committee at The Alfred Hospital (approval number 47/12), and also approved by each participating hospital, including the use of opt-out consent.

Data Availability

All data produced in the present study are available upon reasonable request to the authors.

Abbreviations

PCI: Percutaneous coronary intervention
AdB: Adaptive Booster
DT: Decision Tree
GB: Gradient Booster
RF: Random Forest
XGB: Extreme Gradient Booster
LR: Logistic Regression
eGFR: Estimated glomerular filtration rate
ACS: Acute coronary syndrome
OHCA: Out-of-hospital cardiac arrest
PVD: Peripheral vascular disease
CVD: Cerebrovascular disease
CTO: Chronic total occlusion
CABG: Coronary artery bypass grafting
ROC: Receiver operating characteristic
STEMI: ST-elevation myocardial infarction
SHAP: SHapley Additive exPlanations
BMI: Body mass index
ML: Machine Learning
VCOR: Victorian Cardiac Outcomes Registry
MICE: Multiple Imputation by Chained Equations
TIMI: Thrombolysis in Myocardial Infarction

Posted March 06, 2025.
Subject Area

  • Epidemiology