Abstract
Background While low-dose computed tomography scans are traditionally used for attenuation correction in hybrid myocardial perfusion imaging (MPI), they also contain additional anatomic and pathologic information not utilized in clinical assessment. We seek to uncover the full potential of these scans using a holistic artificial intelligence (AI)-driven framework for image assessment.
Methods Patients with SPECT/CT MPI from 4 REFINE SPECT registry sites were studied. A multi-structure model segmented 33 structures and quantified 15 radiomics features for each on CT attenuation correction (CTAC) scans. Coronary artery calcium and epicardial adipose tissue scores were obtained from separate deep-learning models. Standard quantitative MPI features were derived by clinical software. Extreme Gradient Boosting derived all-cause mortality risk scores from SPECT, CT, stress test, and clinical features, using a 10-fold cross-validation regimen to separate training from testing data. The performance of the models for the prediction of all-cause mortality was evaluated using the area under the receiver-operating characteristic curve (AUC).
Results Of 10,480 patients, 5,745 (54.8%) were male, and median age was 65 (interquartile range [IQR] 57-73) years. During the median follow-up of 2.9 years (1.6-4.0), 651 (6.2%) patients died. The AUC for mortality prediction of the comprehensive model (combining CTAC, MPI, and clinical data) was 0.80 (95% confidence interval [0.74-0.87]), which was higher than that of an AI CTAC model (0.78 [0.71-0.85]) and an AI hybrid model (0.79 [0.72-0.86]) incorporating CTAC and MPI data (p<0.001 for all). In patients with normal perfusion, the comprehensive model (0.76 [0.65-0.86]) had significantly better performance than the AI CTAC (0.72 [0.61-0.83]) and AI hybrid (0.73 [0.62-0.84]) models (p<0.001 for all).
Conclusion CTAC significantly enhances AI risk stratification with MPI SPECT/CT beyond its primary role of attenuation correction. A comprehensive multimodality approach can significantly improve mortality prediction compared to MPI information alone in patients undergoing cardiac SPECT/CT.
1. Introduction
Myocardial perfusion scintigraphy is widely used for the evaluation of coronary artery disease (CAD), with 15-20 million scans performed worldwide.1,2 During myocardial perfusion imaging (MPI), a low-dose non-contrast computed tomography attenuation correction (CTAC) scan is often used to correct for soft-tissue attenuation, leading to improved diagnostic accuracy.3,4 Attenuation correction by computed tomography (CT) is recommended by American Society of Nuclear Cardiology guidelines.5 Although the myocardium is the structure of principal interest during SPECT/CT MPI, the accompanying CTAC scan provides a wealth of additional information about other visible organs. Incidental findings have been reported in up to 59.5% of SPECT/CT MPI studies, some of which are clinically important and necessitate further diagnosis and treatment.6,7
However, due to limitations in the quality of CTAC images (low dose, no electrocardiographic gating), detection and characterization of abnormal findings on CTAC can be challenging.8 Consequently, the additional information present in hybrid cardiac scans is often underutilized during clinical reporting. While some methods have been developed to derive information about coronary artery calcium (CAC) and epicardial adipose tissue (EAT) from CTAC scans,9,10 many other potentially clinically important features, such as those of extracardiac structures, are present in these scans, and to date their added value to MPI has not been systematically evaluated.
The aim of this study is to develop a holistic artificial intelligence (AI)-based approach for predicting all-cause mortality from SPECT/CT MPI that uses all available information in the hybrid images, and to separately evaluate the value of the previously underutilized CTAC images for this purpose.
2. Material and methods
2.1 Study population
In this retrospective study we utilized CTAC scans of patients who underwent SPECT/CT MPI from 4 sites (University of Calgary, Yale University, Columbia University, University of Ottawa Heart Institute) participating in the Registry of Fast Myocardial Perfusion Imaging with Next generation SPECT (REFINE SPECT).11 The study protocol was approved by the IRB at all participating sites and complied with the Declaration of Helsinki. Baseline demographic and clinical characteristics were obtained from the REFINE SPECT registry.11 CTAC image acquisition at each participating site is shown in Table S1. The outcome was all-cause mortality (referred to subsequently simply as “mortality”), which was determined using the national death index for sites in the United States and administrative databases in Canada.
2.2 Myocardial Perfusion Image Analysis
Total perfusion deficit (TPD), end-diastolic stress shape index (ratio between the maximum left ventricular (LV) diameter in short axis and the length of the LV in end-diastole at stress), stress ejection fraction, and end-diastolic volume were quantified automatically from non-attenuation-corrected MPI scans at the core laboratory (Cedars-Sinai Medical Center, Los Angeles) with the use of dedicated software (Quantitative Perfusion SPECT [QPS] software, Cedars-Sinai Medical Center, Los Angeles)12. Normal myocardial perfusion was defined as stress TPD <5%13, whereas moderate-to-severe ischemia was defined as TPD ≥10% of the myocardium.14
2.3 Multi-structure deep learning feature extraction from CTAC
The study design is shown in Figure 1. TotalSegmentator, a multi-structure segmentation deep learning (DL) model, was used to segment structures visible on CTAC.15 Out of all segmented structures, we selected the thirty-three structures that were present in >80% of scans (Figure S1). The automatic extraction of imaging features for all selected structures was performed with the PyRadiomics package (version 3.0.1).16 In the per-organ analysis, we included eleven first-order and four 3D features (Table S2).
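To make the notion of per-structure first-order features concrete, the following is a minimal sketch using only the Python standard library. It is not the PyRadiomics implementation, and the features shown are illustrative choices; the features actually used in the study are those listed in Table S2. The function names and the assumption that voxel values arrive as a flat list of Hounsfield units for one segmented structure are hypothetical.

```python
import statistics


def first_order_features(voxels_hu):
    """Return a few illustrative first-order statistics for one structure.

    voxels_hu: flat list of Hounsfield-unit values inside one segmentation
    mask (hypothetical input format, not the PyRadiomics interface).
    """
    mean = statistics.fmean(voxels_hu)
    return {
        "mean": mean,
        "median": statistics.median(voxels_hu),
        "minimum": min(voxels_hu),
        "maximum": max(voxels_hu),
        "range": max(voxels_hu) - min(voxels_hu),
        # Population variance: every voxel inside the mask is observed.
        "variance": statistics.pvariance(voxels_hu, mu=mean),
        "energy": sum(v * v for v in voxels_hu),
    }


def mask_volume_ml(n_voxels, spacing_mm):
    """Volume of a mask in mL from its voxel count and voxel spacing in mm."""
    dx, dy, dz = spacing_mm
    return n_voxels * dx * dy * dz / 1000.0  # 1 mL = 1000 mm^3
```

Applying such a function to each of the selected structures yields one feature vector per structure, which can then be concatenated into the model input.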
2.4 Automated Coronary Artery Calcium Scoring
Our previously validated deep learning model was used for CAC segmentation and scoring.17,18 The model uses two convolutional long short-term memory (convLSTM) networks to segment the heart mask and CAC on CTAC images, and was tested externally on data (10,480 CTAC scans) from 4 different sites. Established methods were used to automatically obtain CAC scores from the deep learning segmentation.19
2.5 Automated Epicardial Adipose Tissue Scoring
A previously developed deep learning model was used to estimate EAT volume and density (using an attenuation window of -190 to -30 Hounsfield units [HU]) from CTAC scans.10 For EAT model training and validation, we used 500 CTAC scans from one site (Yale University). Patients whose scans were used for EAT model training and validation were not included in this analysis.
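As a rough illustration of how volume and density follow from a segmentation and the adipose-tissue HU window, consider the sketch below. It assumes the deep learning model has already produced the voxels inside the epicardial region; the function name, the inclusive handling of the window boundaries, and the input format are assumptions made for illustration and do not describe the published model.

```python
EAT_HU_MIN, EAT_HU_MAX = -190, -30  # adipose-tissue window stated in the text


def eat_volume_and_density(voxels_hu, voxel_volume_mm3):
    """Summarize fat voxels inside an (assumed) epicardial mask.

    Returns (volume in mL, mean attenuation in HU). The density is None
    when no voxel falls inside the adipose-tissue window.
    """
    # Boundaries are treated as inclusive here; this is an assumption.
    fat = [v for v in voxels_hu if EAT_HU_MIN <= v <= EAT_HU_MAX]
    if not fat:
        return 0.0, None
    volume_ml = len(fat) * voxel_volume_mm3 / 1000.0  # 1 mL = 1000 mm^3
    density_hu = sum(fat) / len(fat)
    return volume_ml, density_hu
```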
2.6 Classification Models
Extreme Gradient Boosting (XGBoost) models (version 1.7.3), a currently leading machine learning method, were used for mortality classification. These models generate all-cause mortality risk scores by applying a 10-fold cross-validation regimen across the entire dataset. Within each fold, 90% of the data was first set aside for model training and validation. This 90% was further divided, with 80% used for training and 20% for validation. The remaining 10% of the data in each fold was used for testing and kept separate from training and validation, so that each patient was tested exactly once across all folds. Ten separate models were built, and each was tested independently. Testing results were concatenated from all models for the overall performance evaluation. Hyperparameter tuning was conducted during training and validation, separately in each fold, using the grid-search method.
The benefits of employing 10-fold cross-validation are 3-fold: 1) it reduces the variability of prediction errors, leading to a more accurate evaluation; 2) it maximizes data utilization while minimizing the chance of overfitting and cross-contamination of information among data splits; 3) it guards against validating a hypothesis influenced by an arbitrary data split (Type III error).20,21
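The splitting scheme described in this section can be sketched as follows. This is a minimal illustration of the index bookkeeping only, with model fitting and grid search omitted. The function name, the random seed, and the use of plain (non-stratified) shuffling are assumptions; the text does not state whether the folds were stratified.

```python
import random


def ten_fold_splits(n_patients, n_folds=10, val_fraction=0.2, seed=0):
    """Yield (train, val, test) index lists, one triple per fold."""
    rng = random.Random(seed)
    indices = list(range(n_patients))
    rng.shuffle(indices)
    # Deal shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]  # held out; each patient is tested exactly once
        rest = [i for j, fold in enumerate(folds) if j != k for i in fold]
        rng.shuffle(rest)
        n_val = int(round(val_fraction * len(rest)))
        val, train = rest[:n_val], rest[n_val:]  # 20% / 80% of the remainder
        yield train, val, test
```

With 1,000 patients, each fold yields 720 training, 180 validation, and 100 test indices, and concatenating the test indices over the ten folds covers every patient once.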
2.7 Models
Four models were used for the mortality endpoint: 1) a model incorporating DL-EAT (EAT); 2) a model combining quantitative CTAC image analysis of all segmented structures (radiomics), DL-EAT, and DL-CAC (AI CTAC); 3) a model incorporating all variables included in the AI CTAC model as well as stress ejection fraction, stress end-diastolic volume, end-diastolic stress shape index, stress TPD, and other SPECT imaging features (see Table S3) (AI hybrid); and 4) a model combining CTAC, MPI, and clinical data (All). Coronary calcium (DL-CAC score) and Perfusion (stress TPD) served as univariate comparisons.
Clinical data include patient demographics (age, sex, and body mass index [BMI]) and past medical history: hypertension, diabetes, dyslipidemia, and prior CAD (prior myocardial infarction, percutaneous coronary intervention [PCI], and coronary artery bypass graft [CABG]). The clinical data also encompass stress test variables such as the type of test, peak stress heart rate, peak stress blood pressure, and ECG response to stress.
2.8 Model Explainability
The predictive power of variables included in model training was evaluated using XGBoost feature importance, which quantifies the increase in accuracy resulting from the addition of each feature. SHapley Additive exPlanations (SHAP), a game-theoretic feature importance method, was used to explain how structures contributed to the overall risk in model inference for individual patients.22
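The game-theoretic idea behind SHAP can be illustrated with an exact Shapley computation on a toy model. The sketch below enumerates every coalition of features, which is feasible only for a handful of features; it is not the SHAP library and not the algorithm used for tree models in practice. The toy risk function, its numbers, and the structure names used as features are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every coalition of features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            # Probability that a coalition of this size precedes f
            # in a uniformly random ordering of the features.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                base = frozenset(coalition)
                total += weight * (value_fn(base | {f}) - value_fn(base))
        phi[f] = total
    return phi


def toy_risk(present):
    """Hypothetical risk model: additive terms plus one interaction."""
    risk = 0.05
    if "lungs" in present:
        risk += 0.10
    if "aorta" in present:
        risk += 0.04
    if "lungs" in present and "aorta" in present:
        risk += 0.02  # interaction is shared equally between the two
    return risk
```

The attributions sum to the difference between the full-model output and the baseline, which is the additivity property that lets per-structure contributions be read as shares of an individual patient's risk.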
2.9 Thresholds for Comparisons of Machine Learning
Patients were classified into low- or high-risk groups based on the AI-derived all-cause mortality risk score. This classification was achieved by setting a threshold that matches the proportion of patients identified by the established clinical criterion for ischemia (TPD ≥10%).23,24
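One way to realize such a proportion-matched threshold is sketched below: count the patients meeting the ischemia criterion, then choose the risk-score cutoff that flags the same number of patients as high risk. The function names are hypothetical, and the handling of tied scores (all patients at the cutoff are flagged) is an assumption.

```python
def matched_threshold(risk_scores, n_high_risk):
    """Score cutoff that flags the n_high_risk highest-scoring patients."""
    if not 0 < n_high_risk <= len(risk_scores):
        raise ValueError("n_high_risk must be between 1 and the cohort size")
    ranked = sorted(risk_scores, reverse=True)
    return ranked[n_high_risk - 1]


def classify_by_matched_proportion(risk_scores, tpd_values, tpd_cutoff=10.0):
    """Return (threshold, high-risk flags) matched to the ischemia proportion."""
    n_ischemic = sum(1 for t in tpd_values if t >= tpd_cutoff)
    threshold = matched_threshold(risk_scores, n_ischemic)
    # Ties at the threshold are all flagged, so the count can exceed
    # n_ischemic when several patients share the cutoff score.
    return threshold, [s >= threshold for s in risk_scores]
```

Matching the proportions means the two classifications flag equally many patients, so differences in outcomes reflect which patients are flagged rather than how many.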
2.10 Statistical Analysis
Continuous variables with a normal distribution are presented as mean ± standard deviation (SD) and non-normally distributed variables as median with interquartile range (IQR) [IQ1-IQ3]. Categorical variables are expressed as counts and relative frequencies (percentages). Categorical variables were compared with Pearson’s χ2 test, whereas continuous variables were compared with the Wilcoxon-Mann-Whitney test. The performance of the models was evaluated using receiver-operating characteristic analysis, and area under the receiver-operating characteristic curve (AUC) values were compared with the DeLong test.25 Kaplan-Meier survival curves, alongside univariate Cox proportional hazards models, were employed to evaluate the association with mortality. The log-rank test was used to ascertain statistical significance. The improvement in model predictions was measured using the time-dependent net reclassification improvement score at 2 years.26 Confidence intervals were calculated by the percentile bootstrap method. A two-tailed p-value of <0.05 was considered statistically significant. All statistical analyses were performed with Pandas (version 2.1.1), Numpy (version 1.24.3), Scipy (version 1.11.4), Lifelines (version 0.28.0), and Scikit-learn (version 1.3.0) in Python 3.11.5 (Python Software Foundation, Wilmington, DE, USA), as well as the “nricens” package (version 1.6) in R version 4.3.2 (R Foundation for Statistical Computing, Vienna, Austria).
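A percentile bootstrap confidence interval for the AUC can be sketched with the standard library alone, as below. The AUC is computed from its rank interpretation (the probability that a randomly chosen patient who died has a higher score than a randomly chosen survivor). The function names, the number of resamples, and the choice to redraw resamples containing a single outcome class are assumptions, not details reported in the text.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_with_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Return (AUC, lower, upper) using the percentile bootstrap."""
    point = auc(scores, labels)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # redraw resamples that contain one class only
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper
```

The pairwise AUC above is quadratic in cohort size, so a rank-based implementation would be preferable for a cohort of the size studied here.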
3. Results
3.1 Patient Characteristics
Study population
In total, 10,983 participants from 4 sites were enrolled in the REFINE SPECT registry; CTAC scans of 500 participants from one site were used for EAT-model training and validation. Of the 10,483 remaining participants, 3 were excluded due to incomplete CTAC scans. The final study cohort consisted of 10,480 participants (Figure S2).
Table 1 presents baseline characteristics stratified by sex. Of all participants, 5,745 (54.8%) were male, and median age was 65 (57-73) years. During the median 2.9-year (1.6-4.0) follow-up period, 651 (6.2%) patients died. Normal myocardial perfusion was present in 6,165 (58.8%) patients, of whom 274 (4.4%) died. Patients with normal perfusion were significantly younger (p<0.001), more often female, and less often diagnosed with hypertension (p<0.001), diabetes (p<0.001), and dyslipidemia (p=0.007) (Table S4).
Myocardial Perfusion Imaging Quantitative Analysis Parameters
In all patients, the median total perfusion deficit (TPD) was 2.6% (0.9-6.0) and was higher in male than female patients (2.7 vs. 2.5, respectively, p<0.001) (Table 1). Significantly lower stress ejection fraction was observed in men compared with women (59% vs. 70%, respectively, p<0.001). The median TPD in patients with abnormal perfusion was 7.0 (4.9-11.7), whereas the median stress ejection fraction in this group was 59 (49-68) (Table S4).
Coronary Artery Calcium and Epicardial Adipose Tissue
CAC was 0 in 3,753 (35.8%) patients, >0-100 in 1,982 (18.9%), >100-400 in 1,462 (14.0%), and >400 in 3,283 (31.3%) subjects. The median EAT volume and density were 130 mL (90, 183) and -65 HU (-70, -61), respectively (Table 1).
In patients with normal perfusion, 2,459 (39.9%) subjects had no CAC, 1,305 (21.2%) had CAC >0 and ≤100, 862 (14.0%) had CAC >100 and ≤400, and 1,539 (25.0%) had CAC >400. The median EAT volume and density in patients with normal perfusion were 129 mL (89, 179) and -65 HU (-70, -61), respectively (Table S4).
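The four CAC categories reported above can be expressed as a small binning function. The function name and the returned labels are illustrative; the boundaries follow the categories stated in the text (0, >0 and ≤100, >100 and ≤400, >400).

```python
from collections import Counter


def cac_category(score):
    """Map a CAC score to the four categories used in the text."""
    if score < 0:
        raise ValueError("CAC score cannot be negative")
    if score == 0:
        return "0"
    if score <= 100:
        return ">0-100"
    if score <= 400:
        return ">100-400"
    return ">400"


def cac_distribution(scores):
    """Count patients per CAC category."""
    return Counter(cac_category(s) for s in scores)
```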
3.2 Model Performance
Figure 2 shows the model performance and feature importance for mortality in all patients, patients with normal perfusion, and patients without calcified lesions in the coronary arteries. The lungs were the top feature in all three groups. Table S5 shows AUCs with 95% CI for all AI models in all patients included in the study. The AI CTAC model performed better than the EAT model (AUC 0.56, 95% CI 0.49-0.63, p<0.001) and coronary calcium alone (AUC 0.64, 95% CI 0.57-0.71, p<0.001). The AI hybrid model performed significantly better than the AI CTAC model (AUC 0.79 vs. 0.78, p<0.001).
AUCs with 95% CI for all AI models are shown in Table S6 for patients with normal myocardial perfusion and in Table S7 for patients with no coronary calcium. In the group with normal perfusion, the performance of the AI CTAC model was significantly better than that of Perfusion (AUC 0.72 vs. 0.55, respectively, p<0.001). The AI hybrid model incorporating CTAC and MPI features had higher prediction performance than the AI CTAC-only model (AUC 0.73 vs. 0.72, respectively, p<0.001). Among the patients with no calcium, the AI CTAC model significantly outperformed Perfusion (AUC 0.71 vs. 0.59, respectively, p<0.001). The AI hybrid model was significantly better than the AI CTAC-only model (AUC 0.75 vs. 0.71, respectively, p<0.001).
3.3 Association with Outcomes and Multivariable Model
Kaplan-Meier curves stratified by TPD (ischemia <10% and ≥10%) and by a matched proportion of patients with high and low AI scores (AI threshold at 0.095, high risk in 9.8%) are shown in Figure 3. The AI score improved risk reclassification of patients who died (23.9%, 95% CI 19.5-28.5, p<0.001) and of patients who did not die (2.4%, 95% CI 1.7-3.2, p<0.001), with an overall net reclassification improvement of 26.4% (95% CI 21.8-31.0, p<0.001).
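The structure of the net reclassification improvement, an event component plus a non-event component, can be illustrated with a simplified two-category version. The sketch below ignores censoring and follow-up time, so it is not the time-dependent estimator at 2 years computed with the "nricens" package in this study; the function name and input format are hypothetical.

```python
def categorical_nri(old_high, new_high, died):
    """Two-category NRI: returns (event NRI, non-event NRI, overall NRI).

    old_high / new_high: booleans, high-risk under the old and new rule.
    died: booleans, whether the patient died. Censoring is ignored here.
    """
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for old, new, event in zip(old_high, new_high, died):
        moved_up = (not old) and new
        moved_down = old and (not new)
        if event:
            n_e += 1
            up_e += moved_up
            down_e += moved_down
        else:
            n_n += 1
            up_n += moved_up
            down_n += moved_down
    if n_e == 0 or n_n == 0:
        raise ValueError("need both events and non-events")
    nri_events = (up_e - down_e) / n_e        # upward moves are correct
    nri_nonevents = (down_n - up_n) / n_n     # downward moves are correct
    return nri_events, nri_nonevents, nri_events + nri_nonevents
```

The overall value is the sum of the two components, which matches the reported figures up to rounding (23.9% and 2.4% against an overall 26.4%).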
Figure S3 illustrates findings of multivariable analyses. After adjusting for age, sex (male), hypertension, dyslipidemia, diabetes mellitus, peripheral vascular disease, past myocardial infarction, and family history of CAD, patients with abnormal perfusion were at higher risk of death compared to patients with normal myocardial perfusion (adjusted hazard ratio [HR] 1.71, 95% CI 1.46-2.01, p<0.001). Moreover, CAC >400 (adjusted HR 2.11, 95% CI 1.67-2.65, p<0.001) was associated with an increased risk of death.
3.4 Structure Specific Risk Evaluation
Examples of patients classified as being at higher risk of death (with extracardiac structures, notably the lungs and aorta, contributing the most to the predicted mortality risk) are shown in Figure 4.
4. Discussion
In this study, we have demonstrated the potential value of holistic anatomic, functional, and clinical evaluation of CTAC scans for improving all-cause mortality prediction in patients undergoing hybrid MPI. We developed a fully automated AI model incorporating multi-structure segmentation and radiomic feature extraction in parallel with deep learning-based CAC and EAT quantification. This model improves mortality prediction from multimodality myocardial perfusion imaging, with a combined model improving upon any feature set (SPECT, CTAC, or clinical) in isolation. Moreover, it provides physicians with guidance regarding portions of CTAC scans that require further scrutiny to identify potentially significant incidental findings, even though coronary artery disease is the primary indication for the examination. This fully automated workflow could be leveraged by physicians to unlock the full potential of hybrid SPECT/CT imaging.
Several studies have demonstrated the role of AI in predicting cardiovascular events from SPECT/CT using clinical, MPI,27,28 and CTAC data.29,30 Nevertheless, only a limited number of CTAC findings, such as CAC29 or EAT,10 were included in these previous analyses. More recently, we demonstrated that deep learning cardiac chamber volumes (from CTAC) provided incremental and complementary value to CAC and SPECT variables.31 Ashrafinia et al. used radiomic features from SPECT MPI to predict CAC score derived from CT scans,32 whereas Amini et al. applied a quantitative image analysis approach not only to diagnose CAD, but also for risk classification.33 The proposed AI approach integrates simultaneous assessment of multiple structures on CTAC by leveraging the strengths of deep learning and quantitative image analysis. Importantly, the model incorporating SPECT, CTAC, and clinical data had the highest prediction performance, suggesting that AI-derived information encoded in CTAC is complementary to traditional methods of analysis.
By integrating functional imaging (SPECT) with anatomic characteristics (CT), hybrid imaging has not only enhanced nuclear medicine by improving diagnostic accuracy,34,35 but also provides an enormous amount of data in CTAC scans that to date is not fully utilized. However, accurate interpretation and identification of findings can be challenging due to the image quality of these low-dose, non-electrocardiographically gated, and often free-breathing scans.8 Importantly, these auxiliary scans may be interpreted by physicians without dedicated training in interpreting chest CT.36 In some cases, unexpected intra- and extrathoracic radiotracer uptake can lead to identification of conditions such as thymoma, breast cancer, and lung cancer.37,38 Previous studies have demonstrated the ability of AI to support physicians in identifying potentially important incidental findings.39,40 Our AI model could potentially help with this clinical challenge by combining 33 cardiac and extracardiac structures automatically segmented from CTAC scans and ranking those structures by their importance in predicting mortality in each patient.
Limitations
This study has some limitations. It was a retrospective study with non-uniform CTAC acquisition protocols from multiple sites; however, this highlights the generalizability of the approach. Some organs were only partially visible or were not visualized on all scans. For example, organs such as the kidneys and thyroid were excluded from the analysis because of their high missingness across the cohort. No information regarding the reported cause of death is available in this large, multicenter registry. Therefore, we are not able to evaluate the associations between organ features and cause-specific death. Finally, evaluation of CTAC scans was performed only with radiomic features, and no information regarding reported incidental findings is available in this cohort.
Conclusions
We demonstrate a significant, yet underappreciated, role of CTAC in risk stratification with MPI SPECT/CT. Fully automated AI integration of quantitative features from multiple organs derived from CTAC images with perfusion and clinical data significantly improves mortality risk stratification in patients undergoing SPECT/CT MPI as compared to MPI only.
Data Availability
To the extent allowed by data sharing agreements and IRB protocols, the data from this manuscript will be shared upon written request.
Disclosures
Dr. Robert Miller received consulting fees and research support from Pfizer. Drs. Berman and Slomka, and Paul B. Kavanagh participate in software royalties for QPS software at Cedars-Sinai Medical Center. Dr. Slomka has received consulting fees from Synektik. Drs. Berman, Einstein, and Edward Miller have served or currently serve as consultants for GE Healthcare. Dr. Einstein has received speaker fees from Ionetix; has received consulting fees from W. L. Gore & Associates; has received authorship fees from Wolters Kluwer Healthcare-UpToDate; has served on a scientific advisory board for Canon Medical Systems; and has received grants to his institution from Attralus, Bruker, Canon Medical Systems, Eidos Therapeutics, Intellia Therapeutics, Ionis Pharmaceuticals, Neovasc, Pfizer, Roche Medical Systems, and W. L. Gore & Associates. Dr. Ruddy has received research grant support from GE Healthcare and Advanced Accelerator Applications. Dr. David Ouyang reported having a patent pending for EchoNet-LVH. The remaining authors have nothing to disclose.
List of abbreviations
- AI: artificial intelligence
- AUC: area under the receiver-operating characteristic curve
- BMI: body mass index
- CABG: coronary artery bypass graft
- CAC: coronary artery calcium
- CAD: coronary artery disease
- CI: confidence interval
- convLSTM: convolutional long short-term memory
- CT: computed tomography
- CTAC: computed tomography attenuation correction
- DL: deep learning
- EAT: epicardial adipose tissue
- HR: hazard ratio
- HU: Hounsfield units
- LVEF: left ventricular ejection fraction
- MPI: myocardial perfusion imaging
- PCI: percutaneous coronary intervention
- SPECT: single-photon emission computed tomography
- TPD: total perfusion deficit
Acknowledgements
This research was supported in part by grants R01HL089765 and R35HL161195 from the National Heart, Lung, and Blood Institute at the National Institutes of Health (PI: Piotr Slomka). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. MB is supported by a research award from the Kosciuszko Foundation – The American Centre of Polish Culture.