Abstract
Background Impaired microvascular and vasomotor function is a common consequence of aging, diabetes, and other risk factors, and is associated with adverse cardiac outcomes. Such impairments are not readily identified by standard clinical methods of cardiovascular testing such as coronary angiography and noninvasive single photon emission tomography (SPECT) myocardial perfusion imaging (MPI). We hypothesized that signals embedded within stress electrocardiograms (ECGs) identify individuals with microvascular and vasomotor dysfunction.
Methods We developed and validated a novel convolutional neural network (CNN) using stress and rest ECG data (ECG-Flow) to identify patients with impaired myocardial flow reserve (MFR) on quantitative positron emission tomography (PET) MPI (N=3887). Diagnostic accuracy was validated with an internal holdout set of patients undergoing stress PET MPI (N=963). The prognostic association of ECG-Flow with mortality was then evaluated in a separate cohort of patients undergoing SPECT MPI (N=5102).
Results ECG-Flow achieved good diagnostic accuracy for impaired MFR in the holdout PET cohort (AUC, sensitivity, specificity: 0.737, 71.1%, 65.7%). Abnormal ECG-Flow was significantly associated with mortality in both the PET holdout and SPECT MPI cohorts (adjusted HR 2.12 [95% CI 1.45, 3.10], p = 0.0001, and 2.07 [1.82, 2.36], p < 0.0001, respectively).
Conclusion Signals predictive of microvascular and vasomotor dysfunction are embedded in stress ECG waveforms. These signals can be identified by deep learning methods and are related to prognosis in patients undergoing both stress PET and SPECT MPI.
1 Introduction
Coronary microvascular/vasomotor dysfunction (CMD) develops as a result of aging, diabetes, and a range of other cardiometabolic diseases (1) and can cause myocardial ischemia and symptomatic angina without obstructive coronary disease (2,3). Approximately half of patients referred for invasive cardiac evaluation are found to have no obstructive coronary disease (4), and of these, more than two-thirds have some form of CMD (2). Further, CMD is associated with markedly increased rates of adverse cardiac outcomes (5).
Despite high prevalence and prognostic relevance, CMD is difficult to diagnose with standard clinical tools, and many patients remain undiagnosed or have presumptive diagnoses without confirmation. Advanced diagnostic measures that can identify CMD include quantitative coronary or myocardial flow reserve (MFR) as measured from invasive vasomotor testing (4,6) or noninvasive stress imaging with positron emission tomography (PET) myocardial perfusion imaging (MPI) (7), transthoracic doppler echocardiography, or cardiovascular magnetic resonance imaging (6). Although the prognostic value of myocardial flow measures has been well-established (8), these advanced techniques are not widely available and are substantially more costly than standard cardiovascular testing methods, such as invasive coronary angiography and stress electrocardiography, which cannot quantify these parameters. Challenges and heterogeneity in the diagnosis of CMD have also been a major barrier to the design and implementation of clinical trials in this space.
Electrocardiography (ECG) signals are well known to reflect cardiac structure and function (9) and have long been used for noninvasive detection of ischemia and infarction through analysis of ST segment and Q wave changes (10). However, the diagnostic accuracy of traditional ECG criteria for detection of myocardial ischemia at rest or even during stress has been modest (11,12). Inspired by recent studies applying machine learning to ECG analysis (13,14), we posited that stress ECGs acquired concurrently with MPI testing would be a rich source of information on coronary vasomotor and microvascular dysfunction (15).
We hypothesized that a machine learning convolutional neural network (CNN) using stress and rest ECG waveform data could identify individuals with impaired myocardial flow reserve. The model was trained, validated, and tested with paired ECG and PET-derived MFR measurements. We then assessed the CNN model’s ability to predict mortality risk in a large single-center registry of patients undergoing standard noninvasive MPI and stress ECG for which MFR was not available.
2 Methods
2.1 Patient population
All consecutive patients who underwent at least one stress-rest PET-CT or SPECT-CT MPI exam at the University of Michigan Frankel Cardiovascular Center from 2015 to 2020 were included in our MPI-ECG registry. Exclusion criteria were history of heart transplantation, missing or uninterpretable image data, and missing demographic or hemodynamic data.
For patients with more than one MPI exam, only the earliest evaluable exam was included. Patients who underwent both PET-CT and SPECT-CT exams were included in the PET cohort only and were excluded from the SPECT cohort to avoid label leakage. The PET cohort was randomly split into training, validation, and holdout test subsets in a 60:20:20 ratio. The SPECT cohort was used as an independent patient population for prognostic evaluation. All patient data were de-identified, and informed consent was waived under an exemption from the University of Michigan Institutional Review Board.
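A patient-level random split of this kind can be sketched as follows. This is a minimal illustration rather than the study's code: the function name `split_patients`, the fixed seed, and the integer patient IDs are assumptions made for the example.

```python
import random

def split_patients(patient_ids, fractions=(0.6, 0.2, 0.2), seed=0):
    """Randomly partition unique patient IDs into train/validation/holdout lists."""
    ids = sorted(set(patient_ids))        # one entry per patient, so no patient spans two subsets
    rng = random.Random(seed)             # seeded for reproducibility (assumed, not from the paper)
    rng.shuffle(ids)
    n_train = round(fractions[0] * len(ids))
    n_val = round(fractions[1] * len(ids))
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    holdout = ids[n_train + n_val:]       # the remainder forms the holdout test set
    return train, val, holdout

# Hypothetical cohort of 1000 patient IDs.
train, val, holdout = split_patients(range(1000))
print(len(train), len(val), len(holdout))
```

Splitting on unique patient identifiers, rather than on individual exams, is what keeps repeat exams from the same patient out of more than one subset.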
2.2 Data source
Cardiac 82Rb PET exams were performed according to guidelines for MPI testing (16) and measurement of myocardial blood flow (MBF) (17,18) as previously described (19). Cardiac 99mTc-sestamibi SPECT exams were performed according to guidelines (20) as previously described (21). Left ventricular ejection fraction (LVEF) (22) and stress total perfusion deficit (TPD) (23) were routinely estimated in all nuclear images. Stress was induced pharmacologically with intravenous bolus administration of regadenoson (0.4 mg), by treadmill exercise using a Bruce, modified Bruce, or Cornell protocol, or by a combination of intravenous regadenoson and low-level treadmill exercise. All PET exams were conducted with regadenoson vasodilator stress, while SPECT exams employed a wide variety of stress protocols. Heart rate and systolic and diastolic blood pressure were monitored continuously during imaging.
Twelve-lead ECG waveforms of 10-second duration were recorded immediately before stress testing and at 1-minute intervals during stress (in supine position for pharmacologic or combination pharmacologic/low-level exercise stress) or once during each exercise stage (for exercise stress). For CNN model development, both rest and stress ECG and quantitative PET MFR data were used. For prognostic model evaluation, rest and stress ECG data were used to compute ECG-Flow with the CNN model, and stress MPI findings (LVEF and stress TPD) were used as covariates (Figure 1).
CONSORT diagram.
2.3 Outcomes
The binary outcome for supervised CNN model training was MFR < 2, which is a widely used threshold for prognostically significant MFR impairment (24,25).
The primary patient outcome for prognostic evaluation of the CNN was mortality from all causes between the date of imaging and 31 July 2020. The vital status of each patient was determined by integrating data from death certificates and hospital records.
2.4 Deep learning CNN model
Model development
ECG waveform data was used directly as multichannel time series input to a novel CNN model (Figure 2) constructed with two input channels for waveforms acquired during pharmacologically induced stress and at rest. Each input channel consisted of convolutional blocks that applied convolutions along either the time axis or the ECG channel axis (26) followed by batch normalization, ReLU activation, and max pooling. After two additional fully connected layers, the waveform outputs were concatenated followed by global average pooling. Finally, a fully connected layer with softmax activation was applied to compute a probability score which was thresholded at 50% for the binary outcome. We tested concatenation of demographic data (patient age, body mass index, and sex) and hemodynamics (systolic blood pressure and heart rate, both at stress and rest) before the final fully connected layers. We also tested reducing the number of input waveform leads from twelve to eight. The three augmented unipolar leads are linear combinations of the bipolar leads, and by Einthoven’s triangle equation only two of the three bipolar leads are linearly independent (27). This suggests that leads III, aVR, aVL, and aVF may be discarded if the CNN can effectively learn these linear features from the remaining leads. The final model consisted of 72,929 trainable parameters to be optimized (Supplemental Table 4). The model was developed using the Keras deep learning framework (v2.10) with the TensorFlow backend (v2.10).
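The redundancy of leads III, aVR, aVL, and aVF can be made concrete with the standard limb-lead relations (III = II − I, aVR = −(I + II)/2, aVL = I − II/2, aVF = II − I/2). The sketch below reconstructs the four discarded leads from leads I and II; the function name and the sample values are illustrative assumptions, not study data or study code.

```python
def derive_limb_leads(lead_i, lead_ii):
    """Reconstruct leads III, aVR, aVL, and aVF from leads I and II, sample by sample."""
    pairs = list(zip(lead_i, lead_ii))
    return {
        "III": [b - a for a, b in pairs],          # Einthoven: III = II - I
        "aVR": [-(a + b) / 2 for a, b in pairs],   # aVR = -(I + II) / 2
        "aVL": [a - b / 2 for a, b in pairs],      # aVL = I - II / 2
        "aVF": [b - a / 2 for a, b in pairs],      # aVF = II - I / 2
    }

# Toy samples in millivolts (made-up values).
lead_i = [0.10, 0.50, -0.20]
lead_ii = [0.30, 0.90, -0.10]
derived = derive_limb_leads(lead_i, lead_ii)

# The three augmented leads sum to zero at every sample, another view of the same redundancy.
residuals = [r + l + f for r, l, f in zip(derived["aVR"], derived["aVL"], derived["aVF"])]
```

Because these four leads are exact linear functions of leads I and II, an eight-lead input (I, II, V1 to V6) carries the same information as the full twelve-lead recording.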
(A) CNN model (number of trainable parameters: 72,929)
Model optimization and training
The model was optimized with the PET training and validation cohorts using a general principle of balancing all sources of regularization (28). Model training was performed on one NVIDIA V100 GPU with 16 GB of RAM. A binary cross-entropy cost function was utilized with the AdamW optimizer (29,30) to maximize validation accuracy. Two learning rate schedules were tested: linear warmup with cosine decay (31), and the “1cycle” learning rate scheduler (32). An initial learning rate range test was conducted as a function of weight decay to determine the optimal weight decay and maximum learning rate bound for the “1cycle” scheduler (Supplemental Figure 2). A batch size of 128 was chosen to maximize memory usage on the GPU. Training iterations were stopped when validation accuracy failed to improve for twelve consecutive epochs.
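Two of the training controls described above, linear warmup with cosine decay and early stopping with a patience of twelve epochs, can be sketched in plain Python. This is a framework-independent illustration under stated assumptions: the warmup length, step counts, and learning-rate values are hypothetical, the function names are invented for the example, and the "1cycle" schedule is not shown.

```python
import math

def warmup_cosine_lr(step, total_steps, warmup_steps, max_lr, min_lr=0.0):
    """Learning rate at a given step: linear ramp to max_lr, then cosine decay to min_lr."""
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))

def early_stopping_epoch(val_accuracies, patience=12):
    """Index of the epoch at which training stops, or None if the criterion is never met."""
    best = float("-inf")
    stale = 0                                  # consecutive epochs without improvement
    for epoch, acc in enumerate(val_accuracies):
        if acc > best:
            best, stale = acc, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None

# Hypothetical schedule: 100 steps, 10 warmup steps, peak learning rate 1e-3.
schedule = [warmup_cosine_lr(s, 100, 10, 1e-3) for s in range(101)]

# Hypothetical validation history: improvement for two epochs, then a plateau.
stop_at = early_stopping_epoch([0.60, 0.65] + [0.64] * 12)
```

In this toy history the best accuracy occurs at epoch index 1, and the twelfth consecutive non-improving epoch is index 13, which is where the loop stops.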
Model evaluation
Final model evaluation was performed with the PET holdout test cohort after completion of training and tuning. Model performance is reported as diagnostic accuracy of the binary outcome and area under the ROC curve (AUC) relative to the true outcome of PET-derived MFR < 2. Prognostic performance was also tested in the PET holdout test cohort, as well as the SPECT MPI cohorts with either pharmacologic or exercise-induced stress (Figure 1). The two SPECT cohorts were tested separately as the appropriateness of the model for exercise stress ECG data was uncertain and considered exploratory.
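For reference, the diagnostic measures reported for the binary outcome follow directly from a 2×2 confusion matrix, and AUC can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below uses made-up counts and scores, not the study's confusion matrix; the function names are assumptions for the example.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

def auc_from_scores(scores, labels):
    """AUC via pairwise comparison of positive and negative scores (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy counts (illustrative only): 50 positives and 50 negatives.
metrics = diagnostic_metrics(tp=40, fp=20, fn=10, tn=30)

# Toy scores: label 1 denotes impaired MFR (MFR < 2) in this example.
auc = auc_from_scores([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for a sketch; rank-based implementations are preferable for large cohorts.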
2.5 Statistical methods
For each prognostic cohort, Cox proportional hazards models and Kaplan-Meier survival curves were constructed to evaluate the association of ECG-predicted MFR impairment with mortality risk. Cox models were adjusted for baseline risk factors and standard MPI findings selected a priori on the basis of judgment and prior work (24,33). Baseline covariates included patient age, sex, body mass index (BMI), diabetes, hypertension, hyperlipidemia, known coronary artery disease (CAD, history of myocardial infarction (MI) or previous percutaneous coronary intervention (PCI) or coronary artery bypass graft), LVEF, and stress TPD as a combined relative measure of ischemia or scar. Subgroup analysis was performed stratifying by age (< 60 y), BMI (< 30 kg/m2), sex, diabetes, dyslipidemia, hypertension, known CAD, LVEF (< 50%), and stress TPD (< 5%).
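The Kaplan-Meier product-limit estimator underlying the survival curves can be sketched in a few lines. This is a minimal illustration with made-up follow-up times, assuming the usual convention that subjects censored at a given time are still at risk for deaths at that time; the study itself used the R survival package, and the function name here is an assumption.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct time with at least one death."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0                    # deaths plus censored subjects at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Toy follow-up data in years; event = 1 for death, 0 for censoring (illustrative values).
curve = kaplan_meier(times=[1, 2, 2, 3, 4], events=[1, 1, 0, 0, 1])
```

Cumulative incidence, as plotted in mortality incidence curves, is one minus the survival probability at each time.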
Model discrimination was assessed using likelihood ratio χ2 and c-index. The change in discrimination after adding ECG-predicted MFR to the baseline Cox models was assessed with continuous net reclassification improvement (NRI) (34).
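The two discrimination measures can be illustrated as follows. Harrell's c-index is the fraction of comparable patient pairs in which the patient who died earlier had the higher predicted risk; continuous NRI sums the net proportion of events whose predicted risk moved up and the net proportion of non-events whose predicted risk moved down after adding the new marker. The sketch is a simplified assumption-laden version: the NRI function treats the outcome as binary and ignores censoring, the data are made up, and the function names are invented for the example.

```python
def harrell_c_index(times, events, risks):
    """Concordance between predicted risk and observed survival order (risk ties count one half)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue                   # a pair is anchored on an observed death
        for j in range(n):
            if times[i] < times[j]:    # subject j was known to outlive subject i
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

def continuous_nri(old_risk, new_risk, events):
    """Category-free NRI for a binary outcome: event NRI plus non-event NRI."""
    up_e = down_e = n_e = up_ne = down_ne = n_ne = 0
    for old, new, event in zip(old_risk, new_risk, events):
        if event:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_ne += 1
            up_ne += new > old
            down_ne += new < old
    return (up_e - down_e) / n_e + (down_ne - up_ne) / n_ne

# Toy data (illustrative values only).
c_index = harrell_c_index(times=[2, 4, 6, 8], events=[1, 1, 0, 1], risks=[0.9, 0.4, 0.5, 0.1])
nri = continuous_nri(old_risk=[0.3, 0.5, 0.4, 0.2], new_risk=[0.5, 0.6, 0.3, 0.3], events=[1, 1, 0, 0])
```

Continuous NRI ranges from −2 to 2, with positive values indicating that the added marker moves predicted risk in the correct direction more often than not.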
Continuous variables are summarized as mean ± SD or median [1st – 3rd quartiles]. Welch’s unequal variances t-test or Wilcoxon rank sum tests were used as appropriate for comparisons of continuous parameters, and chi-squared tests were used to compare categorical variables. Two-sided p-values less than 0.05 were considered statistically significant. Statistical analyses were performed using R version 4.1 (35) with packages survival (36), rms (37), and survminer (38).
3 Results
3.1 Patient population
Of 12,416 patients in the MPI-ECG registry, 4,854 patients underwent PET-CT, and 7,303 underwent SPECT-CT (Figure 1). A PET MPI subset of 963 patients were held out for diagnostic evaluation. These subjects as well as 5,102 patients undergoing pharmacologic stress SPECT MPI and 1,533 patients undergoing exercise stress SPECT MPI were also evaluated prognostically.
3.2 Baseline characteristics
Baseline characteristics are shown in Table 1 stratified by patient cohort. As expected, the PET MPI cohorts for Model Derivation and Holdout Test were nearly identical. Not surprisingly, the two SPECT MPI cohorts (i.e., pharmacologic/combined pharmacologic-exercise and exercise-only stress) differed meaningfully from the PET MPI cohort. They were slightly older, with lower rates of obesity, diabetes, and history of MI, and higher rates of hypertension, hyperlipidemia, and mortality than the PET cohorts. Baseline characteristics of the SPECT MPI cohort stratified by stress modality are shown in Supplemental Table 1.
3.3 ECG-predicted MFR impairment
Overall, 47.9% (N=461) of patients in the PET holdout test set had PET-measured MFR < 2. In this cohort, the ECG-Flow CNN model identified MFR < 2 with an accuracy of 68.0%, AUC of 0.737, sensitivity and specificity of 71.1% and 65.7%, positive and negative predictive values of 65.6% and 70.5%, and F1-score of 68.3% (Figure 3).
Accuracy of ECG-predicted MFR (ECG-Flow) in the PET holdout test cohort. ROC curve and confusion matrix (N=963 patients).
3.4 Prognostic assessment
Abnormal ECG-Flow was significantly and strongly associated with risk of death from any cause in adjusted Cox models of all three prognostic cohorts (Table 2). Significantly improved c-index (from 0.662 to 0.688; and from 0.671 to 0.697) and overall net reclassification improvement (0.324 [0.146, 0.505]; and 0.460 [0.388, 0.535]) were also noted in the PET holdout and SPECT pharmacologic stress cohorts (Table 2). Adjusted hazard ratios were similar between the PET holdout (HR 2.12 [1.45, 3.10], p = 0.0001) and SPECT MPI (HR 2.07 [1.82, 2.36], p < 0.0001) cohorts, as well as between ECG-Flow and an adjusted model of PET-measured MFR < 2 (HR 3.20 [2.19, 4.68], p < 0.0001) in the PET holdout cohort (Table 2). Overall NRI was significantly increased in both the PET and pharmacologic stress SPECT cohorts (Table 3). Intriguingly, ECG-Flow also performed well in the exploratory SPECT exercise stress cohort (HR 4.33 [2.47, 7.58], p < 0.0001) (Supplemental Table 2). However, overall NRI was not significantly greater than zero in this population (Table 3), possibly due to the much lower event rate (6%) in this broadly healthier population. Kaplan-Meier incidence curves are shown in Figure 4 and Supplemental Figure 1, demonstrating significantly increased mortality risk in patients with abnormal ECG-Flow. In addition, the mortality risks associated with PET-measured MFR and ECG-Flow in the PET holdout test cohort are compared in Figure 4 (a).
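As a consistency check on reported hazard ratios, a Wald-type 95% confidence interval is symmetric on the log scale, so the hazard ratio should lie near the geometric mean of its bounds, and the standard error and p-value can be approximately recovered from the interval. The sketch below applies this to the PET holdout estimate (HR 2.12 [1.45, 3.10]). It assumes that the reported intervals are Wald intervals with a normal approximation, which the text does not state explicitly; the function name is invented for the example.

```python
import math

def wald_summary(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-scale SE, z statistic, and two-sided p-value from an HR and its 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))       # two-sided tail area of the standard normal
    midpoint = math.sqrt(ci_low * ci_high)     # geometric mean of the interval bounds
    return se, z, p, midpoint

# PET holdout cohort values as reported in the text.
se, z, p, midpoint = wald_summary(hr=2.12, ci_low=1.45, ci_high=3.10)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.5f}, geometric midpoint = {midpoint:.2f}")
```

Under these assumptions the geometric midpoint of the interval is close to the reported hazard ratio and the recovered p-value is on the order of 0.0001, in line with the reported value.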
Kaplan-Meier plots of all-cause mortality incidence (a) in the PET holdout test cohort stratified by ECG-predicted (ECG-Flow) and PET-measured MFR; and (b) in the SPECT pharmacologic stress cohort stratified by ECG-predicted MFR.
Cox models of impaired MFR and risk of all-cause mortality for the PET holdout test and SPECT pharmacologic stress cohorts. Cox models were adjusted for patient age, BMI, sex, diabetes, dyslipidemia, hypertension, known CAD, LVEF, and stress TPD. For each model, either ECG-predicted (ECG-Flow) or PET-measured MFR impairment (< 2.0) were added to this baseline model.
Continuous Net Reclassification Improvement (NRI) within each cohort. In each case, ECG-predicted MFR impairment was added to the base model of demographic covariates, risk factors, known CAD, LVEF, and stress TPD.
3.5 Subgroup analysis
In subgroups stratified by patient characteristics, abnormal ECG-Flow was consistently a strong predictor across subgroups and remained significantly associated with higher mortality risk in the PET holdout test cohort (Table 4) and both pharmacologic (Table 5) and exercise stress (Supplemental Table 3) SPECT cohorts. ECG-Flow was a significant independent predictor of mortality in women, patients with diabetes, hypertension, LV dysfunction (LVEF < 50%), and those with minimal or no stress perfusion abnormality (TPD < 5%) representing patients with likely diffuse CMD.
Subgroup analysis of ECG-predicted MFR (ECG-Flow) and risk of all-cause mortality in the PET holdout test cohort. Models were adjusted for patient age, BMI, sex, diabetes, dyslipidemia, hypertension, known CAD, LVEF, and stress TPD.
Subgroup analysis of ECG-predicted MFR (ECG-Flow) and risk of all-cause mortality in the SPECT pharmacologic stress cohort. Models were adjusted for patient age, BMI, sex, diabetes, dyslipidemia, hypertension, known CAD, LVEF, and stress TPD.
4 Discussion
To our knowledge, we have developed and validated the first deep learning CNN model (ECG-Flow) with the ability to predict impaired MFR from stress-rest ECG waveform data, demonstrating that CMD pathophysiology can result in characteristic electrophysiologic changes detectable in surface ECG tracings. As further clinical evaluation of the CNN model, we demonstrated consistently strong prognostic performance of ECG-Flow in two distinct clinical populations: a PET holdout test cohort for which PET-measured MFR was available, and a large independent SPECT MPI cohort for which MFR measurements were not possible (Table 2). Our results demonstrate that impaired ECG-Flow is independently associated with higher risk of death after adjusting for clinical risk factors and standard MPI measurements of LV dysfunction (LVEF) and stress perfusion abnormality (TPD). In both cohorts, impaired ECG-Flow improved the prognostic performance of standard MPI in terms of c-index and continuous net reclassification improvement. In subgroup analysis of patients with minimal or no stress perfusion abnormalities, ECG-Flow remained a predictor of adverse outcomes, consistent with its identification with CMD (39). The results support our hypothesis that signals predictive of coronary microvascular/vasomotor dysfunction are embedded in stress and rest ECG waveforms and suggest that further CNN model development and external validation are warranted in larger multicenter datasets.
The availability of large ECG databases has recently enabled the development of a plethora of machine learning applications that have further extended the clinical utility of ECG signals (13,14). These advanced applications have proven useful for diagnosis and assessment of arrhythmias (40,41), congestive heart failure (42), valvular disease (43), left (26) and right (44) ventricular dysfunction, coronary artery disease (45), myocardial infarction (46,47), and mortality risk (48,49). Resting ECG data was used exclusively in all of these studies. In contrast, the present study also employed stress ECG data, and its results underline the potential of stress perturbations to reveal important insights relevant to coronary microvascular and vasomotor dysfunction (50).
Ahmad et al. (51) performed a ground-breaking study on a similar question. Using logistic regression, they built a prediction model using rest ECG data to identify patients with CMD identified with invasive coronary vasomotor testing. Although such invasive testing is precise, it is rarely performed and is subject to substantial referral bias. Given the limited sample size, simpler model architecture, and lack of stress perturbation data, the performance of their prediction model was only moderate (51). We believe the higher performance of our approach may in large part be due to utilization of ECGs both at rest and in response to stress perturbation. Prior studies have demonstrated that broad metabolic and physiologic changes are detectable with a single episode of exercise (50).
Several clinical implications follow from these results. First, standard noninvasive SPECT MPI is not a sensitive test for identifying CMD (52). Although direct SPECT measurement of MBF and MFR with contemporary CZT SPECT camera technology is under active development and has shown promise (53–58), the limitations of currently available 99mTc-based perfusion tracers may impede the clinical utility of SPECT-measured MFR (59). However, as shown in Figure 4 (a), ECG-Flow, when combined with standard MPI findings, provided risk assessment approaching that of PET-measured MFR. This suggests that CMD assessment by standard clinical SPECT MPI can potentially be greatly improved without additional data acquisition or alteration of standard MPI protocols.
Second, although the CNN model was developed exclusively using pharmacologic stress PET, the prognostic performance of ECG-predicted MFR in the exploratory SPECT exercise stress cohort (Supplemental Table 2) indicates that our CNN model may find further utility across more diverse patient populations. For example, the longer half-life of the experimental perfusion tracer 18F-flurpiridaz enables exercise stress protocols for PET that are analogous to those of SPECT MPI. However, when exercise is performed, direct MFR measurement is generally not feasible for such PET MPI exams. Consequently, the ECG-Flow approach could be combined with exercise PET to obtain both functional capacity and MFR status. Finally, the present study did not evaluate the diagnostic performance of the CNN model for detection of CAD. We speculate that ECG-predicted MFR impairment could potentially improve the diagnostic accuracy of MPI in the presence of high risk left main or multi-vessel obstructive CAD, as seen for PET-measured MFR in prior studies (39,60,61).
4.1 Future studies
We have demonstrated the feasibility of predicting impaired MFR using a relatively simple CNN with only 72,929 trainable parameters. We expect that CNN model performance may be further improved by additional machine learning technologies such as larger CNN architectures (41), self-supervised model pre-training (62), and transformer-based frameworks (40,63). A further possible extension would be to train the CNN using only rest ECG waveforms or reduced leads, which could enable use of ambulatory ECG monitoring data. If successful, such a model could enable evaluation of cohorts from clinical trials of novel cardiovascular therapeutics where resting or ambulatory ECG data were acquired without cardiac imaging.
4.2 Limitations
Global MFR does not distinguish CMD from myocardial perfusion impairments due to focal epicardial or diffuse CAD. However, recent work has shown that regional quantitative PET measures combined with global MFR can be effective at assessing the additive risk of diffuse or microvascular disease (64). Additional work will be necessary to evaluate the utility of ECG-Flow for similarly detailed CMD assessment.
4.3 Conclusion
ECG waveforms at rest and during stress carry important clinical information on coronary vasomotor and microvascular function. We have developed a novel proof-of-concept deep learning CNN model which uses ECG data to identify MFR impairment with reasonable clinical accuracy and strong prognostic value, approaching that of directly measured MFR.
Data Availability
Individual subject level data underlying this article cannot be shared publicly due to medical data privacy regulations.
6 Funding
VLM is supported by grants R01AG059729 from the National Institute on Aging, U01DK123013 from the National Institute of Diabetes and Digestive and Kidney Diseases, and R01HL136685 from the National Heart, Lung, and Blood Institute as well as the Melvyn Rubenfire Professorship in Preventive Cardiology.
7 Competing Interests
JBM, APR, JMR, TH, and MDV are employees of INVIA. JMR is a consultant for Jubilant Radiopharma and receives royalties from the licensing of FlowQuant software. EPF is a stockholder in INVIA. VLM has received research grants and consulting fees from Siemens Healthineers and serves as a scientific advisor for Ionetix and owns stock options in the same. He owns stock in GE and Cardinal Health, and has received payments for consulting from INVIA.
8 Ethics Approval
This is an observational study and informed consent was waived under an exemption from the University of Michigan Institutional Review Board.
12 Supplemental Data
12.1 Supplemental Tables
Cox models of ECG-predicted MFR (ECG-Flow) and risk of all-cause mortality for the SPECT exercise stress cohort. Cox models were adjusted for patient age, BMI, sex, diabetes, dyslipidemia, hypertension, known CAD, LVEF, and stress TPD. For each model, ECG-predicted or PET-measured MFR < 2.0 were added to this baseline model.
Subgroup analysis of ECG-predicted MFR (ECG-Flow) and risk of all-cause mortality in the SPECT exercise stress cohort. Models were adjusted for patient age, BMI, sex, diabetes, dyslipidemia, hypertension, known CAD, LVEF, and stress TPD.
Optimization of the ECG-Flow CNN using the PET validation cohort (N=971). The Baseline network included input channels for 12-lead rest and stress ECG waveforms. The Channel Reduction network reduced the input waveforms to 8 leads by removing redundant leads (III, aVR, aVL, and aVF). In the + Demographics network, demographic data (patient age, sex, BMI) were concatenated before the final layer of the Channel Reduction network. In the + Hemodynamics* network, the heart rate and systolic blood pressure at rest and during stress were concatenated before the final layer of the + Demographics network.
12.2 Supplemental Figures
Kaplan-Meier plot of all-cause mortality incidence in the SPECT exercise stress cohort stratified by ECG-predicted MFR.
Learning rate range test as a function of weight decay for the AdamW optimizer.
5 Acknowledgement
The authors acknowledge the Regents of the University of Michigan for the use of de-identified clinical data for this study.