medRxiv

rECHOmmend: an ECG-based machine-learning approach for identifying patients at high-risk of undiagnosed structural heart disease detectable by echocardiography

Alvaro E. Ulloa-Cerna, Linyuan Jing, John M. Pfeifer, Sushravya Raghunath, Jeffrey A. Ruhl, Daniel B. Rocha, Joseph B. Leader, Noah Zimmerman, Greg Lee, Steven R. Steinhubl, Christopher W. Good, Christopher M. Haggerty, Brandon K. Fornwalt, Ruijun Chen
doi: https://doi.org/10.1101/2021.10.06.21264669
Alvaro E. Ulloa-Cerna
1Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA
PhD
Linyuan Jing
1Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA
PhD
John M. Pfeifer
1Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA
2Heart and Vascular Center, Evangelical Hospital, Lewisburg, PA, USA
4Tempus Labs Inc, Chicago, IL, USA
MD
Sushravya Raghunath
1Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA
PhD
Jeffrey A. Ruhl
1Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA
MS
Daniel B. Rocha
3Phenomic Analytics and Clinical Data Core, Geisinger, Danville, PA, USA
MM
Joseph B. Leader
3Phenomic Analytics and Clinical Data Core, Geisinger, Danville, PA, USA
BA
Noah Zimmerman
4Tempus Labs Inc, Chicago, IL, USA
PhD
Greg Lee
4Tempus Labs Inc, Chicago, IL, USA
BS
Steven R. Steinhubl
4Tempus Labs Inc, Chicago, IL, USA
5Scripps Research Translational Institute, La Jolla, CA, USA
MD
Christopher W. Good
1Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA
6UPMC Heart and Vascular Institute at UPMC, Hamot, PA, USA
DO
Christopher M. Haggerty
1Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA
7Heart Institute, Geisinger, Danville, PA, USA
PhD
Brandon K. Fornwalt
1Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA
4Tempus Labs Inc, Chicago, IL, USA
7Heart Institute, Geisinger, Danville, PA, USA
8Department of Radiology, Geisinger, Danville, PA, USA
MD
Ruijun Chen
1Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA
9Department of Medicine, Geisinger, Danville, PA, USA
MD
  • For correspondence: ruijun.chen{at}gmail.com

Abstract

Background Early diagnosis of structural heart disease improves patient outcomes, yet many patients remain undiagnosed. While population screening with echocardiography is impractical, electrocardiogram (ECG)-based prediction models can help target high-risk patients. We developed a novel ECG-based machine learning approach to predict multiple structural heart conditions, hypothesizing that a composite model would yield higher prevalence and positive predictive values (PPVs) to facilitate meaningful recommendations for echocardiography.

Methods Using 2,232,130 ECGs linked to electronic health records and echocardiography reports from 484,765 adults between 1984 and 2021, we trained machine learning models to predict the presence of any of seven echocardiography-confirmed diseases within one year. This composite label included: moderate or severe valvular disease (aortic or mitral stenosis, aortic or mitral regurgitation, tricuspid regurgitation), reduced ejection fraction <50%, or interventricular septal thickness >15 mm. We tested various combinations of input features (demographics, labs, structured ECG data, ECG traces) and evaluated model performance in three ways: 5-fold cross-validation; multi-site validation, with training on one clinical site and testing on 11 other independent sites; and simulated retrospective deployment, with training on pre-2010 data and deployment in 2010.
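The composite label described above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not the authors' code: the `EchoFindings` record, its field names, and the four-level severity scale are assumptions made for the example, and the one-year window between ECG and echocardiogram is not modelled.

```python
from dataclasses import dataclass

# Hypothetical severity scale; the abstract only states "moderate or severe".
SEVERITY = {"none": 0, "mild": 1, "moderate": 2, "severe": 3}


@dataclass
class EchoFindings:
    """Illustrative echocardiography record (field names are assumptions)."""
    aortic_stenosis: str = "none"
    aortic_regurgitation: str = "none"
    mitral_stenosis: str = "none"
    mitral_regurgitation: str = "none"
    tricuspid_regurgitation: str = "none"
    ejection_fraction_pct: float = 60.0
    septal_thickness_mm: float = 10.0


def composite_label(echo: EchoFindings) -> bool:
    """True if any of the seven conditions named in the abstract is present."""
    valves = (
        echo.aortic_stenosis,
        echo.aortic_regurgitation,
        echo.mitral_stenosis,
        echo.mitral_regurgitation,
        echo.tricuspid_regurgitation,
    )
    # Moderate or severe disease in any of the five valve findings.
    valvular = any(SEVERITY[v] >= SEVERITY["moderate"] for v in valves)
    reduced_ef = echo.ejection_fraction_pct < 50    # ejection fraction <50%
    thick_septum = echo.septal_thickness_mm > 15    # septal thickness >15 mm
    return valvular or reduced_ef or thick_septum
```

Because the label is a logical OR over its components, its prevalence is at least as high as that of the most common single component.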

Findings Our composite “rECHOmmend” model using age, sex and ECG traces had an area under the receiver operating characteristic curve (AUROC) of 0.91 and a PPV of 42% at 90% sensitivity at a prevalence of 17.9% for our composite label. Individual disease models had AUROCs ranging from 0.86 to 0.93 and lower PPVs, from 1% to 31%. The AUROC for models using different input features ranged from 0.80 to 0.93, increasing with additional features. Multi-site validation showed similar results to the cross-validation, with an aggregate AUROC of 0.91 across our independent test set of 11 clinical sites after training on a separate site. Our simulated retrospective deployment showed that, of ECGs acquired in a single year (2010) from patients without pre-existing known structural heart disease, 11% were classified as high-risk, and 41% of those patients developed true, echocardiography-confirmed disease within one year.
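The link between prevalence and PPV in these figures follows from Bayes' rule. The sketch below is a back-of-envelope calculation, not an analysis from the paper: the function names are hypothetical, and the specificity it derives is implied by the three reported numbers (PPV 42%, sensitivity 90%, prevalence 17.9%) rather than stated by the authors.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def implied_specificity(ppv_value: float, sensitivity: float, prevalence: float) -> float:
    """Solve the PPV formula for specificity (an inferred, not reported, value)."""
    true_pos = sensitivity * prevalence
    false_pos = true_pos * (1 - ppv_value) / ppv_value
    return 1 - false_pos / (1 - prevalence)


# Operating point reported for the composite model.
spec = implied_specificity(ppv_value=0.42, sensitivity=0.90, prevalence=0.179)
print(round(spec, 2))  # 0.73

# Same sensitivity and specificity applied to a hypothetical 1% prevalence:
# the PPV falls to a few percent.
print(ppv(sensitivity=0.90, specificity=spec, prevalence=0.01))
```

Holding sensitivity and specificity fixed, PPV drops sharply as prevalence falls. This is consistent with the lower PPVs reported for the single-disease models, although each of those models has its own operating characteristics.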

Interpretation An ECG-based machine learning model using a composite endpoint can predict previously undiagnosed, clinically significant structural heart disease while outperforming single disease models and improving practical utility with higher PPVs. This approach can facilitate targeted screening with echocardiography to reduce under-diagnosis of structural heart disease.

Competing Interest Statement

Geisinger investigators (AUC, LJ, SR, JAR, DBR, JBL, CMH, RC) receive funding from Tempus for ongoing development of predictive modeling technology. Tempus and Geisinger have jointly applied for predictive modeling patents. None of the Geisinger investigators have ownership interest in any of the intellectual property resulting from the partnership. Tempus did not have any input in the design, execution, interpretation of results or decision to publish. JMP, NZ, GL, and BKF are Tempus employees. SRS is a consultant for Tempus. SRS is also an employee of physIQ and reports personal fees from Otsuka and Janssen, outside the submitted work. BKF reports personal fees from Novartis, outside the submitted work.

Funding Statement

This work is supported by a grant from Tempus.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The Institutional Review Board of Geisinger approved this study with a waiver of consent.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

All intermediate, subgroup, and aggregate results are publicly available online as a searchable dashboard. Patient-level data are not available for the Geisinger data set. Requests for code or data can be made to the corresponding author.

http://rechommend.herokuapp.com/

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Posted October 07, 2021.

medRxiv 2021.10.06.21264669; doi: https://doi.org/10.1101/2021.10.06.21264669

Subject Area

  • Cardiovascular Medicine