ABSTRACT
Objective Artificial intelligence (AI) can detect heart disease from images of electrocardiograms (ECGs). However, traditional supervised learning is limited by the need for large amounts of labeled data. We report the development of Biometric Contrastive Learning (BCL), a self-supervised pretraining approach for label-efficient deep learning on ECG images.
Materials and Methods Using pairs of ECGs from 78,288 individuals from Yale (2000-2015), we trained a convolutional neural network to identify temporally separated ECG pairs from the same patient that varied in layout. We fine-tuned BCL-pretrained models to detect atrial fibrillation (AF), gender, and left ventricular ejection fraction (LVEF) <40%, using ECGs from 2015-2021. We externally tested the models in cohorts from Germany and the US. We compared BCL with random initialization and general-purpose self-supervised contrastive learning for images (simCLR).
Results With 100% of the labeled training data, BCL performed similarly to other approaches for detecting AF/Gender/LVEF<40%, with area under the receiver operating characteristic curve (AUROC) of 0.98/0.90/0.90 in the held-out test sets. It consistently outperformed other methods with smaller proportions of labeled data, reaching equivalent performance at 50% of data. With 0.1% of the data, BCL achieved AUROC of 0.88/0.79/0.75, compared with 0.51/0.52/0.60 (random) and 0.61/0.53/0.49 (simCLR). In external validation, BCL outperformed other methods even at 100% labeled training data, with AUROC of 0.88/0.88 for Gender and LVEF<40% compared with 0.83/0.83 (random) and 0.84/0.83 (simCLR).
Discussion and Conclusion A pretraining strategy that leverages biometric signatures of different ECGs from the same patient enhances the efficiency of developing AI models for ECG images. This represents a major advance in detecting disorders from ECG images with limited labeled data.
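The pretraining idea described under Materials and Methods, treating two ECG images from the same patient as a positive pair, can be illustrated with a small contrastive loss. This is a minimal sketch only: the abstract does not state the loss function used for BCL, so the example assumes an NT-Xent-style objective of the kind used by simCLR, and the function name, the temperature value, and the embedding inputs are all hypothetical.

```python
import numpy as np

def patient_pair_contrastive_loss(emb_a, emb_b, temperature=0.1):
    """NT-Xent-style loss (an assumption, not the authors' stated objective).

    Row i of emb_a and row i of emb_b are embeddings of two different
    ECG images from the same hypothetical patient i. Every other row in
    the batch is treated as a negative.
    """
    n = emb_a.shape[0]
    z = np.concatenate([emb_a, emb_b], axis=0).astype(np.float64)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-length embeddings

    sim = (z @ z.T) / temperature      # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)     # an embedding is never its own pair

    # Positive for row i is row i + n, and vice versa.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(
        np.exp(sim - row_max).sum(axis=1, keepdims=True)
    )
    return float(-log_prob[np.arange(2 * n), pos].mean())

# Toy usage: synthetic "patients" with two noisy views each.
rng = np.random.default_rng(0)
patients = rng.normal(size=(8, 16))
view_a = patients + 0.01 * rng.normal(size=patients.shape)
view_b = patients + 0.01 * rng.normal(size=patients.shape)
loss_matched = patient_pair_contrastive_loss(view_a, view_b)
loss_mismatched = patient_pair_contrastive_loss(view_a, view_b[::-1])
```

In this toy setup, correctly matched pairs yield a lower loss than deliberately mismatched ones, which is the signal a pretraining network would be optimized on. In the study itself the embeddings would come from a convolutional neural network applied to ECG images in varying layouts, which this sketch does not model.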
Competing Interest Statement
Dr. Mortazavi reported receiving grants from the National Institute of Biomedical Imaging and Bioengineering, National Heart, Lung, and Blood Institute, US Food and Drug Administration, and the US Department of Defense Advanced Research Projects Agency outside the submitted work; in addition, Dr. Mortazavi has a pending patent on predictive models using electronic health records (US20180315507A1). Mr. Sangha and Dr. Khera are the coinventors of U.S. Provisional Patent Application No. 63/346,610, Articles and methods for format-independent detection of hidden cardiovascular disease from printed electrocardiographic images using deep learning. Dr. Khera receives support from the National Heart, Lung, and Blood Institute of the National Institutes of Health (under award K23HL153775) and the Doris Duke Charitable Foundation (under award 2022060). He receives support from the Blavatnik Foundation through the Blavatnik Fund for Innovation at Yale. He also receives research support, through Yale, from Bristol-Myers Squibb and Novo Nordisk. He is an Associate Editor at JAMA. In addition to 63/346,610, Dr. Khera is a coinventor of U.S. Provisional Patent Applications 63/177,117, 63/428,569, and 63/484,426. He is also a founder of Evidence2Health, a precision health platform to improve evidence-based cardiovascular care.
Funding Statement
This study was supported by research funding awarded to Dr. Khera by the Yale School of Medicine and grant support from the National Heart, Lung, and Blood Institute of the National Institutes of Health under the award K23HL153775. The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The study was reviewed by the Yale Institutional Review Board, which approved the study protocol and waived the need for informed consent as the study represents secondary analysis of existing data.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
The data cannot be shared publicly.