RT Journal Article
SR Electronic
T1 GenECG: A synthetic image-based ECG dataset to augment artificial intelligence-enhanced algorithm development
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2023.12.27.23300581
DO 10.1101/2023.12.27.23300581
A1 Bodagh, Neil
A1 Tun, Kyaw Soe
A1 Barton, Adam
A1 Javidi, Malihe
A1 Rashid, Darwon
A1 Burns, Rachel
A1 Kotadia, Irum
A1 Klis, Magda
A1 Gharaviri, Ali
A1 Vigneswaran, Vinush
A1 Niederer, Steven
A1 O’Neill, Mark
A1 Bernabeu, Miguel O
A1 Williams, Steven E
YR 2023
UL http://medrxiv.org/content/early/2023/12/29/2023.12.27.23300581.abstract
AB Artificial intelligence-enhanced electrocardiogram (AI-ECG) analysis has the potential to transform care of cardiovascular disease patients. Most algorithms rely on digitised signal data and are unable to analyse paper-based ECGs, which remain in use in numerous clinical settings. An image-based ECG dataset incorporating artefacts common to paper-based ECGs, which are typically scanned or photographed into electronic health records, could facilitate development of clinically useful image-based algorithms. This paper describes the creation of GenECG, a high-fidelity, synthetic image-based dataset containing 21,799 ECGs with artefacts encountered in routine care. Iterative clinical Turing tests confirmed the realism of the synthetic ECGs: expert observer accuracy of discrimination between real-world and synthetic ECGs fell from 63.9% (95% CI: 58.0%-69.8%) to 53.3% (95% CI: 48.6%-58.1%) over three rounds of testing, indicating that observers could not distinguish between synthetic and real ECGs. GenECG is the first publicly available synthetic image-based ECG dataset to pass a clinical Turing test.
The dataset will enable image-based AI-ECG algorithm development, ensuring the translation of AI-ECG research developments to the clinical workspace.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: The study was funded by a University of Edinburgh Wellcome Trust iTPA award. The authors acknowledge the support of the British Heart Foundation Centre for Research Excellence Award III (RE/18/5/34216). The authors acknowledge the support of the British Heart Foundation (RG/20/4/34803). SEW is supported by the British Heart Foundation (FS/20/26/34952). IK is supported by the British Heart Foundation (FS/CRTF/21/24166).
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The Ethics committee/IRB of King's College London gave ethical approval for the clinical Turing tests performed as part of this study (LRS-22/23-38259). I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes.
The ECG images described in the study were created from the PTB-XL database. The ECG images will be used for a British Heart Foundation Data Science Centre open challenge (https://bhfdatasciencecentre.org/areas/unstructured-data/imaging-open-challenge). Following this challenge, Dataset A and B will be made publicly available via a Creative Commons license.