RT Journal Article SR Electronic T1 Utilizing multimodal AI to improve genetic analyses of cardiovascular traits JF medRxiv FD Cold Spring Harbor Laboratory Press SP 2024.03.19.24304547 DO 10.1101/2024.03.19.24304547 A1 Zhou, Yuchen A1 Cosentino, Justin A1 Yun, Taedong A1 Biradar, Mahantesh I. A1 Shreibati, Jacqueline A1 Lai, Dongbing A1 Schwantes-An, Tae-Hwi A1 Luben, Robert A1 McCaw, Zachary A1 Engmann, Jorgen A1 Providencia, Rui A1 Schmidt, Amand Floriaan A1 Munroe, Patricia A1 Yang, Howard A1 Carroll, Andrew A1 Khawaja, Anthony P. A1 McLean, Cory Y. A1 Behsaz, Babak A1 Hormozdiari, Farhad YR 2024 UL http://medrxiv.org/content/early/2024/03/21/2024.03.19.24304547.abstract AB Electronic health records, biobanks, and wearable biosensors contain multiple high-dimensional clinical data (HDCD) modalities (e.g., ECG, photoplethysmography (PPG), and MRI) for each individual. Access to multimodal HDCD provides a unique opportunity for genetic studies of complex traits because different modalities relevant to a single physiological system (e.g., the circulatory system) encode complementary and overlapping information. We propose a novel multimodal deep learning method, M-REGLE, for discovering genetic associations from a joint representation of multiple complementary HDCD modalities. We showcase the effectiveness of this model by applying it to several cardiovascular modalities. M-REGLE jointly learns a lower-dimensional representation (i.e., latent factors) of multimodal HDCD using a convolutional variational autoencoder, performs genome-wide association studies (GWAS) on each latent factor, then combines the results to study the genetics of the underlying system.
To validate the advantages of M-REGLE and multimodal learning, we apply it to common cardiovascular modalities (PPG and ECG) and compare its results to unimodal learning methods, in which representations are learned from each data modality separately but the downstream genetic analyses are performed on the combined unimodal representations. M-REGLE identifies 19.3% more loci on the 12-lead ECG dataset and 13.0% more loci on the ECG lead I + PPG dataset, and its genetic risk score significantly outperforms the unimodal risk score at predicting cardiac phenotypes, such as atrial fibrillation (Afib), in multiple biobanks. Competing Interest Statement: Y.Z., J.C., T.Y., Z.R.M., J.B., H.Y., A.C., C.Y.M., B.B., and F.H. are current or former employees of Google, and own Alphabet stock as part of the standard compensation package. Funding Statement: Y.Z., J.C., T.Y., Z.R.M., J.B., H.Y., A.C., C.Y.M., B.B., and F.H. are current or former employees of Google and received salary, bonus, and stock awards as part of the standard compensation package. A.P.K. is supported by a UK Research and Innovation Future Leaders Fellowship, an Alcon Research Institute Young Investigator Award, and a Lister Institute for Preventive Medicine Award. P.B.M. acknowledges the support of the National Institute for Health and Care Research Barts Biomedical Research Centre (NIHR203330). Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Advarra IRB (Columbia, MD) waived ethical approval for this work involving de-identified medical imagery and metadata under 45 CFR 46.
Work related to genomics data was additionally reviewed by the respective data sources: UK Biobank, EPIC Norfolk, Indiana Biobank, and British Women's Heart and Health Study. This research has been conducted using the UK Biobank Resource under Application Number 65275. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes. Datasets used in this study (UK Biobank, EPIC Norfolk, Indiana Biobank, and British Women's Heart and Health Study) are available to qualified researchers who apply for access through each dataset's maintainers. Open-source code and trained model weights are available at https://github.com/Google-Health/genomics-research under the mregle directory. M-REGLE values of UK Biobank individuals will be returned to UK Biobank and will be made available by UK Biobank (https://www.ukbiobank.ac.uk/).