PT - JOURNAL ARTICLE
AU - Sanjarbek Hudaiberdiev
AU - D. Leland Taylor
AU - Wei Song
AU - Narisu Narisu
AU - Redwan M. Bhuiyan
AU - Henry J. Taylor
AU - Tingfen Yan
AU - Amy J. Swift
AU - Lori L. Bonnycastle
AU - DIAMANTE Consortium
AU - Michael L. Stitzel
AU - Michael R. Erdos
AU - Ivan Ovcharenko
AU - Francis S. Collins
TI - Modeling islet enhancers using deep learning identifies candidate causal variants at loci associated with T2D and glycemic traits
AID - 10.1101/2022.05.13.22275035
DP - 2022 Jan 01
TA - medRxiv
PG - 2022.05.13.22275035
4099 - http://medrxiv.org/content/early/2022/05/16/2022.05.13.22275035.short
4100 - http://medrxiv.org/content/early/2022/05/16/2022.05.13.22275035.full
AB - Genetic association studies have identified hundreds of independent signals associated with type 2 diabetes (T2D) and related traits. Despite these successes, the identification of specific causal variants underlying a genetic association signal remains challenging. In this study, we describe a deep learning method to analyze the impact of sequence variants on enhancers. Focusing on pancreatic islets, a T2D relevant tissue, we show that our model learns islet-specific transcription factor (TF) regulatory patterns and can be used to prioritize candidate causal variants. At 101 genetic signals associated with T2D and related glycemic traits where multiple variants occur in linkage disequilibrium, our method nominates a single causal variant for each association signal, including three variants previously shown to alter reporter activity in islet-relevant cell types. For another signal associated with blood glucose levels, we biochemically test all candidate causal variants from statistical fine-mapping using a pancreatic islet beta cell line and show biochemical evidence of allelic effects on TF binding for the model-prioritized variant. To aid in future research, we publicly distribute our model and islet enhancer perturbation scores across ∼67 million genetic variants.
We anticipate that deep learning methods like the one presented in this study will enhance the prioritization of candidate causal variants for functional studies.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This research was funded in part by United States National Institutes of Health grants 1-ZIA-HG000024 (to F.S.C.), 1-ZIA-LM200881-12 (to I.O.), R01DK118011 (to M.L.S.), and the Department of Defense Peer-Reviewed Medical Research Program grant W81XWH-18-0401 (to M.L.S.).
Author Declarations:
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes
Data Availability: The models from this study (architecture and weights), islet enhancer locations, PAS/DAS locations, and IEP scores for gnomAD SNPs are available through Zenodo (https://doi.org/10.5281/zenodo.6463875).