PT - JOURNAL ARTICLE
AU - Naeimeh Atabaki-Pasdar
AU - Mattias Ohlsson
AU - Ana Viñuela
AU - Francesca Frau
AU - Hugo Pomares-Millan
AU - Mark Haid
AU - Angus G Jones
AU - E Louise Thomas
AU - Robert W Koivula
AU - Azra Kurbasic
AU - Pascal M Mutie
AU - Hugo Fitipaldi
AU - Juan Fernandez
AU - Adem Y Dawed
AU - Giuseppe N Giordano
AU - Ian M Forgie
AU - Timothy J McDonald
AU - Femke Rutters
AU - Henna Cederberg
AU - Elizaveta Chabanova
AU - Matilda Dale
AU - Federico De Masi
AU - Cecilia Engel Thomas
AU - Kristine H Allin
AU - Tue H Hansen
AU - Alison Heggie
AU - Mun-Gwan Hong
AU - Petra JM Elders
AU - Gwen Kennedy
AU - Tarja Kokkola
AU - Helle Krogh Pedersen
AU - Anubha Mahajan
AU - Donna McEvoy
AU - Francois Pattou
AU - Violeta Raverdy
AU - Ragna S Häussler
AU - Sapna Sharma
AU - Henrik S Thomsen
AU - Jagadish Vangipurapu
AU - Henrik Vestergaard
AU - Leen M ‘t Hart
AU - Jerzy Adamski
AU - Petra B Musholt
AU - Soren Brage
AU - Søren Brunak
AU - Emmanouil Dermitzakis
AU - Gary Frost
AU - Torben Hansen
AU - Markku Laakso
AU - Oluf Pedersen
AU - Martin Ridderstråle
AU - Hartmut Ruetten
AU - Andrew T Hattersley
AU - Mark Walker
AU - Joline WJ Beulens
AU - Andrea Mari
AU - Jochen M Schwenk
AU - Ramneek Gupta
AU - Mark I McCarthy
AU - Ewan R Pearson
AU - Jimmy D Bell
AU - Imre Pavo
AU - Paul W Franks
TI - Predicting and elucidating the etiology of fatty liver disease using a machine learning-based approach: an IMI DIRECT study
AID - 10.1101/2020.02.10.20021147
DP - 2020 Jan 01
TA - medRxiv
PG - 2020.02.10.20021147
4099 - http://medrxiv.org/content/early/2020/02/11/2020.02.10.20021147.short
4100 - http://medrxiv.org/content/early/2020/02/11/2020.02.10.20021147.full
AB - Background: Non-alcoholic fatty liver disease (NAFLD) is highly prevalent and causes serious health complications in type 2 diabetes (T2D) and beyond.
Early diagnosis of NAFLD is important, as this can help prevent irreversible damage to the liver and ultimately hepatocellular carcinomas.
Methods and Findings: Utilizing the baseline data from the IMI DIRECT participants (n=1514), we sought to expand etiological understanding and develop a diagnostic tool for NAFLD using machine learning. Multi-omic (genetic, transcriptomic, proteomic, and metabolomic) and clinical (liver enzymes and other serological biomarkers, anthropometry, and measures of beta-cell function, insulin sensitivity, and lifestyle) data comprised the key input variables. The models were trained on MRI image-derived liver fat content (<5% or ≥5%). We applied LASSO (least absolute shrinkage and selection operator) to select features from the different layers of omics data and Random Forest analysis to develop the models. The prediction models included clinical and omics variables separately or in combination. A model including all omics and clinical variables yielded a cross-validated receiver operating characteristic area under the curve (ROCAUC) of 0.84 (95% confidence interval (CI)=0.82, 0.86), compared with a ROCAUC of 0.82 (95% CI=0.81, 0.83) for a model including nine clinically accessible variables. The IMI DIRECT prediction models outperformed existing non-invasive NAFLD prediction tools.
Conclusions: We have developed clinically useful liver fat prediction models (see: www.predictliverfat.org) and identified biological features that appear to affect liver fat accumulation.
Competing Interest Statement: The authors of this manuscript have the following competing interests: PWF is a consultant for Novo Nordisk, Lilly and Zoe Global Ltd and has received research grants from numerous diabetes drug companies. HR is an employee and shareholder of Sanofi. MIM: The views expressed in this article are those of the author(s) and not necessarily those of the NHS, the NIHR, or the Department of Health.
MIM has served on advisory panels for Pfizer, NovoNordisk and Zoe Global, has received honoraria from Merck, Pfizer, Novo Nordisk and Eli Lilly, and research funding from Abbvie, Astra Zeneca, Boehringer Ingelheim, Eli Lilly, Janssen, Merck, NovoNordisk, Pfizer, Roche, Sanofi Aventis, Servier, and Takeda. As of June 2019, MIM is an employee of Genentech, and a holder of Roche stock. AM is a consultant for Lilly and has received research grants from several diabetes drug companies.
Funding Statement: The work leading to this publication has received support from the Innovative Medicines Initiative Joint Undertaking under grant agreement n°115317 (DIRECT), resources of which are composed of financial contribution from the European Union's Seventh Framework Programme (FP7/2007-2013) and EFPIA companies’ in-kind contribution. This work was supported by a European Research Council award ERC-2015-CoG - 681742_NASCENT. RK was funded by the Novo Nordisk Foundation (NNF18OC0031650) as part of a postdoctoral fellowship. AGJ is supported by an NIHR Clinician Scientist award (17/0005624). TJM is funded by an NIHR clinical senior lecturer fellowship. S.Bru acknowledges support from the Novo Nordisk Foundation (grants NNF17OC0027594 and NNF14CC0001). ATH is a Wellcome Trust Senior Investigator and is also supported by the NIHR Exeter Clinical Research Facility. JMS acknowledges support from Science for Life Laboratory (Plasma Profiling Facility) and Knut and Alice Wallenberg Foundation (Human Protein Atlas), Erling-Persson Foundation (KTH Centre for Precision Medicine). MIM is supported by the following grants: Wellcome (090532, 098381, 106130, 203141, 212259); NIH (U01-DK105535).
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Author Declarations:
All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript. Yes
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes
Data cannot be shared publicly because of GDPR restrictions on data privacy.
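
Illustrative note on the evaluation metric: the abstract reports model performance as ROCAUC against a binary outcome defined by MRI-derived liver fat content (<5% or ≥5%). The sketch below shows, in plain Python, how such an outcome could be derived and how a ROC AUC can be computed from predicted scores via its rank (Mann-Whitney) interpretation. It does not reproduce the authors' LASSO and Random Forest pipeline or their cross-validation. The function names, the toy liver fat values, and the toy scores are all hypothetical and are not study data.

```python
from typing import List, Sequence

# Cut-off taken from the abstract: liver fat < 5% versus >= 5%.
LIVER_FAT_THRESHOLD = 5.0


def label_fatty_liver(liver_fat_percent: Sequence[float]) -> List[int]:
    """Return 1 for liver fat >= 5%, else 0 (hypothetical helper)."""
    return [1 if value >= LIVER_FAT_THRESHOLD else 0 for value in liver_fat_percent]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy values, for illustration only (not IMI DIRECT data).
liver_fat = [2.1, 3.4, 4.9, 5.0, 7.8, 12.3]      # MRI liver fat, percent
scores = [0.10, 0.40, 0.35, 0.30, 0.80, 0.90]    # model-predicted risk

labels = label_fatty_liver(liver_fat)
auc = roc_auc(labels, scores)
print(f"labels={labels} ROC AUC={auc:.3f}")
```

The pairwise form is quadratic in the number of participants, which is fine for illustration; a sort-based rank computation or an established library routine would be the usual choice at cohort scale.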