TY  - JOUR
T1  - Machine Learning Prediction of Biomarkers from SNPs and of Disease Risk from Biomarkers in the UK Biobank
JF  - medRxiv
DO  - 10.1101/2021.04.01.21254711
SP  - 2021.04.01.21254711
AU  - Erik Widen
AU  - Timothy G. Raben
AU  - Louis Lello
AU  - Stephen D.H. Hsu
Y1  - 2021/01/01
UR  - http://medrxiv.org/content/early/2021/04/05/2021.04.01.21254711.abstract
N2  - We use UK Biobank data to train predictors for 48 blood and urine markers such as HDL, LDL, lipoprotein A, glycated haemoglobin, … from SNP genotype. For example, our predictor correlates ∼ 0.76 with lipoprotein A level, which is highly heritable and an independent risk factor for heart disease. This may be the most accurate genomic prediction of a quantitative trait that has yet been produced (specifically, for European ancestry groups). We also train predictors of common disease risk using blood and urine biomarkers alone (no DNA information). Individuals who are at high risk (e.g., odds ratio of > 5x population average) can be identified for conditions such as coronary artery disease (AUC ∼ 0.75), diabetes (AUC ∼ 0.95), hypertension, liver and kidney problems, and cancer using biomarkers alone. Our atherosclerotic cardiovascular disease (ASCVD) predictor uses ∼ 10 biomarkers and performs in UKB evaluation as well as or better than the American College of Cardiology ASCVD Risk Estimator, which uses quite different inputs (age, diagnostic history, BMI, smoking status, statin usage, etc.). We compare polygenic risk scores (risk conditional on genotype: (risk score | SNPs)) for common diseases to the risk predictors which result from the concatenation of learned functions (risk score | biomarkers) and (biomarker | SNPs). Competing Interest Statement: Stephen Hsu is a founder, shareholder and serves on the Board of Directors of Genomic Prediction, Inc. (GP). Louis Lello is an employee and shareholder of GP.
These roles had no impact in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results. EW and TR declare no competing interests. Funding Statement: This research received no external funding. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The UK Biobank has ethical approval from the North West Centre for Research Ethics Committee (Application 11/NW/0382), which covers the UK. UK Biobank obtained informed consent from all participants. Full details can be found at https://www.ukbiobank.ac.uk/learn-more-about-uk-biobank/about-us/ethics. The generation and use of the data presented in this paper was approved by the UK Biobank access committee under UK Biobank application number 15326. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. The predictors described in the paper are available to other researchers upon request.
ER  - 