medRxiv

An ensemble prediction model for COVID-19 mortality risk

Jie Li, Xin Li, John Hutchinson, Mohammad Asad, Yadong Wang, Edwin Wang
doi: https://doi.org/10.1101/2022.01.10.22268985
1 School of Computer Science and Technology, Harbin Institute of Technology, China (Jie Li, Xin Li, Yadong Wang)
2 Department of Medical Genetics, University of Calgary, Canada (John Hutchinson, Mohammad Asad, Edwin Wang)
3 Department of Medicine, McGill University, Canada (Edwin Wang)
For correspondence: jieli@hit.edu.cn, edwin.wang@ucalgary.ca

Abstract

Background Identifying COVID-19 patients at higher risk of death at an early stage is critical so that they can receive appropriate hospitalization or intensive care. However, thus far, no machine learning model has been shown to be successful in an independent cohort. We aimed to develop a machine learning model that accurately predicts the death risk of COVID-19 patients at an early stage and that holds up in independent cohorts.

Methods To identify key features, we used a cohort of 4711 patients with clinical features describing patient physiological condition and lab test data associated with inflammation, hepatorenal function, cardiovascular function, and other systems. We first developed a novel data preprocessing approach to clean up the clinical features and then developed an ensemble machine learning method to identify the key features.
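The abstract does not say how the ensemble combines its base learners. As a minimal sketch of one common option, soft voting, the Python below averages the death-risk probabilities returned by several base models. The function `soft_vote`, the three toy rule-based models, and their thresholds are hypothetical placeholders for illustration only; they are not the authors' method and carry no clinical meaning.

```python
from statistics import fmean
from typing import Callable, Sequence

# Assumption: a base model is any function mapping a feature vector to P(death).
Model = Callable[[Sequence[float]], float]


def soft_vote(models: Sequence[Model], x: Sequence[float]) -> float:
    """Average the probabilities predicted by each base model (soft voting)."""
    if not models:
        raise ValueError("at least one base model is required")
    return fmean([m(x) for m in models])


# Hypothetical toy base models; the thresholds are arbitrary, not clinical values.
def rule_a(x: Sequence[float]) -> float:
    return 0.8 if x[0] >= 70 else 0.2


def rule_b(x: Sequence[float]) -> float:
    return 0.9 if x[1] < 90 else 0.1


def rule_c(x: Sequence[float]) -> float:
    return 0.7 if x[2] > 30 else 0.3


if __name__ == "__main__":
    models = [rule_a, rule_b, rule_c]
    # Two made-up feature vectors, one triggering every rule and one triggering none.
    print(soft_vote(models, [75.0, 85.0, 40.0]))
    print(soft_vote(models, [50.0, 97.0, 10.0]))
```

In practice the base models would be trained classifiers (the abbreviation list names KNN, GBDT, XGBoost, RF, LR, and SVM), and the averaged probability would be compared against a chosen threshold.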

Results We identified 14 key clinical features whose combination achieved an AUC of 0.907. Most importantly, we validated these key features in a large independent cohort of 15,790 patients.
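For readers unfamiliar with the metric, the AUC can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. The sketch below computes it directly from that definition, counting ties as half. The function name `roc_auc` and the example labels and scores are illustrative assumptions, not data from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (e.g. death), 0 for the negative class.
    scores: predicted risk for each sample; ties count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up example: 3 of the 4 positive/negative pairs are ordered correctly.
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # prints 0.75
```

This pairwise form is quadratic in the number of samples; for cohorts of the size described here, a rank-based computation would be the more practical choice.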

Conclusions Our study shows that the 14 key features are robust and useful for predicting, at an early stage, the risk of death in patients with confirmed SARS-CoV-2 infection, and that they are potentially useful in clinical settings to support clinical decisions.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

Alberta Innovates for Health

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

1. The first dataset is available at https://figshare.com/s/79827c396af7df42b3d7. Details of this dataset can be found in: Altschul DJ, Unda SR, Benton J, de la Garza Ramos R, Cezayirli P, Mehler M, et al. A novel severity score to predict inpatient mortality in COVID-19 patients. Scientific Reports. 2020;10(1):16726. 2. The second dataset is from UK Biobank; two references are listed in our manuscript: Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12(3):e1001779; and Barbour V. UK Biobank: a project in search of a protocol? The Lancet. 2003;361(9370):1734-8.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Footnotes

  • # Co-first author

Data Availability

All data produced in the present work are contained in the manuscript

https://figshare.com/s/79827c396af7df42b3d7.

https://www.ukbiobank.ac.uk/

  • List of Abbreviations

    OsSats: oxygen saturation
    Temp: temperature
    MAP: mean arterial pressure
    Ddimer: D-dimer
    Plts: platelets
    INR: international normalized ratio
    BUN: blood urea nitrogen
    AST: aspartate aminotransferase
    ALT: alanine aminotransferase
    WBC: white blood cells
    Lympho: lymphocytes
    IL-6: interleukin-6
    CrctProtein: C-reactive protein
    KNN: k-nearest neighbor method
    GBDT: Gradient Boosted Decision Tree
    XGBoost: Extreme Gradient Boosting
    RF: Random Forest
    LR: Logistic Regression
    SVM: Support Vector Machine
    EM: Ensemble Model
    ROC: Receiver Operating Characteristic
    AUC: Area Under ROC Curve
    TP: True Positive
    FP: False Positive
    TN: True Negative
    FN: False Negative
    CSS: COVID-19 severity scores
  • Copyright 
    The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
    Posted January 13, 2022.
    Subject Area

    • Intensive Care and Critical Care Medicine