Abstract
The SARS-CoV-2 virus behind the COVID-19 pandemic manifests in different ways among infected people. While many experience mild flu-like symptoms or remain asymptomatic after infection, the virus has also led to serious complications, overloading ICUs and claiming more than 2.6 million lives worldwide. In this work, we apply AI methods to better understand the factors that drive the severity of the disease. We analyzed both clinical and genomic data from UK Biobank participants infected with the virus. Leveraging positive-unlabeled machine learning algorithms coupled with RubricOE, a state-of-the-art framework for genomic feature extraction, we propose severity prediction algorithms that achieve high F1 scores. Furthermore, we extract insights into the clinical and genomic factors driving the severity predictions. We also report how these factors evolved during the pandemic with respect to significant events such as the emergence of the B.1.1.7 SARS-CoV-2 strain.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
Nothing to declare.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Data analysis was performed under UK Biobank application 50658 using existing publicly available and deidentified data and was IRB exempt.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
The data that support the findings of this study are available from the UK Biobank upon reasonable request [1].
1. UK Biobank. Access Procedures: Application and review procedures for access to the UK Biobank Resource. Published 2011. https://www.ukbiobank.ac.uk/wp-content/uploads/2012/09/Access-Procedures-2011-1.pdf