medRxiv

Artificial intelligence driven assessment of routinely collected healthcare data is an effective screening test for COVID-19 in patients presenting to hospital

Andrew AS Soltan, Samaneh Kouchaki, Tingting Zhu, Dani Kiyasseh, Thomas Taylor, Zaamin B. Hussain, Tim Peto, Andrew J Brent, David W. Eyre, David Clifton
doi: https://doi.org/10.1101/2020.07.07.20148361
Andrew AS Soltan
1 John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust
2 Oxford University Clinical Academic Graduate School, University of Oxford
  • For correspondence: David.Clifton@eng.ox.ac.uk Andrew.Soltan@medsci.ox.ac.uk
Samaneh Kouchaki
3 Institute of Biomedical Engineering, Dept. Engineering Science, University of Oxford
6 Centre for Vision, Speech and Signal Processing, University of Surrey
Tingting Zhu
3 Institute of Biomedical Engineering, Dept. Engineering Science, University of Oxford
Dani Kiyasseh
3 Institute of Biomedical Engineering, Dept. Engineering Science, University of Oxford
Thomas Taylor
3 Institute of Biomedical Engineering, Dept. Engineering Science, University of Oxford
Zaamin B. Hussain
4 Harvard Graduate School of Education, Harvard University
5 Harvard T.H. Chan School of Public Health, Harvard University
Tim Peto
1 John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust
7 Nuffield Department of Medicine, University of Oxford
Andrew J Brent
1 John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust
7 Nuffield Department of Medicine, University of Oxford
David W. Eyre
1 John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust
8 Big Data Institute, Nuffield Department of Population Health, University of Oxford
David Clifton
3 Institute of Biomedical Engineering, Dept. Engineering Science, University of Oxford
  • For correspondence: David.Clifton@eng.ox.ac.uk Andrew.Soltan@medsci.ox.ac.uk

Abstract

Background Rapid identification of COVID-19 is important for delivering care expediently and maintaining infection control. The early clinical course of SARS-CoV-2 infection can be difficult to distinguish from other undifferentiated medical presentations to hospital; however, for operational reasons, SARS-CoV-2 PCR testing can take up to 48 hours. Artificial Intelligence (AI) methods, trained using routinely collected clinical data, may allow front-door screening for COVID-19 within the first hour of presentation.

Methods Demographic, routine and prior clinical data were extracted for 170,510 sequential presentations to emergency and acute medical departments at a large UK teaching hospital group. We applied multivariate logistic regression, random forests and extreme gradient boosted trees to distinguish emergency department (ED) presentations and admissions due to COVID-19 from pre-pandemic controls. We performed stepwise addition of clinical feature sets and assessed performance using stratified 10-fold cross validation. Models were calibrated during training to achieve sensitivities of 70, 80 and 90% for identifying patients with COVID-19. To simulate real-world performance at different stages of an epidemic, we generated test sets with varying prevalences of COVID-19 and assessed predictive values. We prospectively validated our models for all patients presenting or admitted to our hospital group between 20th April and 6th May 2020, comparing model predictions to PCR test results.
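The calibration step described above (choosing an operating point that reaches a target sensitivity of 70, 80 or 90%) can be illustrated with a minimal sketch. This is a generic thresholding technique, not necessarily the authors' procedure; the function names `threshold_for_sensitivity` and `sensitivity_specificity` and the toy scores are hypothetical, and the sketch assumes a model that outputs one score per patient, with label 1 for COVID-19 and 0 for controls.

```python
import math


def threshold_for_sensitivity(scores, labels, target):
    """Return the highest threshold whose sensitivity is >= target.

    A case is predicted positive when its score is >= the threshold.
    """
    positive_scores = sorted(
        (s for s, y in zip(scores, labels) if y == 1), reverse=True
    )
    if not positive_scores:
        raise ValueError("at least one positive case is required")
    # Smallest number of positives that must be flagged to meet the target;
    # the small epsilon guards against floating-point noise in target * n.
    needed = max(1, math.ceil(target * len(positive_scores) - 1e-9))
    return positive_scores[needed - 1]


def sensitivity_specificity(scores, labels, threshold):
    """Compute (sensitivity, specificity); assumes both classes are present."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy scores (hypothetical, not study data): five cases, five controls.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.2, 0.6, 0.5, 0.3, 0.1, 0.05]
toy_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

for target in (0.7, 0.8, 0.9):
    t = threshold_for_sensitivity(toy_scores, toy_labels, target)
    sens, spec = sensitivity_specificity(toy_scores, toy_labels, t)
    print(f"target={target:.0%} threshold={t} sens={sens:.2f} spec={spec:.2f}")
```

In a cross-validated setting, a threshold like this would be chosen on each training fold and then applied unchanged to the held-out fold, so that the reported sensitivity and specificity are not tuned on the test data.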

Results Presentation laboratory blood tests, point of care blood gas, and vital signs measurements for 115,394 emergency presentations and 72,310 admissions were analysed. Presentation laboratory tests and vital signs were most predictive of COVID-19 (maximum area under ROC curve [AUROC] 0.904 and 0.823, respectively). Sequential addition of informative variables improved model performance to AUROC 0.942.
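For readers unfamiliar with the AUROC figures quoted above, the metric can be computed directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below is a plain-Python illustration on hypothetical toy data, not the evaluation code used in the study; the function name `auroc` is an assumption.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy scores (hypothetical, not study data): five cases, five controls.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.2, 0.6, 0.5, 0.3, 0.1, 0.05]
toy_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
print(auroc(toy_scores, toy_labels))
```

The pairwise form is quadratic in the number of cases and is shown only for clarity; for datasets of the size reported here, a sort-based implementation would be the practical choice.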

We developed two early-detection models to identify COVID-19, achieving sensitivities and specificities of 77.4% and 95.7% for our ED model amongst patients attending hospital, and 77.4% and 94.8% for our Admissions model amongst patients being admitted. Both models offer high negative predictive values (>99%) across a range of prevalences (<5%). In a two-week prospective validation period, our ED and Admissions models demonstrated 92.3% and 92.5% accuracy (AUROC 0.881 and 0.871 respectively) for all patients presenting or admitted to a large UK teaching hospital group. A sensitivity analysis to account for uncertainty in negative PCR results improves apparent accuracy (95.1% and 94.1%) and NPV (99.0% and 98.5%). Three laboratory blood markers, Eosinophils, Basophils, and C-Reactive Protein, alongside Calcium measured on blood-gas, and presentation Oxygen requirement were the most informative variables in our models.
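The dependence of predictive values on prevalence can be sketched with the textbook Bayes relationship between sensitivity, specificity and prevalence. The study derived its predictive values from generated test sets, so this closed-form calculation is an approximation under stated assumptions rather than a reproduction of the reported figures; the function name `predictive_values` and the chosen prevalences (1% and 2%) are illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as reported for the ED model (77.4%, 95.7%);
# the prevalences below are illustrative assumptions.
for prevalence in (0.01, 0.02):
    ppv, npv = predictive_values(0.774, 0.957, prevalence)
    print(f"prevalence={prevalence:.0%} PPV={ppv:.3f} NPV={npv:.3f}")
```

At low prevalence, most patients are true negatives, which is why a test with moderate sensitivity can still yield a high NPV while its PPV remains modest.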

Conclusion Artificial intelligence techniques perform effectively as a screening test for COVID-19 in emergency departments and hospital admission units. Our models support rapid exclusion of the illness using routinely collected and readily available clinical measurements, guiding streaming of patients during the early phase of admission.

Brief The early clinical course of SARS-CoV-2 infection can be difficult to distinguish from other undifferentiated medical presentations to hospital; however, virus-specific real-time polymerase chain reaction (RT-PCR) testing has limited sensitivity and can take up to 48 hours for operational reasons. In this study, we develop two early-detection models to identify COVID-19 using routinely collected data typically available within one hour (laboratory tests, blood gas and vital signs) during 115,394 emergency presentations and 72,310 admissions to hospital. Our emergency department (ED) model achieved 77.4% sensitivity and 95.7% specificity (AUROC 0.939) for COVID-19 amongst all patients attending hospital, and our Admissions model achieved 77.4% sensitivity and 94.8% specificity (AUROC 0.940) for the subset admitted to hospital. Both models achieve high negative predictive values (>99%) across a range of prevalences (<5%), facilitating rapid exclusion during triage to guide infection control. We prospectively validated our models across all patients presenting and admitted to a large UK teaching hospital group in a two-week test period, achieving 92.3% (n=3,326, NPV: 97.6%, AUROC: 0.881) and 92.5% accuracy (n=1,715, NPV: 97.7%, AUROC: 0.871) in comparison to RT-PCR results. Sensitivity analyses to account for uncertainty in negative PCR results improve apparent accuracy (95.1% and 94.1%) and NPV (99.0% and 98.5%). Our artificial intelligence models perform effectively as a screening test for COVID-19 in emergency departments and hospital admission units, offering high impact in settings where rapid testing is unavailable.

Competing Interest Statement

DWE reports lecture fees from Gilead, outside the submitted work. DC reports Consultancy for Oxford University Innovation, Biobeats, and Sensyne Health. No other authors report any conflicts of interest.

Funding Statement

AS is an NIHR Academic Clinical Fellow. DWE is a Robertson Foundation Fellow and an NIHR Oxford Biomedical Research Centre Senior Fellow. This research was supported by the Engineering and Physical Sciences Research Council (EPSRC) via grants EP/P009824/1 and EP/N020774/1.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Approved by the UK National Health Service (NHS) Human Research Authority (HRA). NHS HRA Reference number (IRAS ID): 281832.

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

The data studied are available from the Infections in Oxfordshire Research Database, subject to an application meeting the ethical and governance requirements of the Database.

https://oxfordbrc.nihr.ac.uk/research-themes-overview/antimicrobial-resistance-and-modernising-microbiology/infections-in-oxfordshire-research-database-iord/

  • Abbreviations

    AI: Artificial Intelligence
    AUROC: Area under receiver operating characteristic curve
    COVID-19: Coronavirus Disease 2019
    CCI: Charlson Comorbidity Index
    CRP: C-Reactive Protein
    EHR: Electronic Health Records
    LR: Logistic Regression
    NPV: Negative Predictive Value
    OUH: Oxford University Hospitals NHS Foundation Trust
    POCT: Point of Care Test
    PPV: Positive Predictive Value
    RF: Random Forest
    RT-PCR: Real Time Polymerase Chain Reaction
    SARS-CoV-2: Severe Acute Respiratory Syndrome Coronavirus 2
  • Copyright 
    The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
    Posted July 08, 2020.
    Subject Area

    • Health Informatics