medRxiv

ConceptWAS: a high-throughput method for early identification of COVID-19 presenting symptoms

Juan Zhao, Monika E Grabowska, Vern Eric Kerchberger, Joshua C. Smith, H. Nur Eken, QiPing Feng, Josh F. Peterson, S. Trent Rosenbloom, Kevin B. Johnson, Wei-Qi Wei
doi: https://doi.org/10.1101/2020.11.06.20227165
Juan Zhao
1Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
Monika E Grabowska
2Medical Scientist Training Program, Vanderbilt University School of Medicine, Nashville, TN
Vern Eric Kerchberger
3Department of Medicine, Division of Allergy, Pulmonary & Critical Care Medicine, Vanderbilt University Medical Center, Nashville, TN
1Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
Joshua C. Smith
1Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
H. Nur Eken
4Vanderbilt University School of Medicine, Nashville, TN
QiPing Feng
5Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
Josh F. Peterson
1Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
S. Trent Rosenbloom
1Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
Kevin B. Johnson
1Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
Wei-Qi Wei
1Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
  • For correspondence: wei-qi.wei@vumc.org

Abstract

Objective Identifying symptoms highly specific to COVID-19 would improve the clinical and public health response to infectious outbreaks. Here, we describe a high-throughput approach, the Concept-Wide Association Study (ConceptWAS), that systematically scans a disease’s clinical manifestations from clinical notes. We used this method to identify symptoms specific to COVID-19 early in the course of the pandemic.

Methods Using the Vanderbilt University Medical Center (VUMC) EHR, we parsed clinical notes through a natural language processing pipeline to extract clinical concepts. We examined the difference in concepts derived from the notes of COVID-19-positive and COVID-19-negative patients on the PCR testing date. We performed ConceptWAS on the cumulative data every two weeks to identify specific COVID-19 symptoms early.
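
The comparison described in the Methods can be illustrated with a minimal sketch. The abstract does not state which statistical model was used, so the code below assumes the simplest option: an unadjusted 2x2 odds ratio per concept with a 95% Wald confidence interval and a Haldane-Anscombe correction (0.5 added to every cell). The function names `odds_ratio_wald` and `concept_scan`, the input layout, and the toy patients are hypothetical and are not taken from the study.

```python
import math
from typing import Dict, Set, Tuple


def odds_ratio_wald(a: int, b: int, c: int, d: int) -> Tuple[float, float, float]:
    """Return (odds ratio, lower, upper) with a 95% Wald interval.

    a: positive patients with the concept, b: positive without it,
    c: negative patients with the concept, d: negative without it.
    Assumption: 0.5 is always added to each cell (Haldane-Anscombe)
    so that zero counts do not cause a division by zero.
    """
    a2, b2, c2, d2 = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a2 * d2) / (b2 * c2)
    se_log_or = math.sqrt(1 / a2 + 1 / b2 + 1 / c2 + 1 / d2)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


def concept_scan(
    patient_concepts: Dict[str, Set[str]], positive: Set[str]
) -> Dict[str, Tuple[float, float, float]]:
    """Compute an odds ratio and interval for every concept seen in the notes.

    patient_concepts maps a patient ID to the set of concepts extracted
    from that patient's notes; positive holds the IDs that tested positive.
    """
    n_pos = sum(1 for pid in patient_concepts if pid in positive)
    n_neg = len(patient_concepts) - n_pos
    all_concepts = set().union(*patient_concepts.values())
    results = {}
    for concept in sorted(all_concepts):
        a = sum(1 for pid, cs in patient_concepts.items()
                if pid in positive and concept in cs)
        c = sum(1 for pid, cs in patient_concepts.items()
                if pid not in positive and concept in cs)
        results[concept] = odds_ratio_wald(a, n_pos - a, c, n_neg - c)
    return results


# Toy, made-up data for illustration only (not study data).
toy_notes = {
    "p1": {"fever", "anosmia"},
    "p2": {"anosmia"},
    "p3": {"fever"},
    "p4": {"cough"},
    "p5": set(),
    "p6": {"fever"},
}
toy_positive = {"p1", "p2"}
scan = concept_scan(toy_notes, toy_positive)
for concept, (or_, lo, hi) in scan.items():
    print(f"{concept}: OR={or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

Under these assumptions, the biweekly cumulative analysis would amount to calling `concept_scan` again on all patients tested up to each time point. A real analysis would also need covariate adjustment and a correction for testing many concepts at once, neither of which this sketch attempts.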

Results We processed 87,753 notes from 19,692 patients (1,483 COVID-19-positive) who underwent COVID-19 PCR testing between March 8, 2020, and May 27, 2020. We found 68 clinical concepts significantly associated with COVID-19. We identified symptoms associated with increased risk of COVID-19, including “absent sense of smell” (odds ratio [OR] = 4.97, 95% confidence interval [CI] = 3.21–7.50), “fever” (OR = 1.43, 95% CI = 1.28–1.59), “with cough fever” (OR = 2.29, 95% CI = 1.75–2.96), and “ageusia” (OR = 5.18, 95% CI = 3.02–8.58). Using ConceptWAS, we were able to detect loss of the sense of smell or taste three weeks before the Centers for Disease Control and Prevention (CDC) listed them as symptoms of the disease.

Conclusion ConceptWAS is a high-throughput approach for exploring the specific symptoms of a disease such as COVID-19, and it shows promise for enabling EHR-powered early identification of disease manifestations.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

The project was supported by the National Institutes of Health (NIH), National Institute of General Medical Sciences (P50 GM115305), National Heart, Lung, and Blood Institute (R01 HL133786), National Library of Medicine (T15 LM007450, R01 LM010685), and the American Heart Association (18AMTG34280063), as well as the Vanderbilt Biomedical Informatics Training Program, Vanderbilt Faculty Research Scholar Fund, and the Vanderbilt Medical Scientist Training Program. The datasets used for the analyses described were obtained from Vanderbilt University Medical Center's resources and the Synthetic Derivative, which are supported by institutional funding and by the Vanderbilt National Center for Advancing Translational Sciences grant (UL1 TR000445) from NCATS/NIH. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

RE: IRB #200731 "EHR Patterns of COVID-19" Dear Wei-qi Wei: A designee of the Institutional Review Board reviewed the Request for Exemption application identified above. It was determined the study poses minimal risk to participants. This study meets 45 CFR 46.104 (d) category (4) for Exempt Review. Any changes to this proposal that may alter its exempt status should be presented to the IRB for approval prior to implementation of the changes. DATE OF IRB APPROVAL: 5/21/2020 Sincerely, Erin Leigh Johnson BS, CIP Institutional Review Board Behavioral Sciences Committee Electronic Signature: Erin Leigh Johnson/VUMC/Vanderbilt : (2c8072da56409aeae2a38cff74965eed) Signed On: 05/21/2020 3:24:30 PM CDT

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

The study used data from patients represented in the Vanderbilt University Medical Center EHR. The study was approved by the Vanderbilt University Medical Center Institutional Review Board (IRB #200512). The summary statistics derived from the EHRs are enclosed within the manuscript.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted November 10, 2020.
Subject Area

  • Infectious Diseases (except HIV/AIDS)