medRxiv

A machine learning-based phenotype for long COVID in children: an EHR-based study from the RECOVER program

Vitaly Lorman, Hanieh Razzaghi, Xing Song, Keith Morse, Levon Utidjian, Andrea J. Allen, Suchitra Rao, Colin Rogerson, Tellen D. Bennett, Hiroki Morizono, Daniel Eckrich, Ravi Jhaveri, Yungui Huang, Daksha Ranade, Nathan Pajor, Grace M. Lee, Christopher B. Forrest, L. Charles Bailey
doi: https://doi.org/10.1101/2022.12.22.22283791
Vitaly Lorman
1Applied Clinical Research Center, Children’s Hospital of Philadelphia, Philadelphia, PA, United States
PhD
  • For correspondence: lormanv@chop.edu
Hanieh Razzaghi
1Applied Clinical Research Center, Children’s Hospital of Philadelphia, Philadelphia, PA, United States
MPH
Xing Song
2Department of Health Management and Informatics, University of Missouri School of Medicine, Columbia, MO, United States
PhD
Keith Morse
3Division of Pediatric Hospital Medicine, Department of Pediatrics, Stanford University School of Medicine, Stanford, California, United States
MD, MBA
Levon Utidjian
1Applied Clinical Research Center, Children’s Hospital of Philadelphia, Philadelphia, PA, United States
MD
Andrea J. Allen
1Applied Clinical Research Center, Children’s Hospital of Philadelphia, Philadelphia, PA, United States
MS
Suchitra Rao
4Department of Pediatrics, University of Colorado School of Medicine and Children’s Hospital of Colorado, Aurora, CO, United States
MBBS, MSCS
Colin Rogerson
5Division of Critical Care, Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN, United States
MD, MPH
Tellen D. Bennett
6Departments of Biomedical Informatics and Pediatrics, University of Colorado School of Medicine and Children’s Hospital Colorado, Aurora, CO, United States
MD, MS
Hiroki Morizono
7Center for Genetic Medicine Research, Children’s National Hospital, Washington DC, United States
PhD
Daniel Eckrich
8Biomedical Research Informatics Center, Nemours Children’s Health, Wilmington, DE, United States
MLIS
Ravi Jhaveri
9Division of Infectious Diseases, Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, IL, United States
MD
Yungui Huang
10IT Research and Innovation, The Research Institute at Nationwide Children’s Hospital, Columbus, OH, United States
PhD, MBA
Daksha Ranade
11Research Informatics Department, Seattle Children’s Hospital, Seattle, WA, United States
MPH, MBA
Nathan Pajor
12Division of Pulmonary Medicine, Cincinnati Children’s Hospital Medical Center and University of Cincinnati College of Medicine, Cincinnati, OH, United States
MD, MS
Grace M. Lee
13Division of Infectious Diseases, Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, United States
MD, MPH
Christopher B. Forrest
1Applied Clinical Research Center, Children’s Hospital of Philadelphia, Philadelphia, PA, United States
MD, PhD
L. Charles Bailey
1Applied Clinical Research Center, Children’s Hospital of Philadelphia, Philadelphia, PA, United States
MD, PhD

Abstract

Background As clinical understanding of pediatric post-acute sequelae of SARS-CoV-2 (PASC) develops, and hence the clinical definition evolves, it is desirable to have a method to reliably identify patients who are likely to have PASC in health systems data.

Methods and Findings In this study, we developed and validated a machine learning algorithm to identify patients with PASC (distinguishing between Multisystem Inflammatory Syndrome in Children (MIS-C) and non-MIS-C variants) within a cohort of patients with positive SARS-CoV-2 test results in pediatric health systems within the PEDSnet EHR network. Patient features included in the model were selected from conditions, procedures, performance of diagnostic testing, and medications using a tree-based scan statistic approach. We used an XGBoost model, with hyperparameters selected through cross-validated grid search, and model performance was assessed using 5-fold cross-validation. Model predictions and feature importance were evaluated using SHapley Additive exPlanations (SHAP) values.
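To illustrate what a SHAP value measures, the following is a minimal sketch that computes exact Shapley values for a single prediction by enumerating every feature subset. It is not the study's code: the function `shapley_values`, the scoring function `toy_score`, and the three binary feature names are hypothetical and chosen only for illustration. SHAP libraries use much faster algorithms for tree models such as XGBoost; brute-force enumeration is practical only for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Features outside a subset are held at their baseline value, so each
    value is the weighted average marginal effect of switching one feature
    from its baseline to the patient's value.
    """
    n = len(x)
    features = range(n)

    def value(subset):
        # Model output when only the features in `subset` take the patient's values.
        mixed = [x[i] if i in subset else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for subsets of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(f):
    """Hypothetical risk score over three binary features (an assumption,
    not the study's model); includes one interaction term."""
    fatigue_dx, chest_imaging, steroid_rx = f
    return 0.1 + 0.3 * fatigue_dx + 0.2 * chest_imaging + 0.25 * fatigue_dx * steroid_rx


patient = [1, 0, 1]   # hypothetical patient: features 0 and 2 present
baseline = [0, 0, 0]  # reference patient with no features present
phi = shapley_values(toy_score, patient, baseline)

for name, contribution in zip(["fatigue_dx", "chest_imaging", "steroid_rx"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's score and the baseline score, which is the additivity property that lets SHAP values be read as a per-feature decomposition of one prediction. The interaction term is split evenly between the two features involved.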

Conclusions The model provides a tool for identifying patients with PASC and an approach to characterizing PASC using diagnosis, medication, laboratory, and procedure features in health systems data. Using appropriate threshold settings, the model can be used to identify PASC patients in health systems data at higher precision for inclusion in studies or at higher recall in screening for clinical trials, especially in settings where PASC diagnosis codes are used less frequently or less reliably. Analysis of how specific features contribute to the classification process may assist in gaining a better understanding of features that are associated with PASC diagnoses.
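The precision and recall trade-off described above can be sketched as follows. This is an illustrative example under stated assumptions, not the study's procedure: the scores and labels are invented, and the helper names `precision_recall_at` and `lowest_threshold_with_precision` are hypothetical. A stricter threshold favors precision (for example, study inclusion), while a looser one favors recall (for example, trial screening).

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when patients scoring >= threshold are flagged."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def lowest_threshold_with_precision(scores, labels, min_precision):
    """Lowest observed score usable as a threshold that meets a precision floor.

    Recall never increases as the threshold rises, so the lowest qualifying
    threshold also gives the highest recall among qualifying thresholds.
    """
    for t in sorted(set(scores)):
        p, r = precision_recall_at(scores, labels, t)
        if p >= min_precision:
            return t, p, r
    return None


# Invented model scores and true labels (1 = PASC), for illustration only.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 1, 0, 1, 0]

# Higher-precision setting, e.g. selecting patients for inclusion in a study.
strict = lowest_threshold_with_precision(scores, labels, min_precision=0.8)

# Higher-recall setting, e.g. screening for a clinical trial.
screening = precision_recall_at(scores, labels, threshold=0.3)

print("strict (threshold, precision, recall):", strict)
print("screening (precision, recall):", screening)
```

In practice the threshold would be chosen on held-out data rather than on the data used to fit the model.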

Funding Source This research was funded by the National Institutes of Health (NIH) Agreement OT2HL161847-01 as part of the Researching COVID to Enhance Recovery (RECOVER) program of research.

Disclaimer The content is solely the responsibility of the authors and does not necessarily represent the official views of the RECOVER Program, the NIH or other funders.

Competing Interest Statement

Dr. Rao reports prior grant support from GSK and Biofire and is a consultant for Seqirus. Dr. Jhaveri is a consultant for AstraZeneca, Seqirus, and Dynavax, and receives an editorial stipend from Elsevier and the Pediatric Infectious Diseases Society and royalties from UpToDate/Wolters Kluwer. Dr. Lee serves on the PASC Advisory Board for United Health Group. Dr. Bailey has received grants from the Patient-Centered Outcomes Research Institute. All other authors have nothing to disclose.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

BRANY IRB gave ethical approval for this work and waived documentation of informed consent.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

All data produced in the present study are available upon reasonable request to the authors.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted December 26, 2022.
Subject Area

  • Epidemiology