medRxiv

Open-source computational pipeline automatically flags instances of acute respiratory distress syndrome from electronic health records

Félix L. Morales, Feihong Xu, Hyojun Ada Lee, Helio Tejedor Navarro, Meagan A. Bechel, Eryn L. Cameron, Jesse Kelso, Curtis H. Weiss, Luís A. Nunes Amaral
doi: https://doi.org/10.1101/2024.05.21.24307715
Félix L. Morales
1 Department of Engineering Science and Applied Mathematics, Northwestern University, Evanston, IL
Feihong Xu
2 Interdepartmental Biological Sciences Program, Northwestern University, Evanston, IL
Hyojun Ada Lee
1 Department of Engineering Science and Applied Mathematics, Northwestern University, Evanston, IL
Helio Tejedor Navarro
1 Department of Engineering Science and Applied Mathematics, Northwestern University, Evanston, IL
3 Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL
Meagan A. Bechel
4 Medical Scientist Training Program, Northwestern University Feinberg School of Medicine, Chicago, IL
5 Department of Radiology, Emory University, Atlanta, GA
Eryn L. Cameron
6 Department of Medicine, Endeavor Health, Evanston, IL
Jesse Kelso
6 Department of Medicine, Endeavor Health, Evanston, IL
Curtis H. Weiss
1 Department of Engineering Science and Applied Mathematics, Northwestern University, Evanston, IL
7 Division of Pulmonary and Critical Care Medicine, Endeavor Health, Evanston, IL
Luís A. Nunes Amaral
1 Department of Engineering Science and Applied Mathematics, Northwestern University, Evanston, IL
3 Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL
8 Department of Physics and Astronomy, Northwestern University, Evanston, IL
9 Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL
10 NSF-Simons National Institute for Theoretical and Mathematical Biology, Chicago, IL
For correspondence: amaral{at}northwestern.edu

Abstract

Physicians, particularly intensivists, face information overload and decision fatigue, underscoring the need for automated diagnostic tools. Acute respiratory distress syndrome (ARDS) affects over 10% of critical care patients and carries a mortality rate above 40%, yet it is recognized in only 30-70% of cases in clinical settings. We present a reproducible computational pipeline that automates ARDS adjudication in retrospective datasets of mechanically ventilated adults by implementing the Berlin Definition with natural language processing and classification algorithms. We used labeled chest imaging reports from two hospitals to train an XGBoost model to detect bilateral infiltrates, and a labeled subset of attending physician notes from one hospital to train a second XGBoost model to detect a pneumonia diagnosis. Both models achieve high discriminative performance on test sets: an area under the receiver operating characteristic curve (AUROC) of 0.88 for adjudicating bilateral infiltrates on chest imaging reports, and an AUROC of 0.87 for detecting pneumonia in attending physician notes. We integrated these models with rule-based components and validated the entire pipeline on a subset of healthcare encounters from a third hospital (MIMIC-III). The pipeline adjudicates ARDS with a sensitivity of 93.5%, far surpassing the 22.6% ARDS documentation rate we found for this cohort, and with a false positive rate of 17.4%. We conclude that our reproducible, automated pipeline holds promise for improving ARDS recognition and could aid clinical practice through real-time EHR integration.
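For readers less familiar with the two pipeline-level metrics quoted in the abstract, the following minimal sketch shows how sensitivity and false positive rate are conventionally computed from confusion-matrix counts. The function names and the counts in the usage lines are hypothetical and purely illustrative; they are not the study's data and not the authors' code.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of actual ARDS encounters that are flagged: TP / (TP + FN)."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no positive cases: sensitivity is undefined")
    return true_pos / total


def false_positive_rate(false_pos: int, true_neg: int) -> float:
    """Fraction of non-ARDS encounters that are flagged: FP / (FP + TN)."""
    total = false_pos + true_neg
    if total == 0:
        raise ValueError("no negative cases: false positive rate is undefined")
    return false_pos / total


# Hypothetical counts, chosen only to illustrate the arithmetic.
print(f"sensitivity = {sensitivity(45, 5):.1%}")
print(f"false positive rate = {false_positive_rate(10, 40):.1%}")
```

Both quantities depend on the probability cutoff applied to the underlying classifiers, so lowering the cutoff generally raises sensitivity at the cost of a higher false positive rate.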

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

Feihong Xu was supported in part by the National Institutes of Health Training Grant (T32GM008449) through Northwestern University's Biotechnology Training Program. Curtis H. Weiss was supported by the National Heart, Lung, and Blood Institute (R01HL140362 and K23HL118139). Luís A. Nunes Amaral was supported by the National Heart, Lung, and Blood Institute (R01HL140362). Luís A. Nunes Amaral and Feihong Xu are supported by the National Institute of Allergy and Infectious Diseases (U19AI135964).

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Institutional Review Board of Northwestern University gave ethical approval for this work (STU00208049). Institutional Review Board of Endeavor Health gave ethical approval for this work (EH17-325).

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Footnotes

  1. Clarify that we used the ARDS documentation rate instead of the ARDS recognition rate.
  2. Add tables showing how the pipeline performs with optimal probability cutoffs.
  3. Clarify our stance on the false positive vs. false negative tradeoff, and that this is a choice for every health system to make.
  4. Switch from the term "whole training set" to "development set" to refer to the data used for developing the model for bilateral infiltrates.
  5. Clarify how we calculated the PF ratio for Hospital A (2013) and MIMIC (2001-12).
  6. Explain our rationale for a 48-h window between hypoxemia and bilateral infiltrates.
  7. Add the caveat that we do not include patients on high-flow oxygenation because the datasets predate the 2024 Berlin Definition update.
  8. Clarify the role of Table 1.
  9. Sharpen the statement of our contribution to the literature.
  10. Address the major limitation of not having chest image data available for the study, and of having only one radiologist report per imaging study (unavailability of a better gold standard).
  11. Clarify that we had chest computed tomography and X-ray image reports available, not just chest X-rays, and that we did not include lung ultrasound images or reports.
  12. Demonstrate that the Bilateral Infiltrates model does well at the patient level (not just at the report level).
  13. Demonstrate that the pipeline does flag early ARDS (median time post-intubation: 3.9 hours).
  14. Demonstrate that the pipeline differentiates a patient population with clinical traits consistent with ARDS.
  15. Explore reasons for our physician raters disagreeing in 8% of their labels.
  16. Change the layout of the confusion matrices.
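The PF ratio and the 48-h window mentioned in the revision notes can be illustrated with a minimal sketch. It assumes the standard Berlin Definition hypoxemia criterion (PaO2/FiO2 of 300 mmHg or less with PEEP of at least 5 cmH2O); the function names are hypothetical, and treating the 48-h window as symmetric around the hypoxemia timestamp is an assumption made here for illustration, not a description of the authors' implementation.

```python
from datetime import datetime, timedelta

PF_THRESHOLD_MMHG = 300.0        # Berlin Definition hypoxemia cutoff
MIN_PEEP_CMH2O = 5.0             # Berlin Definition minimum PEEP
WINDOW = timedelta(hours=48)     # window length named in the revision notes


def pf_ratio(pao2_mmhg: float, fio2_fraction: float) -> float:
    """PaO2/FiO2 ratio, with FiO2 expressed as a fraction (0.21 to 1.0)."""
    if not 0.21 <= fio2_fraction <= 1.0:
        raise ValueError("FiO2 must be a fraction between 0.21 and 1.0")
    return pao2_mmhg / fio2_fraction


def is_hypoxemic(pao2_mmhg: float, fio2_fraction: float, peep_cmh2o: float) -> bool:
    """True when the Berlin hypoxemia criterion is met."""
    return (peep_cmh2o >= MIN_PEEP_CMH2O
            and pf_ratio(pao2_mmhg, fio2_fraction) <= PF_THRESHOLD_MMHG)


def within_window(hypoxemia_time: datetime, infiltrate_time: datetime) -> bool:
    """True when the two events are at most 48 h apart.

    Assumption: the window is symmetric (either event may come first).
    """
    return abs(infiltrate_time - hypoxemia_time) <= WINDOW


# Hypothetical values: PaO2 of 80 mmHg on FiO2 0.5 with PEEP 5 cmH2O.
print(pf_ratio(80.0, 0.5), is_hypoxemic(80.0, 0.5, 5.0))
```

A full adjudication would also need the bilateral infiltrates and risk-factor checks that the abstract describes; this sketch covers only the hypoxemia and timing rules.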

Data Availability

MIMIC-III is available on PhysioNet (https://doi.org/10.13026/C2XW26). Data from Hospital A (2013), Hospital A (2016), and Hospital B (2017-18) are IRB-protected. Upon publication, we will release only the de-identified data from the Hospital A (2013) cohort needed to reproduce Figure 8, via the ARCH repository hosted by Northwestern University (https://arch.library.northwestern.edu).

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Posted March 01, 2025.
Subject Area

  • Health Informatics