
Development of a Claims-Based Computable Phenotype for Ulcerative Colitis Flares

Daniel Copeland, Jayson S. Marwaha, Daniel Wong, William Yuan, Michelle N. Fakler, Chris J. Kennedy, Brendin Beaulieu-Jones, Vitaliy Poylin, Joseph Feuerstein, Gabriel A. Brat
doi: https://doi.org/10.1101/2025.01.26.25321138
  • Daniel Copeland, MD, MSc (1)
  • Jayson S. Marwaha, MD, MSc (1, 2)
  • Daniel Wong, MD (1)
  • William Yuan, PhD (1, 2)
  • Michelle N. Fakler, MD (1)
  • Chris J. Kennedy, PhD (1, 2)
  • Brendin Beaulieu-Jones, MD, MBA, MBI (1, 2)
  • Vitaliy Poylin, MD, FACS, FASCRS (3)
  • Joseph Feuerstein, MD (4)
  • Gabriel A. Brat, MD, MPH (1, 2)

1 Department of Surgery, Beth Israel Deaconess Medical Center, Boston, MA
2 Department of Biomedical Informatics, Harvard Medical School, Boston, MA
3 Division of Colon and Rectal Surgery, Department of Surgery, Northwestern University, Chicago, IL
4 Division of Gastroenterology, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA

For correspondence: Gabriel A. Brat, gbrat{at}bidmc.harvard.edu

Abstract

Background and Aims Several conditions lack a unique diagnosis code in widely used clinical terminologies, which makes them difficult to track and study. Acute severe ulcerative colitis (ASUC) is one such condition: it has no specific billing or diagnosis code, and no automated method exists to identify patients admitted for ASUC from observational data. Accurate, automated, large-scale identification of hospital admissions for non-coded conditions like ASUC may enable further research into them.

Methods We performed a retrospective cohort study of patients with a history of ulcerative colitis (UC) admitted to a single academic institution from 2014 to 2019. Clinicians at our institution reviewed the charts of these admissions to determine whether each was due to a true episode of ASUC. Logistic regression, random forest (RF), and support vector machine (SVM) models were trained on administrative claims data for all admissions.
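The model comparison described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the features are synthetic stand-ins generated by scikit-learn's `make_classification`, the roughly 7% positive rate only loosely mirrors the cohort's class imbalance, and the function name `compare_models`, the hyperparameters, and the 75/25 split are all assumptions made for illustration.

```python
# Illustrative sketch only: synthetic features stand in for claims-derived
# variables; nothing here reproduces the study's data or tuning.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_models(seed=0):
    """Fit three classifiers and return their validation AUROC by name."""
    # Imbalanced binary problem (assumed ~7% positives).
    X, y = make_classification(
        n_samples=2000, n_features=20, n_informative=5,
        weights=[0.93, 0.07], random_state=seed,
    )
    # Stratify so both splits keep the same class balance.
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    models = {
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "random_forest": RandomForestClassifier(
            n_estimators=200, random_state=seed
        ),
        "svm": make_pipeline(StandardScaler(), SVC()),
    }
    aurocs = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # AUROC needs a continuous score: use the decision function when
        # the model exposes one, otherwise the positive-class probability.
        if hasattr(model, "decision_function"):
            scores = model.decision_function(X_val)
        else:
            scores = model.predict_proba(X_val)[:, 1]
        aurocs[name] = roc_auc_score(y_val, scores)
    return aurocs


if __name__ == "__main__":
    for name, auc in compare_models().items():
        print(f"{name}: validation AUROC = {auc:.3f}")
```

Scaling is applied only to the logistic regression and SVM pipelines because tree ensembles are insensitive to feature scale; this is a design choice of the sketch, not something stated in the abstract.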

Results 268 ASUC admissions and 3,725 non-ASUC admissions among UC patients were included. Our RF model exhibited the best performance, correctly classifying 95.5% of admissions as either ASUC or non-ASUC, with a validation AUROC of 0.96 (95% CI 0.94-0.98; AUPRC 0.73). The model had a sensitivity of 81.5% and specificity of 96.5%. The five most important features in the model were endoscopy of sigmoid colon, length of stay, age, endoscopy of rectum, and abdominal x-ray.
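The reported accuracy can be cross-checked against the reported sensitivity and specificity. The sketch below assumes, purely for illustration, that the validation set has the same ASUC prevalence as the full cohort (268 of 3,993 admissions). The abstract does not state the validation split, so the derived positive predictive value is an estimate under that assumption and not a reported result. The function name `implied_metrics` is hypothetical.

```python
# Worked arithmetic: accuracy and PPV implied by sensitivity, specificity,
# and an ASSUMED prevalence equal to that of the full cohort.
def implied_metrics(n_pos, n_neg, sensitivity, specificity):
    """Return prevalence, accuracy, and PPV implied by the inputs."""
    prevalence = n_pos / (n_pos + n_neg)
    # Fractions of all admissions falling in each confusion-matrix cell.
    true_pos = sensitivity * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    return {
        "prevalence": prevalence,
        "accuracy": true_pos + true_neg,
        "ppv": true_pos / (true_pos + false_pos),
    }


if __name__ == "__main__":
    # Counts, sensitivity, and specificity are taken from the Results.
    metrics = implied_metrics(268, 3725, sensitivity=0.815, specificity=0.965)
    for key, value in metrics.items():
        print(f"{key}: {value:.3f}")
```

Under that assumption the implied accuracy is about 95.5%, consistent with the reported figure, and the implied positive predictive value is roughly 63%: with ASUC making up under 7% of admissions, even a 3.5% false-positive rate produces a sizeable share of false alarms.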

Conclusions ASUC has no unique diagnosis code, and there is currently no scalable way to identify it in claims databases for research or clinical purposes. We have developed a machine learning-based model that identifies clinically significant ASUC admissions and reliably distinguishes them from admissions for other reasons among UC patients. The ability to automatically curate large, accurate datasets of non-coded conditions like ASUC can serve as the basis of large-scale analyses that maximize our ability to learn from real-world data, enable future research, and improve our understanding of these diseases.

Summary There is currently no accurate way to identify, track, or study acute severe ulcerative colitis (ASUC) using administrative claims datasets. We have built a machine learning model to identify ASUC from claims data to enable large-scale studies on this condition.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

JM is supported by T15LM007092 from the NLM/NIH and the Biomedical Informatics and Data Science Research Training (BIRT) Program of Harvard University. WY is supported by T32HD040128 from the NICHD/NIH.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

This study was reviewed and approved by the Institutional Review Board (IRB) of Beth Israel Deaconess Medical Center (BIDMC).

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Footnotes

  • Abbreviations: Acute severe ulcerative colitis (ASUC); Electronic health record (EHR)

  • Conference Presentation: 14th Annual Academic Surgical Congress, Houston, TX, USA

  • Conflicts of Interest and Financial Disclosures: None

  • Support: National Library of Medicine/National Institutes of Health; National Institute of Child Health and Human Development/National Institutes of Health; Harvard University

Data Availability Statement

The data underlying this article cannot be shared publicly because they contain protected health information on the patients included in this study.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted January 28, 2025.

Subject Area

  • Health Informatics