
Characterizing Variability of EHR-Driven Phenotype Definitions

Pascal S. Brandt, Abel Kho, Yuan Luo, Jennifer A. Pacheco, Theresa L. Walunas, Hakon Hakonarson, George Hripcsak, Cong Liu, Ning Shang, Chunhua Weng, Nephi Walton, David S. Carrell, Paul K. Crane, Eric Larson, Christopher G. Chute, Iftikhar Kullo, Robert Carroll, Josh Denny, Andrea Ramirez, Wei-Qi Wei, Jyoti Pathak, Laura K. Wiley, Rachel Richesson, Justin B. Starren, Luke V. Rasmussen
doi: https://doi.org/10.1101/2022.07.10.22277390
Pascal S. Brandt
1 Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA
  • For correspondence: psbrandt{at}gmail.com
Abel Kho
2 Northwestern University Feinberg School of Medicine, Chicago, IL
Yuan Luo
2 Northwestern University Feinberg School of Medicine, Chicago, IL
Jennifer A. Pacheco
2 Northwestern University Feinberg School of Medicine, Chicago, IL
Theresa L. Walunas
2 Northwestern University Feinberg School of Medicine, Chicago, IL
Hakon Hakonarson
3 Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA
George Hripcsak
4 Department of Biomedical Informatics, Columbia University, New York, NY
Cong Liu
4 Department of Biomedical Informatics, Columbia University, New York, NY
Ning Shang
4 Department of Biomedical Informatics, Columbia University, New York, NY
Chunhua Weng
4 Department of Biomedical Informatics, Columbia University, New York, NY
Nephi Walton
5 Intermountain Precision Genomics, Intermountain Healthcare, St George, UT
David S. Carrell
6 Kaiser Permanente Washington Health Research Institute, Seattle, WA
Paul K. Crane
7 Department of Medicine, University of Washington, Seattle, WA
Eric Larson
8 Institute for Learning and Brain Sciences, University of Washington, Seattle, WA
Christopher G. Chute
9 Schools of Medicine, Public Health, and Nursing, Johns Hopkins University, Baltimore, MD
Iftikhar Kullo
10 Department of Medicine, Mayo Clinic, Rochester, MN
Robert Carroll
11 Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
Josh Denny
12 All of Us Research Program, National Institutes of Health, Bethesda, MD
Andrea Ramirez
11 Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
Wei-Qi Wei
13 Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
Jyoti Pathak
14 Weill Cornell Medicine, New York, NY
Laura K. Wiley
15 Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO
Rachel Richesson
16 Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, MI
12 All of Us Research Program, National Institutes of Health, Bethesda, MD
Justin B. Starren
2 Northwestern University Feinberg School of Medicine, Chicago, IL
Luke V. Rasmussen
2 Northwestern University Feinberg School of Medicine, Chicago, IL

ABSTRACT

Objective Analyze a publicly available sample of rule-based phenotype definitions to characterize and evaluate the types of logical constructs used.

Materials & Methods We analyzed a sample of 33 phenotype definitions that are used in research, published to the Phenotype KnowledgeBase (PheKB), and represented using Fast Healthcare Interoperability Resources (FHIR) and Clinical Quality Language (CQL). The analysis was automated and operated on the computable representation of the CQL libraries.

Results Most of the phenotype definitions include narrative descriptions and flowcharts, while few provide pseudocode or executable artifacts. Most use 4 or fewer medical terminologies. The number of codes used ranges from 5 to 6865, and value sets from 1 to 19. We found the most common expressions used were literal, data, and logical expressions. Aggregate and arithmetic expressions are the least common. Expression depth ranges from 4 to 27.
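
The expression counts and depths reported above can be illustrated with a small sketch. It is not the authors' analysis tool: the nested dictionary is a hypothetical, heavily simplified stand-in for the computable representation of a CQL library, and the names `walk`, `summarize`, and `example` are assumptions made for illustration. It assumes every expression node carries a `type` field and counts depth as the number of nested typed nodes.

```python
from collections import Counter
from typing import Any, Tuple


def walk(node: Any, depth: int, counts: Counter) -> int:
    """Tally nodes by their "type" field and return the deepest level reached."""
    if isinstance(node, dict):
        if "type" in node:
            # Treat any dict with a "type" field as one expression node.
            counts[node["type"]] += 1
            depth += 1
        deepest = depth
        for value in node.values():
            deepest = max(deepest, walk(value, depth, counts))
        return deepest
    if isinstance(node, list):
        return max((walk(item, depth, counts) for item in node), default=depth)
    return depth  # strings, numbers, and other leaves add no depth


def summarize(expression: dict) -> Tuple[Counter, int]:
    """Return (expression type counts, maximum expression depth)."""
    counts: Counter = Counter()
    max_depth = walk(expression, 0, counts)
    return counts, max_depth


# Hypothetical criterion: age >= 18 AND at least one Condition record exists.
example = {
    "type": "And",
    "operand": [
        {
            "type": "GreaterOrEqual",
            "operand": [
                {"type": "ExpressionRef", "name": "AgeInYears"},
                {"type": "Literal", "value": "18"},
            ],
        },
        {
            "type": "Exists",
            "operand": {"type": "Retrieve", "dataType": "Condition"},
        },
    ],
}

counts, max_depth = summarize(example)
print(dict(counts))
print("max depth:", max_depth)
```

Under these assumptions, a single recursive pass yields both the per-type tallies and the maximum nesting depth. A real analysis would also need to separate expression nodes from type specifiers and other metadata.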

Discussion Despite the range of conditions, all of the phenotype definitions consisted of logical criteria, representing both clinical and operational logic, and tabular data, consisting of codes from standard terminologies and keywords for natural language processing. The total number and variety of expressions is low, possibly because authors aim to simplify implementation or limit complexity in response to data availability constraints.

Conclusion The phenotypes analyzed show significant variation in specific logical, arithmetic and other operators, but are all composed of the same high-level components, namely tabular data and logical expressions. A standard representation for phenotype definitions should support these formats and be modular to support localization and shared logic.

Competing Interest Statement

Pascal S. Brandt was a consultant for Commure, Inc. while completing the work presented.

Funding Statement

This work was conducted during the third phase of the eMERGE Network, which was initiated and funded by the NHGRI through the following grants: U01HG008657 (Group Health Cooperative/University of Washington); U01HG008685 (Brigham and Women's Hospital); U01HG008672 (Vanderbilt University Medical Center); U01HG008666 (Cincinnati Children's Hospital Medical Center); U01HG006379 (Mayo Clinic); U01HG008679 (Geisinger Clinic); U01HG008680 (Columbia University Health Sciences); U01HG008684 (Children's Hospital of Philadelphia); U01HG008673 (Northwestern University); U01HG008701 (Vanderbilt University Medical Center serving as the Coordinating Center); U01HG008676 (Partners Healthcare/Broad Institute); U01HG008664 (Baylor College of Medicine); and U54MD007593 (Meharry Medical College). LVR, JBS, AK, YL, JAP, and TLW received additional support from NHGRI grant U01HG011169. PSB was funded by the Fulbright Foreign Student Program and the South African National Research Foundation.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

All data produced in the present study are available upon reasonable request to the authors.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Posted July 10, 2022.

Subject Area

  • Health Informatics