Machine learning models to identify patient and microbial genetic factors associated with carbapenem-resistant Klebsiella pneumoniae infection

Zena Lapp, Jennifer H Han, Jenna Wiens, Ellie JC Goldstein, Ebbing Lautenbach, Evan S Snitkin
doi: https://doi.org/10.1101/2020.07.06.20147306
Zena Lapp
(a) Department of Computational Medicine and Bioinformatics, University of Michigan; 1510A MSRB I, 1150 W. Medical Center Dr., Ann Arbor, MI, 48109-5680
Jennifer H Han
(b) GlaxoSmithKline; 14200 Shady Grove Road, Rockville, MD 20850
Jenna Wiens
(c) Department of Electrical Engineering and Computer Science, University of Michigan; 3765 Beyster, 2260 Hayward Street, Ann Arbor, MI 48109
Ellie JC Goldstein
(d) R M Alden Research Laboratory
(e) David Geffen School of Medicine, University of California, Los Angeles; 2021 Santa Monica Blvd. #640-E, Santa Monica, CA 90404-2208
Ebbing Lautenbach
(f) Department of Medicine (Infectious Diseases), Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania; 502A Johnson Pavilion, 3610 Hamilton Walk, Philadelphia, PA 19104-6073
Evan S Snitkin
(g) Department of Microbiology and Immunology, Department of Internal Medicine/Division of Infectious Diseases, University of Michigan, Ann Arbor, Michigan; 1520D MSRB I, 1150 W. Medical Center Dr., Ann Arbor, MI, 48109-5680
For correspondence: esnitkin@med.umich.edu

Abstract

Carbapenem-resistant Klebsiella pneumoniae (CRKP) is a critical-priority antibiotic resistance threat that has emerged over the past several decades, spread across the globe, and accumulated resistance to last-line antibiotic agents. While CRKP infections are associated with high mortality, only a small subset of patients acquiring CRKP colonization will develop clinical infection. Here, we sought to determine the relative importance of patient characteristics and CRKP genetic background in determining patient risk of infection. Machine learning models classifying colonization vs. infection were built using whole-genome sequences and clinical metadata from a comprehensive set of 331 CRKP isolates collected across 21 long-term acute care hospitals over the course of a year. Model performance was evaluated based on area under the receiver operating characteristic curve (AUROC) on held-out test data. We found that patient and genomic features were predictive of clinical CRKP infection to similar extents (AUROC IQRs: patient=0.59-0.68, genomic=0.55-0.61, combined=0.62-0.68). Patient predictors of infection included the presence of indwelling devices, kidney disease, and length of stay. Genomic predictors of infection included presence of the ICEKp10 mobile genetic element carrying the yersiniabactin iron acquisition system, and disruption of an O-antigen biosynthetic gene in a sub-lineage of the epidemic ST258 clone. Altered O-antigen biosynthesis increased association with the respiratory tract, and subsequent ICEKp10 acquisition was associated with increased virulence. These results highlight the potential of integrated models including both patient and microbial features to provide a more holistic understanding of patient clinical trajectories.

Importance

Multidrug-resistant organisms, such as carbapenem-resistant Klebsiella pneumoniae (CRKP), colonize alarmingly large fractions of patients in endemic regions, but only a subset of patients develop life-threatening infections. While patient characteristics influence risk for infection, the relative contribution of microbial genetic background to patient risk remains unclear. We used machine learning to determine whether patient and/or microbial characteristics can discriminate between CRKP colonization vs. infection across multiple healthcare facilities and found that both patient and microbial factors were predictive. Examination of informative microbial genetic features revealed features associated with respiratory colonization and higher rates of infection. The methods and findings presented here provide a foundation for future epidemiological, clinical, and biological studies to better understand bacterial infections and clinical outcomes.
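
The abstract reports model performance as interquartile ranges (IQRs) of AUROC on held-out test data. The sketch below illustrates, under stated assumptions, how such a summary could be computed: AUROC is taken as the probability that a randomly chosen infection case is scored above a randomly chosen colonization case, and the IQR is taken over AUROC values from repeated test sets. The function names, the number of splits, and the synthetic scores are all hypothetical and are not drawn from the authors' code; the authors' actual models and data are in their GitHub repository.

```python
import random
import statistics


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auroc_iqr(aurocs):
    """Return (first quartile, third quartile) of a list of AUROC values."""
    q1, _median, q3 = statistics.quantiles(aurocs, n=4, method="inclusive")
    return q1, q3


def toy_split_aurocs(n_splits=100, n_test=60, seed=0):
    """Simulate AUROCs from repeated held-out test sets (synthetic data only).

    Assumption: label 1 stands for infection, label 0 for colonization, and
    the hypothetical model scores infections slightly higher on average.
    """
    rng = random.Random(seed)
    labels = [0] * (n_test // 2) + [1] * (n_test - n_test // 2)
    results = []
    for _ in range(n_splits):
        scores = [rng.gauss(0.5 * y, 1.0) for y in labels]
        results.append(auroc(labels, scores))
    return results
```

Calling `auroc_iqr(toy_split_aurocs())` returns a `(Q1, Q3)` pair in the same form as the ranges quoted in the abstract, although the values come from synthetic scores and carry no meaning for CRKP.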

Competing Interest Statement

JHH was employed at the University of Pennsylvania during the conduct of this study. She is currently an employee of, and holds shares in, the GSK group of companies.

Funding Statement

This research was supported by a CDC Cooperative Agreement FOA #CK16-004-Epicenters for the Prevention of Healthcare Associated Infections, and the National Institutes of Health R01 AI139240-01 and 1R01 AI148259-01. ZL received support from the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE 1256260. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. The funding bodies had no role in the design of the study or collection, analysis, and interpretation of data, or in writing the manuscript.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The study that collected this data was reviewed and approved by the Institutional Review Board of the University of Pennsylvania with a waiver of informed consent.

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

All code, and all data that are not protected health information, are available on GitHub.

https://github.com/Snitkin-Lab-Umich/ml-crkp-infection-manuscript

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted November 04, 2020.
Subject Area

  • Infectious Diseases (except HIV/AIDS)