TY - JOUR
T1 - Predicting antibiotic resistance in hospitalized patients by applying machine learning to electronic medical records
JF - medRxiv
DO - 10.1101/2020.06.03.20120535
SP - 2020.06.03.20120535
AU - Ohad Lewin-Epstein
AU - Shoham Baruch
AU - Lilach Hadany
AU - Gideon Y Stein
AU - Uri Obolski
Y1 - 2020/01/01
UR - http://medrxiv.org/content/early/2020/06/04/2020.06.03.20120535.abstract
N2 - Background: Computerized decision support systems are becoming increasingly prevalent with advances in data collection and machine learning algorithms. However, they are scarcely used for empiric antibiotic therapy. Here we accurately predict the antibiotic resistance profiles of bacterial infections of hospitalized patients using machine learning algorithms applied to patients’ electronic medical records. Methods: The data included antibiotic resistance results of bacterial cultures from hospitalized patients, alongside their electronic medical records. Five antibiotics were examined: Ceftazidime (n=2942), Gentamicin (n=4360), Imipenem (n=2235), Ofloxacin (n=3117) and Sulfamethoxazole-Trimethoprim (n=3544). We applied lasso logistic regression, neural networks, gradient boosted trees, and an ensemble combining all three algorithms, to predict antibiotic resistance. Variable influence was gauged by permutation tests and Shapley Additive Explanations analysis. Results: The ensemble model outperformed the separate models and produced accurate predictions on a test set. When no knowledge regarding the infecting bacterial species was assumed, the ensemble model yielded area under the receiver-operating-characteristic curve (auROC) scores of 0.73-0.79 for different antibiotics. Including information regarding the bacterial species improved the auROCs to 0.8-0.88.
The effects of different variables on the predictions were assessed and found consistent with previously identified risk factors for antibiotic resistance. Conclusions: Our study demonstrates the potential of machine learning models to accurately predict antibiotic resistance of bacterial infections of hospitalized patients. Moreover, we show that rapid information regarding the infecting bacterial species can improve predictions substantially. The implementation of such systems should be seriously considered by clinicians to aid correct empiric therapy and to potentially reduce antibiotic misuse. 40-word summary: Machine learning models were applied to large and diverse datasets of medical records of hospitalized patients to predict antibiotic resistance profiles of bacterial infections. The models achieved high-accuracy predictions and interpretable results regarding the drivers of antibiotic resistance. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This project was supported by the Clore Foundation Scholars Programme (OLE). This research was supported by GPUs donated by the NVIDIA corporation grant program (LH and UO). Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The study was approved by the Helsinki Committee of Rabin Medical Center. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. Patient data is not made available. Data regarding the analysis is fully available in the supplementary material.
ER -