Predicting Infection-related Consultations on Intensive Care Units - Development of a Machine Learning Prediction Model

Objectives: Infection-related consultations on intensive care units (ICU) are an important cornerstone in the care for critically ill patients with (suspected) infections. The positive impact of consultations on quality of care and clinical outcome has previously been demonstrated. However, timing is essential and to date consultations are typically event-triggered and reactive. Here, we investigate a proactive approach by predicting infection-related consultations using machine learning models and routine electronic health records (EHR). Methods: We used data from a mixed ICU at a large academic tertiary care hospital including 9684 admissions. EHR data comprised demographics, laboratory results, point-of-care tests, vital signs, line placements, and prescriptions. Consultations were performed by clinical microbiologists. The predicted target outcome (occurrence of a consultation) was modelled using random forest (RF), gradient boosting machines (GBM), and long short-term memory neural networks (LSTM). Results: Overall, 7.8% of all admissions received a consultation. Time-sensitive modelling approaches and increasing numbers of patient features (parameters) performed better than static approaches in predicting infection-related consultations at the ICU. Splitting a patient admission into eight-hour intervals and using LSTM resulted in the accurate prediction of consultations up to eight hours in advance with an area under the receiver operator curve of 0.921 and an area under the precision recall curve of 0.673. Conclusion: We could successfully predict infection-related consultations on an ICU up to eight hours in advance, even without using classical triggers, such as (interim) microbiology reports. Predicting this key event can potentially streamline ICU and consultant workflows and improve care and outcome for critically ill patients with (suspected) infections.


Introduction
Intensive care units (ICU) are hospital departments where very complex care is delivered.
This complexity unfolds on multiple levels. Foremost, on the medical level, ICUs are designed for the most severely ill patients. This requires advanced medical technology for interventions, e.g. mechanical ventilation and continuous patient monitoring. The complexity in patient care is furthermore increased by the possibility that the patient's status can quickly deteriorate. This requires prompt actions, such as the immediate administration of fluids and antimicrobials within the first hour in suspected sepsis patients [1]. On the organizational level, highly skilled ICU personnel are needed to run these units and to care for patients in continuously rotating shifts. In addition, external specialists come to the ICU for consultations. For patients with a (suspected) infection, additional input on clinical microbiology, infectious diseases (ID), or antimicrobial stewardship is usually consultation-based.
The positive impact of infection-related consultations on patient care and outcome has been shown in several scenarios. A systematic review and meta-analysis of observational studies demonstrated an improvement of quality of care and reduction of mortality in patients with Staphylococcus aureus bacteraemia through infection-related consultations [2]. Face-to-face/bedside visits showed an improved quality of care for bacteraemic patients compared to phone-based consultations alone [3,4]. Consultations in a face-to-face or handshake approach were also found to be positively associated with quality of care in antimicrobial stewardship studies [5,6]. However, the optimal timing and trigger for infection-related [...] i) identification of a need for a consultation by the ICU team followed by contacting the specialist (scheduled or urgent consultation); ii) identification of a need for a consultation in the microbiological laboratory, e.g. isolation of a multi-drug resistant pathogen from a patient's specimen and notification of the respective specialist; iii) by-catch of a patient in need of consultation during an already ongoing consultation round; iv) routine monitoring of newly admitted patients by a clinical microbiologist/ID specialist. The timeliness of a consultation is a key element of high-quality consultations [7]. This can be supported and improved through process automation. Automatic notification from the microbiology laboratory to trigger ID consultations can result in a significantly decreased delay to consultation and improved quality of care [8,9]. However, while a laboratory-based approach is usually easy to implement and already part of the routine in many settings, automating the identification of patients in need of a consultation on the ICU side is more challenging.
medRxiv preprint doi: https://doi.org/10.1101/2021.03.31.21254530; this version posted April 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
The identification of a need for a consultation by the ICU team is based on a plethora of available information. A patient's history, lab results, vital signs, clinical examination, clinical risk scores, and the patient's development over time produce large amounts of data points.
All of this information is taken into account by the ICU team with their expertise and experience for any clinical decision-making process. The framework of this process and its complexity is under research and the list above is far from complete [10][11][12][13]. On average intensivists make 8.9 decisions per patient per day [12]. The decision to initiate an external, infection-related consultation could, for a simplified example, be driven by several combined factors: changes in infection-related lab results such as an increase in C-reactive protein (CRP); deteriorating vital signs (e.g. increase in heart rate and decrease in blood pressure); infection-suspected results in imaging (e.g. pulmonary infiltrates in chest x-ray images); and the lack of a (timely) response to administered antimicrobial therapy. While individual clinical reasoning of ICU team members is more difficult to store in electronic health records (EHR), large numbers of data points, such as the example above, are generated at ICUs every second. This data could be used to automate, inform, and support the process of notification and triggering of infection-related consultations.
Machine learning, statistical tools to identify patterns in large amounts of data, could be ideally suited for the task to support the triggering of infection-related consultations. The use of machine learning in infectious diseases and microbiology is increasing, covers a wide range of infection-related aspects, and is often based on ICU data [14,15]. A potential utility of machine learning was established for detecting bacteraemia and sepsis or post-surgery complications [14,15]. However, the notification, initiation, or triggering of infection-related consultations has not yet been the subject of machine learning research. Therefore, this study aimed at identifying and predicting the need for an infection-related consultation in ICU patients by developing a machine learning model using data routinely collected in the EHR.

Study setting
This study was performed at the University Medical Center Groningen (UMCG), a 1,339-bed academic tertiary care hospital in the North of the Netherlands. Ethical approval was obtained from the institutional review board of the UMCG and informed consent was waived due to the retrospective observational nature of the study (METc 2018/081). The study included patients admitted to the 42-bed multidisciplinary adult ICU at the UMCG between March 3, 2014 and December 2, 2017, based on the use and database availability of the local EHR system during this time. Patients were included in this study if they were registered in this EHR system, did not object to the use of their data in the UMCG objection registry, and were at least 18 years old at the time of ICU admission. All patient data was anonymised before analysis. The study design including the data processing is shown in Figure 1.
Figure 1. Study design and data processing for three different data sources (hospital database, ICU database, and microbiology database). Of note: (interim) microbiology reports were not included in the modelling process. Standard data cleaning processes are not shown.
Raw vital signs were cleaned for physiologically impossible values (e.g. systolic arterial blood pressure smaller than diastolic arterial blood pressure), which usually indicated faulty measurements (e.g. through kinked lines). Line placement data was transformed to a binary feature indicating the presence or absence of an intravenous or arterial line per minute and line type. Prescription data were filtered to include prescriptions of the categories: antimicrobials (identified through agent specific codes in the EHR), blood products, circulatory/diuresis, colloids, crystalloids, haemostasis, inhalation, cardiopulmonary resuscitation, and sedatives. Selective digestive decontamination (SDD) is a standard procedure in our hospital and was thus indicated by a distinct variable to avoid confusion with antimicrobial agents for other purposes [16]. All prescriptions were transformed to binary features indicating the presence of a prescription per minute, agent, and type of administration. Additional binary features were introduced indicating the presence of a prescription per prescription category and type of administration.
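The cleaning rule and the per-minute binary encoding described above could be sketched as follows. This is a minimal illustration with pandas, not the study's code: the column names (`sbp`, `dbp`), both helper functions, and the example time stamps are hypothetical, and setting both pressures to missing is only one possible way of handling an impossible pair.

```python
import numpy as np
import pandas as pd


def clean_blood_pressure(vitals: pd.DataFrame) -> pd.DataFrame:
    """Set physiologically impossible arterial pressures to missing.

    Assumes hypothetical columns `sbp` (systolic) and `dbp` (diastolic).
    """
    cleaned = vitals.copy()
    impossible = cleaned["sbp"] < cleaned["dbp"]
    cleaned.loc[impossible, ["sbp", "dbp"]] = np.nan
    return cleaned


def presence_per_minute(start, stop, index: pd.DatetimeIndex) -> pd.Series:
    """Return 1 for every minute in `index` within [start, stop), else 0."""
    active = (index >= pd.Timestamp(start)) & (index < pd.Timestamp(stop))
    return pd.Series(active.astype(int), index=index)


# Five minutes of made-up measurements; the second row is impossible.
minutes = pd.date_range("2015-01-01 08:00", periods=5, freq="min")
vitals = pd.DataFrame(
    {
        "sbp": [120.0, 60.0, 118.0, 115.0, 117.0],
        "dbp": [80.0, 90.0, 79.0, 78.0, 80.0],
    },
    index=minutes,
)
cleaned = clean_blood_pressure(vitals)

# A line (or prescription) that is present from 08:01 up to, not including, 08:03.
line_present = presence_per_minute("2015-01-01 08:01", "2015-01-01 08:03", minutes)
```

The same `presence_per_minute` helper could be applied once per line type, agent, or prescription category to obtain one binary column each.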
Missing values were filled with the last available data point. This carry-forward imputation was used to mimic common physician behaviour. Remaining missing numeric values were imputed using the median of the feature's overall distribution. Near-zero variance predictors (features not showing a significant variance) were dropped from the dataset at a ratio of 95/5 for the most common to the second most common value. All data were merged per minute of ICU stay. Each patient's stay was treated as an independent stay but a readmission feature was introduced.
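The imputation and filtering steps could look roughly like the sketch below. It rests on several assumptions that the text does not state: values are carried forward within an admission only, the median is taken over the observed (pre-imputation) values, and the near-zero variance check uses the 95/5 frequency ratio alone. The column name `admission_id` and both function names are hypothetical.

```python
import pandas as pd


def impute(df: pd.DataFrame, id_col: str = "admission_id") -> pd.DataFrame:
    """Carry the last observation forward within each admission, then fill
    what is still missing with the median of the observed values."""
    features = [col for col in df.columns if col != id_col]
    medians = df[features].median()
    out = df.copy()
    out[features] = out.groupby(id_col)[features].ffill()
    out[features] = out[features].fillna(medians)
    return out


def near_zero_variance(df: pd.DataFrame, ratio: float = 95 / 5) -> list:
    """List columns whose most common value is more than `ratio` times as
    frequent as the second most common one (constant columns included)."""
    flagged = []
    for col in df.columns:
        counts = df[col].value_counts()
        if len(counts) < 2 or counts.iloc[0] / counts.iloc[1] > ratio:
            flagged.append(col)
    return flagged


# Made-up example: the first value of admission 2 has nothing to carry forward.
data = pd.DataFrame(
    {
        "admission_id": [1, 1, 1, 2, 2],
        "crp": [10.0, None, 30.0, None, 50.0],
    }
)
imputed = impute(data)

binary = pd.DataFrame(
    {"rare": [0] * 96 + [1] * 4, "common": [0] * 50 + [1] * 50}
)
to_drop = near_zero_variance(binary)
```

In this example `rare` is flagged because its value ratio of 96/4 exceeds 95/5, while `common` is kept.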

Cohort investigation
Descriptive analyses were stratified by consultation status (consultation vs. no consultation).
Baseline patient characteristics were assessed and compared with Fisher's exact test for categorical features and Student's t-test for continuous features. Logistic regression was used to create an explanatory model for clinical microbiology consultations using the baseline features. Odds ratios with 95% confidence intervals were used in the results presentation.
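For reference, an odds ratio and its 95% confidence interval follow from a fitted logistic regression coefficient in the usual way. The notation is introduced here for illustration ($\hat{\beta}_j$ is the estimated coefficient of feature $j$ and $\widehat{\mathrm{SE}}$ its standard error), and the Wald-type interval is an assumption, since the text does not state how the intervals were computed.

```latex
\mathrm{OR}_j = \exp\!\big(\hat{\beta}_j\big), \qquad
\mathrm{CI}_{95\%} = \Big[\,
  \exp\!\big(\hat{\beta}_j - 1.96\,\widehat{\mathrm{SE}}(\hat{\beta}_j)\big),\;
  \exp\!\big(\hat{\beta}_j + 1.96\,\widehat{\mathrm{SE}}(\hat{\beta}_j)\big)
\,\Big]
```

An odds ratio above 1 with an interval excluding 1 corresponds to a significant positive association with consultations, and one below 1 to a negative association.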

Modelling process
Three different modelling approaches were used and evaluated in this study to predict an ICU patient receiving a clinical microbiology consultation. The first approach (at-the-door model) was applied using patient features available at the time of admission: gender, age, body mass index, weekend admission, mechanical ventilation at admission, sending speciality, planned admission, readmission, and admission via the operation room. Random forest (RF) and gradient boosting machines (GBM) were used to model the occurrence of a consultation.
The second modelling approach (collapsed model) also used RF and GBM and added procedural features such as the presence of medication, lines, or performed diagnostics (see Appendix 1). Data was aggregated by taking the mean and the standard deviation over the available time series to predict the target event (consultation). Both approaches applied an 80-20 split to the dataset to generate the training and test set. Ten-fold cross-validation was used to select the optimal hyperparameters of the models. The final performance was evaluated against the held-out test set.
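A minimal sketch of the collapsed approach with scikit-learn is shown below. It runs on synthetic data with random labels, so the resulting scores carry no meaning. The feature names, the hyperparameter grid, the stratified split, and the use of `GridSearchCV` are assumptions made for illustration and are not taken from the study.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)

# Synthetic per-minute data: 200 hypothetical admissions with 30 minutes each.
n_admissions, n_minutes = 200, 30
per_minute = pd.DataFrame(
    {
        "admission_id": np.repeat(np.arange(n_admissions), n_minutes),
        "heart_rate": rng.normal(85, 15, n_admissions * n_minutes),
        "line_present": rng.integers(0, 2, n_admissions * n_minutes),
    }
)
# One random label per admission (1 = consultation).
y = pd.Series(rng.integers(0, 2, n_admissions), name="consultation")

# Collapse each admission's time series to its mean and standard deviation.
collapsed = per_minute.groupby("admission_id").agg(["mean", "std"])
collapsed.columns = ["_".join(col) for col in collapsed.columns]

# 80-20 split into a training set and a held-out test set.
X_train, X_test, y_train, y_test = train_test_split(
    collapsed, y, test_size=0.2, stratify=y, random_state=0
)

# 10-fold cross-validation on the training set to select hyperparameters.
search = GridSearchCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    param_grid={"max_depth": [2, 4]},
    cv=10,
    scoring="roc_auc",
)
search.fit(X_train, y_train)

# Predicted probabilities for the held-out test set.
test_scores = search.predict_proba(X_test)[:, 1]
```

Swapping `RandomForestClassifier` for `GradientBoostingClassifier` would give the analogous GBM variant under the same split and cross-validation scheme.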
The third approach used a long short-term memory neural network (LSTM) to model the target outcome (time-series model). LSTM is a recurrent neural network architecture that incorporates memory and feedback. This time-aware nature (and its similarity to clinical reasoning), in addition to previous reports on the beneficial use of LSTM in the field of infection management with EHR, motivated the choice of this methodology [17][18][19]. Using an LSTM has the advantage that the data do not need to be collapsed to be used in the model; all available information can be used without the need for additional feature pre-processing. The LSTM model also included all available features including vital signs (see Appendix 1). For the LSTM model a 60-20-20 split was applied to the data, with 60% of the data used for training, 20% for selecting the hyperparameters, and 20% for testing and reporting the final model performance.
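The data preparation for a time-series model of this kind could be sketched with numpy as shown below. The 60-20-20 split follows the text, and the eight-hour interval length follows the abstract. Everything else is an assumption made for illustration: splitting at the level of admission identifiers, averaging features within an interval, and pairing each interval with the label of the following one. All function names are hypothetical.

```python
import numpy as np


def split_ids(ids, seed=0):
    """Shuffle admission identifiers and split them 60-20-20."""
    ids = np.random.default_rng(seed).permutation(np.asarray(ids))
    n_train = int(0.6 * len(ids))
    n_val = int(0.2 * len(ids))
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


def to_intervals(per_minute, minutes_per_interval=480):
    """Average a (minutes, features) array over consecutive intervals.

    Trailing minutes that do not fill a whole interval are dropped.
    """
    n_intervals = per_minute.shape[0] // minutes_per_interval
    used = per_minute[: n_intervals * minutes_per_interval]
    return used.reshape(n_intervals, minutes_per_interval, -1).mean(axis=1)


def shift_one_interval(intervals, labels):
    """Pair each interval with the label of the following interval, so that
    a model learns to predict one interval (eight hours) ahead."""
    return intervals[:-1], labels[1:]


train_ids, val_ids, test_ids = split_ids(range(100))

# One made-up stay of 1000 minutes with two features.
stay = np.arange(2000, dtype=float).reshape(1000, 2)
intervals = to_intervals(stay)  # two complete eight-hour intervals
interval_labels = np.array([0, 1])  # consultation during the second interval
X, y = shift_one_interval(intervals, interval_labels)
```

Sequences prepared this way would still need to be padded or batched to a common length before being passed to an LSTM layer.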
Model performances were evaluated and compared using the area under the receiver operator curve (AUROC) and the area under the precision recall curve (AUPRC).
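Both metrics can be computed with scikit-learn as sketched below on synthetic scores. The prevalence of roughly 7.8% mirrors the study cohort, but the scores themselves are made up. Note that `average_precision_score` is a step-wise summary of the precision recall curve, which is an assumption about how AUPRC might be computed and not a statement about the study's implementation.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

rng = np.random.default_rng(0)

# Synthetic labels with about 7.8% positives (1 = consultation).
y_true = (rng.random(5000) < 0.078).astype(int)
# Hypothetical model scores: positives tend to score higher than negatives.
y_score = rng.normal(loc=y_true * 1.5, scale=1.0)

auroc = roc_auc_score(y_true, y_score)
auprc = average_precision_score(y_true, y_score)

# An uninformative model has an expected AUPRC equal to the prevalence,
# whereas its expected AUROC is 0.5 regardless of class balance.
baseline_auprc = y_true.mean()
```

This is why, in an imbalanced dataset, AUPRC is best read relative to the prevalence and not on an absolute scale.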
All data processing and analyses were performed using tidyverse and caret in R and numpy, pandas, scikit-learn, keras, and tensorflow in Python [20,21]. The data is registered in the Groningen Data Catalogue (https://groningendatacatalogus.nl). The study followed the [...]

Patient population
In total, 8,270 unique patients and 9,684 admissions were included in the study (Table 1).

Consultations
The proportion of patients with a consultation did not show a significant trend over time and centred around the mean of 7.9% (SD: 1.4) of all patients over all quarters of the study period (Figure 2). An explanatory multiple logistic regression analysis using available basic patient and administrative features was performed to identify characteristics of the consultation cohort. Several features showed a significant positive association with consultations: age, mechanical ventilation at ICU admission, and readmission (Table 2). A significant negative association with consultations was found for admission via the operation room and planned admissions.

Predicting consultations
The models developed to predict a consultation reached moderate to high diagnostic accuracy (AUROC range: 0.698-0.922) depending on the underlying concept, available features, and the applied model technique (Table 3). Model performance increased with the number of available features (at-the-door < collapsed < time-series).
Using the same set of features in the at-the-door concept, RF and GBM performed similarly with an AUROC of 0.716 and 0.698, respectively. Similar performance differences were found for RF and GBM using additional features in the collapsed concept but AUROC and AUPRC results improved compared to the at-the-door models (AUROC for RF models: 0.716 vs. 0.873). Best performance was found using LSTM with all available features (time-series concept) and time-aware data aggregation (AUROC: 0.922; AUPRC: 0.675).
Plotting the receiver operator curves revealed the superiority of the LSTM model (Figure 4). The AUROC also demonstrated the importance of the number of available features, while the model technique (RF vs. GBM) resulted in similar performances.
Given the imbalanced dataset of this study, i.e., the majority of admissions (92.2%) did not receive a consultation, assessing the AUPRC provides further information on model performance (Figure 5) [23]. The trade-off between precision (or positive predictive value) and recall (or sensitivity) can be observed and model performance can be compared to the baseline (0.078), which reflects the occurrence of consultations in the study cohort. The LSTM model was also superior in AUPRC.

Discussion
Initiating diagnostic procedures and adequate treatment, antimicrobial therapy in particular, and the correct and timely ordering of those are essential in the care for critically ill patients.
Rapid diagnostics in the (microbiological) laboratory or at point-of-care have evolved greatly over recent years and have the potential to reduce turnaround time substantially, i.e. time from order to clinical action [24,25]. However, technical solutions in the laboratory are costly and do not solve the problem of optimal resource allocation across ICU patients. Potential pre-analytical time gains are often less considered when discussing the concept of rapid diagnostics but the vital aspects of the pre-analytical time and workflow are well established [26]. These considerations also apply to clinical workflows of physicians working in multidisciplinary teams at the ICU including infection-related consultations. Moreover, our study did not include microbiological data beyond the consultation time stamps and consultations were predicted using only routine ICU data. Thus, one useful application of this prediction model could also be the support of (timely) initiation of microbiological diagnostics.
We demonstrated the successful prediction of consultations in a time shift-based approach with a maximum of eight hours in advance. This could support early identification of patients requiring a consultation and support the initiation of subsequent clinical steps (e.g., notifying consultants, performing diagnostics). Our incremental approach showed that an at-the-door model with baseline patient characteristics is not sufficient. Although achieving a moderate AUROC, the AUPRC of this model was only slightly above the baseline in this imbalanced dataset (7.8% of all patients received a consultation). Additional features markedly improved model performance. Using a machine learning technique, LSTM, that can work with time-series data such as EHR was shown to be superior in both AUROC and AUPRC measures.
The suitability of LSTM for EHR data was also confirmed in other infection-related studies with similar model performances yet different target events [17][18][19]. Although model performances are difficult to compare for different outcomes, our results show excellent performance [27]. Moreover, LSTM models offer the advantage of a reduced need for feature pre-processing.
In clinical practice a time gain of up to eight hours could be beneficial for the patient and all actors involved, intensivists and microbiologists, and help to streamline clinical workflows and allocate resources at the right time for the right patients. The positive impact of infection-related consultations was demonstrated in previous studies as mentioned above [2][3][4][5][6]. First studies on automating consultations also showed promising findings [8,9].
Despite the reactive nature of these approaches, which are triggered by the occurrence of clinical events, their results pointed towards the feasibility of data-driven consultation workflows. Our proactive approach, i.e., shifting emphasis from reacting to clinical events to predicting an outcome or event, has the potential to bring this to the next level by leveraging existing technology and data.
Our study is subject to limitations. Although we tested our model against a held-out dataset, validation and usability in a real-time scenario still need to be demonstrated in clinical practice. Machine learning models are known to degrade in performance once deployed and optimal maintenance is important. Our approach might also be affected by this phenomenon.
We worked with data that reflect human behaviour (clinical procedures as features, consultation as target outcome). In practice this could influence the model performance as human behaviour might change based on this model's predictions. In addition, both training and test datasets were drawn from a single institution and external validation is needed.
However, all features used in the final model are generic features routinely available at ICUs (see Appendix 1). This can facilitate external validation and potential implementation in local EHR systems. Although microbiology reports are also part of routine EHR, they were not included in this study by design. However, their inclusion could yield valuable insights into whether they influence model performance, which could be explored in future studies. An additional limitation to our approach is the interpretability of the LSTM model. By nature, LSTM and other deep learning techniques do not offer straightforward and interpretable measures such as, for example, feature importance, partial dependence, or individual conditional expectation. Whether this limitation hinders acceptance by clinicians needs further exploration. We argue that familiarising clinicians with sophisticated machine learning concepts through workflow-centred approaches (e.g., the one presented here) in contrast to more implication-heavy models (e.g., sepsis prediction) could help bridge machine learning technology and development to clinical deployment and acceptance.

Conclusion
This study demonstrated the feasibility of predicting infection-related consultations for ICU patients up to eight hours in advance. Using routine EHR data and applying an LSTM machine learning model resulted in excellent prediction model performance. In a real-life scenario this approach could help to streamline ICU and clinical microbiology workflows, to allocate scarce resources, and to perform timelier diagnostic and therapeutic interventions in a multi-disciplinary manner. Yet, external validation and further research into acceptance of sophisticated machine learning models by intensivists and consultants are needed. In summary, machine learning supported approaches, such as the one presented here, have great potential to further improve patient care and clinical outcome for critically ill patients with (suspected) infections.

Funding
This study was part of a project funded by the European Commission Horizon 2020 Framework Marie Skłodowska-Curie Actions (grant agreement number: 713660-PRONKJEWAIL-H2020-MSCA-COFUND-2015).

Culture (throat) X X
Urea 24 hours (urine) mmol * X
Urea (blood) mmol/l * X
Respiration rate X
Temperature X
* low/normal/high according to reference range