Summary
Background Sepsis is a major concern in critical care medicine, and early detection and intervention are key to survival. We aimed to establish an early warning system for sepsis based on a data integration system that can be implemented in the intensive care unit (ICU).
Methods We trained LightGBM and multilayer perceptron models on the open-source Medical Information Mart for Intensive Care (MIMIC) database for sepsis prediction. An ensemble sepsis prediction model was then established on the private dataset of Ruijin Hospital using transfer learning and ensemble learning. Shapley Additive Explanations analysis was applied to quantify the importance of features for the predictions. We developed a data-integrating hub to collect and transmit data from different brands of ICU medical devices and established a data integration system to receive, integrate, standardize, and store real-time clinical data. The sepsis prediction model was then deployed in the ICU of Ruijin Hospital for a real-world study of sepsis early warning in ICU management. The trial was registered with ClinicalTrials.gov (NCT05088850).
Findings Our best early warning model achieved an area under the receiver operating characteristic curve (AUC) of 0·9833 for detecting sepsis 4 h before onset on the open-source database. Our ensemble model achieved AUCs of 0·9065−0·9436 for predictions 1−5 h before onset in the retrospective study on the private database, and 0·8636−0·8992 in real-time real-world studies using the data integration system in the ICU of Ruijin Hospital. During continuous early warning for patients admitted to the ICU, 22 patients who met the diagnostic criteria for sepsis during hospitalization were predicted as positive cases, and 29 patients without sepsis were predicted as negative cases. Additionally, 17 patients were false-positive cases, and six patients who developed sepsis during their ICU stay were false-negative cases because their predicted probabilities at all time nodes were below the warning threshold of 0·7.
Interpretation With the help of the data integration system, machine learning models could allow accurate real-time inference to detect sepsis onset up to 5 h in advance. We identified features such as age, antibiotics, ventilation, and net balance as important for sepsis prediction. We argue that this system has promising potential to improve ICU management by helping medical practitioners identify patients at risk of sepsis and prepare for timely diagnosis and intervention.
Funding Shanghai Municipal Science and Technology Major Project, the ZHANGJIANG LAB, and the Science and Technology Commission of Shanghai Municipality.
Introduction
Sepsis, an infection-induced syndrome of physiological, pathological, and biochemical abnormalities, is a global healthcare issue that is associated with unacceptably high mortality and long-term morbidity among patients in an intensive care unit (ICU),1,2 and is responsible for a substantial cost burden on health care resources.3 Early detection and timely administration of appropriate antibiotics may be the most important factors to improve the prognosis of patients with sepsis.4 However, nonspecific symptoms of patients with sepsis lead to delayed diagnosis and delayed intervention.5
Machine learning has emerged as a promising tool for early detection of sepsis occurrence through intensive management based on electronic medical records, laboratory data, and biomedical signals.6,7 In 2016, Singer et al. proposed a new definition (Sepsis-3) of sepsis.2 Accordingly, many recent studies on sepsis prediction defined sepsis by Sequential Organ Failure Assessment (SOFA) and infection instead of Systemic Inflammatory Response Syndrome (SIRS).8-12 Prospective studies have shown that implementation of machine learning-based sepsis prediction algorithms in hospitals can reduce in-hospital mortality and length of stay.13,14 However, many machine learning models provide superior performance at the cost of transparency and interpretability, which has become a barrier to clinical application. Algorithms based on gradients, attention mechanisms, and Shapley values are used to interpret machine learning models.15-17
Most studies on sepsis detection used historical medical data, such as the Medical Information Mart for Intensive Care (MIMIC).18 However, implementing a detection model in the ICU for real-time prediction is complex. The raw data needed for model inference, such as bedside data, laboratory data, demographic data, and doctor’s orders, usually come from different devices, and these devices cannot exchange information directly because they use different data transfer protocols. Efforts have been made to integrate bedside medical devices.19-21 However, these data integration systems cover too few devices and data types to give doctors a complete picture. Moreover, they focused mainly on data collection and presentation and lacked further functionality, such as real-time alerts. Meanwhile, previous studies on the prediction of sepsis were mainly retrospective, and prospective studies used only relatively simple variables, deployment methods, and models.
In this study, we aimed to develop (1) a data integration system for the IntelliVue Information Center, ventilators, the Philips ICCA system, the Laboratory Information System (LIS), and the Hospital Information System (HIS); (2) an ensemble machine learning model for the prediction of sepsis based on Sepsis-3; and (3) a real-time early warning system for sepsis in the ICU, named SEpsis PREdiction System (SEPRES). We deployed SEPRES in the ICU of Ruijin Hospital and conducted real-world studies to analyze the performance of this system in the management of ICU patients.
Methods
SEPRES includes a data integration system equipped with a sepsis early warning module. The data integration system can collect, store, process, and display medical data. These functions were completed through the data integration machine, physical server, and network server. The sepsis early warning module included a sepsis prediction model and an interpretative tool. The sepsis prediction model is an ensemble of multiple machine learning models and utilizes the transfer learning technique to predict sepsis. The interpretative tool provides information on how the model works by assigning importance to the input features. Our research was approved by the Ruijin Hospital Ethics Committee (No. 2020 [140]).
Medical device integration hub
We developed a medical device integration hub that can acquire and transmit data from different brands of medical devices. The medical device integration hub consists of customized device connection lines, a hub, and an integrated data receiver. An identification module carrying a device code is inserted into each medical device, enabling the hub to identify the type of online device and collect data automatically according to the communication protocol. The integrated data receiver receives and translates the raw data and uploads them to the integration server through the local area network. The medical device integration hub has the following functions:
Device online services: detect device connections and start the data reading program corresponding to each device.
Decoding: parse raw data into structured data for further processing.
Storage: store parsed data in native memory.
Remote settings: support remote system setup and report system status.
Uploading: upload received data to the specified database.
Details of the data extraction can be found in Appendix I.
The framework of a data-integrating system
As shown in Figure 1, the system includes a physical server with a PostgreSQL database that stores the sepsis warning data and a web server that hosts the system’s user access portal. The architecture can be divided into the following parts.
Device side: The medical device integration hub transmits the device and HIS data to the data integration system through the local area network.
Data management side: Heterogeneous data are integrated into the data integration system. The interface data, service data, and model predictions are stored and managed by the Structured Query Language (SQL) server, while the parts needed for the sepsis early warning module are sent to the PostgreSQL database. The Message Queuing Telemetry Transport (MQTT) server sends real-time data from the data integration system to the browser.
Data server side: The web server responds to the browser’s request and calls the sepsis early warning module. Data preprocessing and model inference are then executed, and the predictions are stored in the PostgreSQL database. The data server side includes some related services (real-time calculation of the SOFA score, determination of suspected infection, data statistics, data charts, and historical data query).
Application side: The user’s request is passed to the web server in this layer, and the processing results are displayed in the system. JavaScript is used for dynamic HTML page development, and an AJAX interface is used for data interaction with the web server. Spring MVC is used to build full-featured MVC modules for the web application, combined with Node.js to provide a maintainable method for creating templates. Users can access the system through a browser on PCs or mobile terminals.
System deployment
Figure 2 shows the medical device integration hub installed at Ruijin Hospital. The hub is placed at the bedside, where it receives data from multiple devices via the interfaces shown at the bottom of the figure, stores the last 72 h of data in native memory, and transmits data with a delay of less than 10 s. The interfaces on the two sides of the hub include two universal network interfaces, four USB interfaces for a mouse, keyboard, and USB flash drive, two HDMI ports and one VGA port for extended display, and eight or 16 multiplexed USB and Ethernet interfaces for medical devices. The hub can integrate data from the monitor, ventilator, infusion pump, and dialysis machine. The processed data are then transmitted to the data integration system.
Sepsis prediction model
Our goal was to develop a sepsis prediction model that can run in real-time in hospitals. To avoid insufficient data in the specific hospital for training, we first trained the models in the open-source database MIMIC and then retrained them in private hospital databases using transfer learning techniques to improve the performance. The final sepsis prediction model was obtained by integrating multiple transferred models using ensemble learning techniques.
Data acquisition
Data sources and inclusion criteria
Our study used the MIMIC-III database (version 1·4)18 and the private Ruijin Hospital historical (RJ) database. MIMIC encompasses approximately 40,000 patients admitted to the ICU at Beth Israel Deaconess Medical Center in Boston from 2001 to 2012. Two tasks were established: inference on the MIMIC dataset by models trained on the MIMIC dataset and inference on the RJ dataset by models trained on the MIMIC and RJ datasets. The first task was to facilitate comparison with other articles, and the second was to apply the models clinically in Ruijin Hospital.
Patients who met all the following criteria were included in the case group:
At least 14 years old.
Sepsis onset at least 5 h after admission to the ICU.
Sepsis onset is the first instance since admission to the hospital.
Patients who met all the following criteria were included in the control group:
At least 14 years old.
Patients who stayed in the ICU for at least 5 h and have not had sepsis at this time.
Patients without ICD-9 codes for sepsis (785·52, 995·91, and 995·92).
SOFA score change of no more than 1 point within any continuous 72-h window during the ICU stay.
The third criterion was not applied to the RJ database because ICD-9 codes were not recorded.
Sepsis-label definitions
Patients were followed throughout their stay in the ICU until discharge or development of sepsis according to the definition of the Third International Consensus for sepsis (Sepsis-3).2 Specifically, if the timestamps of antibiotics (t_abx) and blood cultures (t_culture) meet the condition t_abx - 24 h ⩽ t_culture ⩽ t_abx + 72 h, the earlier of t_abx and t_culture is defined as the timestamp of suspected infection (t_sus). The SOFA score was evaluated hourly within the time window [t_sus - 48 h, t_sus + 24 h]. The first hour in which the SOFA score is two or more points higher than the lowest prior score is defined as the onset of sepsis (t_onset).
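The labeling rule described above can be sketched in Python as follows. This is a minimal illustration and not the code used in the study: the function names and the `(timestamp, score)` input format are hypothetical, and the lowest prior score is assumed to be taken over the hours inside the evaluation window only.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence, Tuple


def suspected_infection_time(t_abx: datetime, t_culture: datetime) -> Optional[datetime]:
    """Return t_sus if t_abx - 24 h <= t_culture <= t_abx + 72 h, else None."""
    if t_abx - timedelta(hours=24) <= t_culture <= t_abx + timedelta(hours=72):
        return min(t_abx, t_culture)
    return None


def sepsis_onset_time(
    t_sus: datetime, hourly_sofa: Sequence[Tuple[datetime, int]]
) -> Optional[datetime]:
    """Return t_onset, the first hour in [t_sus - 48 h, t_sus + 24 h] whose SOFA
    score is at least 2 points above the lowest earlier score in that window.

    hourly_sofa must be sorted by timestamp (assumed input format).
    """
    start = t_sus - timedelta(hours=48)
    end = t_sus + timedelta(hours=24)
    lowest = None  # lowest SOFA score seen so far inside the window
    for ts, sofa in hourly_sofa:
        if ts < start or ts > end:
            continue
        if lowest is not None and sofa - lowest >= 2:
            return ts
        lowest = sofa if lowest is None else min(lowest, sofa)
    return None
```

A stay would be labeled positive when both functions return a timestamp; stays for which either returns `None` have no sepsis onset under this sketch.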
Feature extraction
We extracted 78 and 63 patient variables from the MIMIC and RJ databases, respectively. After data cleaning, we summarized each variable at hourly intervals as features (maximum, average, median, and minimum), and missing values were filled with the nearest preceding value or with a preset default value. After filtering, we obtained the MIMIC dataset with 1057 positive and 5834 negative episodes and the RJ dataset with 115 positive and 239 negative episodes. We used a 5-h time window from the episodes to predict sepsis. These two datasets were divided into training, validation, and test sets. See Appendix II for further details.
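The hourly summarization and padding can be illustrated with the short sketch below. It is an assumption-laden example, not the study's pipeline: the function name, the `(hours_since_admission, value)` record format, and the choice to use the default value only when no earlier hour has data are all hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, median

STATS = ("max", "mean", "median", "min")


def hourly_features(records, n_hours, default):
    """Summarize one variable per hour and pad hours that have no records.

    records: iterable of (hours_since_admission, value) pairs (assumed format).
    Returns a list of n_hours dicts keyed by STATS.
    """
    buckets = defaultdict(list)
    for t, value in records:
        hour = math.floor(t)
        if 0 <= hour < n_hours:
            buckets[hour].append(value)

    rows = []
    last = None  # most recent hour that had data, used for padding
    for hour in range(n_hours):
        values = buckets.get(hour)
        if values:
            last = {
                "max": max(values),
                "mean": mean(values),
                "median": median(values),
                "min": min(values),
            }
        if last is not None:
            rows.append(dict(last))  # nearest preceding value
        else:
            rows.append({name: default for name in STATS})  # preset default
    return rows
```

For example, two heart-rate readings in hour 1 and one in hour 3 produce a default row for hour 0, computed rows for hours 1 and 3, and a copy of the hour-1 row for hour 2.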
Prediction model
Machine learning model
Multiple models were tested on the MIMIC dataset, including support vector machine (SVM), multilayer perceptron (MLP), gradient boosting machine (GBM), and long short-term memory (LSTM). For GBM, we used XGBoost22 and LightGBM23 as implementations. A detailed introduction to these models can be found in Appendix V.
Training method
Some redundant features were removed to accelerate SVM and MLP training in the tasks. Data were standardized (i.e., each feature’s value range was linearly scaled between 0 and 1) before training to eliminate magnitude differences between features. The hyperparameters and structures of each model were tuned according to the effects on the validation set. See Appendix VI for further details.
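The linear scaling to the range 0 to 1 corresponds to min-max scaling with the minimum and maximum estimated on the training set. The sketch below shows one possible form, assuming plain lists of feature rows. The class name `FeatureScaler` and the handling of constant features are illustrative choices, not details reported in the text.

```python
class FeatureScaler:
    """Min-max scaler fitted on the training set and reused at inference."""

    def fit(self, rows):
        columns = list(zip(*rows))
        self.mins = [min(col) for col in columns]
        self.maxs = [max(col) for col in columns]
        return self

    def transform(self, rows):
        scaled = []
        for row in rows:
            scaled.append([
                # Constant features map to 0.0 (assumption) to avoid dividing by zero.
                (v - lo) / (hi - lo) if hi > lo else 0.0
                for v, lo, hi in zip(row, self.mins, self.maxs)
            ])
        return scaled
```

Values outside the training range are not clipped in this sketch, so validation or real-time inputs may fall slightly outside 0 to 1.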
Transfer learning and ensemble learning
To ensure sufficient patient cases, we first trained LightGBM and MLP models on MIMIC and later transferred them to the RJ dataset in Task 2. These models were selected based on their performance in Task 1 and were used as representatives of traditional machine learning models and neural network models. They needed to be retrained from Task 1 because some variables were not available in the RJ database. Because of the population differences between the MIMIC and RJ databases, the data were standardized separately, as in Task 1. Additionally, similar features (the maximum, average, median, and minimum values of variables with low recording frequency at the same hour) were reduced to one feature to lower the dimension and help the transfer. The parameters of each model were shared as initial parameters and tuned again during training on the RJ database. Finally, we integrated the MLP and LightGBM models by averaging their outputs to make the results more robust and accurate. This result was used as the final output of the sepsis prediction model.
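The final averaging step can be written in a few lines. The sketch below covers only the ensemble step and assumes that each transferred model already outputs one probability per patient-hour. The function names are hypothetical, and the default threshold of 0·7 is the warning threshold mentioned in the Summary.

```python
def ensemble_probability(p_lightgbm, p_mlp):
    """Average the predicted probabilities of the two transferred models."""
    if len(p_lightgbm) != len(p_mlp):
        raise ValueError("both models must score the same samples")
    return [(a + b) / 2 for a, b in zip(p_lightgbm, p_mlp)]


def warnings(probabilities, threshold=0.7):
    """Flag samples whose ensemble probability reaches the warning threshold."""
    return [p >= threshold for p in probabilities]
```

Averaging keeps the output on the probability scale, so a single warning threshold can be applied to the ensemble in the same way as to either individual model.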
Role of the funding source
The funders of the study had no role in the study design, data collection, data analysis, data interpretation, or writing of the report.
Results
Sepsis prediction model
We evaluated our models based on accuracy, the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity of the test set.
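For reference, the four metrics can be computed from labels and predicted probabilities as sketched below. This is a generic illustration with hypothetical function names. The AUC is computed as the fraction of positive-negative pairs ranked correctly, with ties counted as one half, which is equivalent to the area under the ROC curve.

```python
def threshold_metrics(y_true, y_prob, threshold):
    """Accuracy, sensitivity, and specificity at a given probability threshold."""
    tp = fn = fp = tn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = prob >= threshold
        if label == 1 and predicted:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }


def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative."""
    pos = [p for label, p in zip(y_true, y_prob) if label == 1]
    neg = [p for label, p in zip(y_true, y_prob) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Unlike accuracy, sensitivity, and specificity, the AUC does not depend on the chosen threshold, which makes it convenient for comparing models before a warning threshold is fixed.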
Performance on the MIMIC-III dataset
In Task 1, the AUCs of XGBoost and LightGBM were the highest among these sepsis prediction models, followed by MLP and LSTM, and SVM performed the worst. Appendix VII shows the full performance of the five models. Although the GBM structure is relatively simple, it outperformed the artificial neural network models. We compared our models with other models that were trained on the same open-source database, MIMIC, used the Sepsis-3 criteria, and reported prediction results within 5 h before the onset of sepsis, including InSight,8 AISE,9 MGP-TCN, DTW-KNN,10 MLA,11 DSPA,12 and MGP-AttTCN.16 Table 1 shows that our models generally outperform the others. However, although these models all use the MIMIC-III database, their training and test sets still differ because of differences in the details of case extraction and sepsis criteria; therefore, the comparison is less standardized than typical machine learning comparisons based on benchmark data.
Performance on the RJ dataset
In Task 2, after transfer learning and ensemble learning, the final performance of the sepsis prediction model is shown in Table 2. The overall performance was similar to that of LightGBM or MLP transferred models, and the average value of AUC was slightly higher than that of LightGBM or MLP. Detailed results of transfer learning can be found in Appendix VIII.
Feature interpretability
We used Shapley additive explanation (SHAP)24 analysis to explore the importance of the features. For LightGBM models in SEPRES, antibiotics, respiratory rate, total positive end-expiratory pressure level, fibrinogen level, temperature, net balance, and age were important in most models. For MLP models, age, respiratory rate, ventilation, heart rate, antibiotics, and temperature were important in most models. Some of these features (antibiotics, respiratory rate, temperature, ventilation, and heart rate) were related to the definition of Sepsis-3 or SIRS, while there is also literature arguing for an association between some of these features (respiratory rate,25 fibrinogen,26 net balance,27 and age28) and the severity or mortality of sepsis. Detailed results are provided in Appendix IX.
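To make the idea behind the Shapley values concrete, the sketch below computes exact Shapley values for a small model by enumerating all feature coalitions. It is a didactic example under stated assumptions and does not reproduce the SHAP library or the analysis in this study: absent features are replaced by a single baseline vector, whereas SHAP typically averages over a background dataset and uses faster model-specific algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x relative to a baseline input.

    f: callable taking a list of feature values and returning a number.
    Features outside a coalition take their baseline value (assumption).
    Cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of coalitions of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi
```

The resulting values sum to the difference between the model output at `x` and at the baseline, which is the property that lets each prediction be decomposed into per-feature contributions.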
SEPRES
Model inference in SEPRES
The detailed steps of model inference are as follows:
Obtain real-time features of patients using SQL query statements.
Standardize the features using the scaler fitted on the training set.
Call the trained model to get the prediction results.
Call the interpretive tool to get the importance of the features based on the prediction results.
Output and store prediction results and interpretations in a standard format.
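The five steps above can be sketched end to end as follows. Everything specific in this example is a stand-in chosen for illustration: an SQLite connection replaces the PostgreSQL database, the table and column names (`hourly_features`, `predictions`) are hypothetical, a small logistic model replaces the trained ensemble, and the per-feature contribution `weight * scaled value` is only a placeholder for the SHAP-based interpretive tool.

```python
import json
import math
import sqlite3

FEATURES = ["age", "resp_rate", "temperature"]  # hypothetical feature subset


def fetch_features(conn, patient_id):
    """Step 1: query the latest hourly feature row for one patient."""
    columns = ", ".join(FEATURES)
    row = conn.execute(
        f"SELECT {columns} FROM hourly_features "
        "WHERE patient_id = ? ORDER BY hour DESC LIMIT 1",
        (patient_id,),
    ).fetchone()
    return list(row) if row else None


def scale(x, mins, maxs):
    """Step 2: apply the min-max scaler fitted on the training set."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(x, mins, maxs)]


def predict(x_scaled, weights, bias):
    """Step 3: stand-in logistic model returning a probability."""
    z = bias + sum(w * v for w, v in zip(weights, x_scaled))
    return 1.0 / (1.0 + math.exp(-z))


def explain(x_scaled, weights):
    """Step 4: placeholder contributions (not SHAP values)."""
    return {name: w * v for name, w, v in zip(FEATURES, weights, x_scaled)}


def run_inference(conn, patient_id, model):
    """Run steps 1 to 5 and return the predicted probability, or None."""
    x = fetch_features(conn, patient_id)
    if x is None:
        return None
    x_scaled = scale(x, model["mins"], model["maxs"])
    probability = predict(x_scaled, model["weights"], model["bias"])
    contributions = explain(x_scaled, model["weights"])
    # Step 5: store the prediction and its explanation in a standard format.
    conn.execute(
        "INSERT INTO predictions (patient_id, probability, explanation) "
        "VALUES (?, ?, ?)",
        (patient_id, probability, json.dumps(contributions)),
    )
    conn.commit()
    return probability
```

In a deployment, `run_inference` would be called once per patient per hour, and the stored rows would feed the display board and the warning logic.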
System operation
SEPRES provides predictions and explanations for every patient in the ICU every hour, including the risk of sepsis onset in the next 5 h, the influence of features on the predictions calculated by SHAP, and SOFA predictions. It has been operating at Ruijin Hospital for several months, providing hourly early warning services for over 100 patients in the ICU.
The PC terminal of the user interface is installed in the ICU common room. Figure 3 presents an example of the display board for all patients in the ICU, including predictions of sepsis onset and SOFA changes, where high and low risks are shown by red and blue bars, respectively. Figure 4 shows the data details for a specific patient, which allow the current and past status of the patient to be observed.
Real-time performance of the sepsis prediction model in the ICU of the Ruijin Hospital
We extracted a total of 67 ICU stays from February 2021 to June 2021 from the system. Each stay was labeled by the change in SOFA score and the doctor’s examination for infection (based on antibiotics or blood culture), and 40 of these stays were labeled as having at least one sepsis onset at a threshold of 0·5. Data from the control group and near onset of sepsis in the case group were included in the analysis. The statistical results of the predictions are presented in Table 3.
Case studies
We discussed several of these cases, including true-positive, true-negative, false-positive, and false-negative cases. Figure 5 illustrates the model predictions for positive cases near the onset of sepsis or negative cases over a random period (see Appendix X for more details).
Our model can detect sepsis early in most cases, although there are a small number of false-negative and false-positive cases. We propose that a possible reason for the false negatives is that our models tend to give lower predictions when the collected data are limited. Early prediction of sepsis by our model guided medical practitioners to pay closer attention to the patients concerned, leading to early diagnosis of sepsis and more efficient management of ICU patients.
Discussion
Machine learning methods have been considered promising for early warning of sepsis in the ICU.6-17 Early diagnosis and timely management of septic patients can effectively improve prognosis.29 However, sepsis may not be diagnosed in time in the clinic because of shift handovers among doctors and the day-night rotation of medical staff. Therefore, an accurate and efficient early prediction system for sepsis at the bedside is important.
In this study, we established an ICU bedside sepsis early warning system, SEPRES, to conduct real-time sepsis prediction for patients in the ICU through a data integration system. In contrast to most studies on machine-learning sepsis prediction, which rely on open-source databases, SEPRES has been deployed in the ICU of Ruijin Hospital, where it performs real-time inference and analysis by integrating data from the IntelliVue Information Center, ventilators, the Philips ICCA system, the LIS, and the HIS. Additionally, the system can display the patient’s historical data in the user interface to help doctors readily follow changes in the patient’s condition. Although SEPRES cannot provide a definitive basis for the therapeutic regimen, the predicted probability of sepsis occurrence allows us to pay more attention to specific patients. Furthermore, the weight analysis of medical factors can provide insights into the choice of therapeutic regimens.
To limit the effect of insufficient data and of heterogeneous data distributions across medical centers on the training and inference of machine learning models, we used transfer learning to improve performance at the specific medical center. In particular, MIMIC-III mainly enrolled white patients (40996 of 58976 hospital admissions), whereas the private dataset of Ruijin Hospital is mainly composed of Chinese patients, and the two populations differ significantly in certain features of the sepsis prediction model (see Appendix XI). The transfer learning process improved the prediction AUC of the LightGBM models from 0·8613−0·8913 to 0·8964−0·9348 on the historical data of Ruijin Hospital.
The interpretive tool may help medical practitioners identify risk factors. In the SHAP analysis of the models loaded in SEPRES, we paid special attention to the insights offered by the importance of net balance. As shown in Supplementary Figure 2, a positive net balance indicates a higher risk of sepsis. Because the net balance is nursing data that are difficult to collect, 4079 out of 6891 episodes in the MIMIC dataset have no balance data for colloid and crystalloid solutions, whereas in the RJ dataset only 25 out of 329 episodes had no net balance data. Consequently, most machine learning models based on MIMIC datasets have neither included net balance as a feature nor analyzed it as an important factor for sepsis prediction. Our SHAP analysis showed that a negative net balance tended to decrease the predicted probability of sepsis. Indeed, positive cumulative fluid balance has been reported to be an independent predictor of ICU mortality.27 Furthermore, Lin et al. have shown that patients with an early positive fluid balance have an increased risk of developing venous thromboembolism.30 The weight of net balance in our SHAP analysis further emphasizes the importance of careful fluid management in critically ill patients. Therefore, we argue that including the net balance in the prediction model may improve not only model performance but also clinical management in the ICU.
Our model has certain limitations. First, we enrolled only patients who were non-septic during the entire period in the ICU as negative controls. This enrollment criterion may yield a control group that is too clean to establish a model for predicting the onset period of sepsis, which may cause false-positive cases. Second, as we observed in consecutive case studies, patients diagnosed with sepsis shortly after being transferred to the ICU were difficult for the model to predict. That is, a short period of data recording may cause false-negative cases. Finally, our model incorporates variables such as antibiotics and mechanical ventilation, so its predictions are influenced by the subjective decisions of the doctor.
These limitations will be addressed in future work through diverse methods, including fine-grained labeling, inclusion of data collection from the ICU, and data augmentation.
Moreover, this workflow applies to disease warnings other than sepsis in the ICU, such as disseminated intravascular coagulation and acute kidney injury, with the help of the data integration system to collect the necessary features and data for model construction.
Contributors
WL, YT, LL, QC, RL, and Lin C conceived and designed the study. YT, LL, QC, RL, DC, HQ, and YH acquired the data. WL, Lin C, and Lai C implemented quality control of data and algorithms. WL, QC, RL, and Lai C had full access to and verified all data in the study. QC developed, trained, and applied machine-learning models. Lai C developed a data integration system. YT, LL, RL, DC, and HQ performed the consecutive case studies. QC and RL prepared the first draft of the manuscript. WL, LL, and YT revised the manuscript. All authors contributed to the preparation of the manuscript.
Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Declaration of Interest
We report no competing interests.
Data Sharing
The MIMIC-III database can be accessed at https://physionet.org/content/mimiciii/1.4/ after becoming a member of Physionet (https://physionet.org/). The RJ database used in this study is not publicly available. The code used to develop the model in this manuscript is available from the corresponding author upon reasonable request.
Ethics Committee Approval
Our study was approved by the Ruijin Hospital Ethics Committee [ethics committee reference number: (2020) Linlunshen No. (140)].
Research in context
Evidence before this study
We searched PubMed from inception to September 30, 2021, using the keywords “machine learning” or “deep learning” or “artificial intelligence” and “sepsis,” and using the keywords “critical care” or “ICU” or “sepsis” and “data integration system” or “data acquisition system” or “integrated,” without language restrictions. Previous studies built various prediction models based on different sepsis definitions and datasets or built data acquisition systems to integrate some types of data. However, most of them did not combine the two to obtain real-time predictions or describe the whole process in detail.
Added value of this study
We developed an entire sepsis prediction system in the ICU, including a set of procedures that can be applied in practice. The machine learning models achieved high AUC scores in the two databases, and we interpreted the predictions. In addition, we examined our system through consecutive case studies. Moreover, this workflow is also applicable to other disease warnings, not only for sepsis.
Implications of all the available evidence
Our system can be applied to display patients’ conditions in real-time, identify patients more likely to suffer from sepsis, help medical practitioners focus on them, and help with future research.
Acknowledgments
The authors acknowledge Shanghai Electric Group Co., Ltd. Central Academe for their support during the development of the data integration system.