medRxiv

SEPRES: Sepsis prediction via a clinical data integration system and real-world studies in the intensive care unit

Qiyu Chen, Ranran Li, ChihChe Lin, Chiming Lai, Dechang Chen, Hongping Qu, Yaling Huang, Wenlian Lu, Yaoqing Tang, Lei Li
doi: https://doi.org/10.1101/2021.05.13.21256281
Qiyu Chen
1Department of Applied Mathematics, Fudan University, Shanghai, China
Ranran Li
2Department of Critical Care Medicine, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
ChihChe Lin
3Shanghai Electric Group Co., Ltd. Central Academe, Shanghai, China
Chiming Lai
3Shanghai Electric Group Co., Ltd. Central Academe, Shanghai, China
Dechang Chen
2Department of Critical Care Medicine, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
Hongping Qu
2Department of Critical Care Medicine, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
Yaling Huang
3Shanghai Electric Group Co., Ltd. Central Academe, Shanghai, China
Wenlian Lu
1Department of Applied Mathematics, Fudan University, Shanghai, China
  • For correspondence: lileiys1023@yeah.net tangyaoqing@126.com wenlian@fudan.edu.cn
Yaoqing Tang
2Department of Critical Care Medicine, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
  • For correspondence: lileiys1023@yeah.net tangyaoqing@126.com wenlian@fudan.edu.cn
Lei Li
2Department of Critical Care Medicine, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
  • For correspondence: lileiys1023@yeah.net tangyaoqing@126.com wenlian@fudan.edu.cn

Summary

Background Sepsis is a major concern in critical care medicine, and early detection and intervention are key to survival. We aimed to establish an early warning system for sepsis based on a data integration system that can be implemented in the intensive care unit (ICU).

Methods We trained LightGBM and multilayer perceptron models on the open-source Medical Information Mart for Intensive Care database for sepsis prediction. An ensemble sepsis prediction model was then established on the private dataset of Ruijin Hospital using transfer learning and ensemble learning. Shapley Additive Explanations analysis was applied to present the importance of features in the prediction inference. We developed a data-integrating hub to collect and transmit data from different brands of ICU medical devices, and on this basis established a data integration system to receive, integrate, standardize, and store real-time clinical data. The sepsis prediction model was then deployed in the ICU of Ruijin Hospital for a real-world study of sepsis early warning in ICU management. The trial was registered with ClinicalTrials.gov (NCT05088850).

Findings Our best early warning model achieved an area under the receiver operating characteristic curve (AUC) of 0·9833 when detecting sepsis 4 h before onset on the open-source database. Our ensemble model achieved an AUC of 0·9065−0·9436 in retrospective analysis 1−5 h before onset on the private database, and 0·8636−0·8992 in real-time real-world studies using the data integration system in the ICU of Ruijin Hospital. During continuous early warning for patients admitted to the ICU, 22 patients who met the diagnostic criteria for sepsis during hospitalization were predicted as positive cases, and 29 patients without sepsis were predicted as negative cases. Additionally, 17 patients were predicted as false-positive cases; for six patients who developed sepsis during their ICU stay, the predicted probabilities at all time nodes were below the warning threshold of 0·7, so they were false-negative cases.

Interpretation With the help of the data integration system, machine learning models could provide accurate, real-time inference to detect sepsis up to 5 h before onset. We identified features such as age, antibiotics, ventilation, and net balance as important for the sepsis prediction inference. We argue that this system has promising potential to improve ICU management by helping medical practitioners identify patients at risk of sepsis and prepare for timely diagnosis and intervention.

Funding Shanghai Municipal Science and Technology Major Project, the ZHANGJIANG LAB, and the Science and Technology Commission of Shanghai Municipality.

Introduction

Sepsis, an infection-induced syndrome of physiological, pathological, and biochemical abnormalities, is a global healthcare issue that is associated with unacceptably high mortality and long-term morbidity among patients in an intensive care unit (ICU),1,2 and is responsible for a substantial cost burden on health care resources.3 Early detection and timely administration of appropriate antibiotics may be the most important factors to improve the prognosis of patients with sepsis.4 However, nonspecific symptoms of patients with sepsis lead to delayed diagnosis and delayed intervention.5

Machine learning has emerged as a promising tool for early detection of sepsis occurrence through intensive management based on electronic medical records, laboratory data, and biomedical signals.6,7 In 2016, Singer et al. proposed a new definition (Sepsis-3) of sepsis.2 According to this, many recent studies on sepsis prediction defined sepsis by Sequential Organ Failure Assessment (SOFA) and infection instead of Systemic Inflammatory Response Syndrome (SIRS).8-12 Prospective studies have shown that implementation of machine learning-based sepsis prediction algorithms in hospitals can reduce in-hospital mortality and length of stay.13,14 In addition, many machine learning models provide superior model performance at the cost of transparency and interpretability, which has become a barrier to clinical application. Algorithms based on gradients, attention mechanism, and Shapley values are used to interpret the machine learning models.15-17

Most studies on sepsis detection used historical medical data, such as the Medical Information Mart for Intensive Care (MIMIC).18 However, implementing a detection model in the ICU for real-time prediction is complex. The raw data needed for model inference, such as bedside data, laboratory data, demographic data, and doctor’s orders, usually come from different devices. Moreover, the information cannot be exchanged directly because the devices use different data transfer protocols. Efforts have been made to integrate bedside medical devices.19-21 However, these data integration systems cover too few devices and data types to present the complete picture a doctor needs. Moreover, they focused mainly on data collection and presentation and lacked further functionality, such as real-time alerts. Meanwhile, previous studies on the prediction of sepsis were mainly retrospective, and prospective studies used only relatively simple variables, deployment methods, and models.

In this study, we aim to develop (i) a data integration system for the IntelliVue Information Center, ventilators, the Philips ICCA system, the Laboratory Information System (LIS), and the Hospital Information System (HIS); (ii) an ensemble machine learning model for the prediction of sepsis based on Sepsis-3; and (iii) a real-time early warning system for sepsis in the ICU, named the SEpsis PREdiction System (SEPRES). We deployed SEPRES in the ICU of Ruijin Hospital and conducted real-world studies to analyze the performance of this system in the management of ICU patients.

Methods

SEPRES includes a data integration system equipped with a sepsis early warning module. The data integration system can collect, store, process, and display medical data. These functions are carried out by the data integration machine, physical server, and network server. The sepsis early warning module includes a sepsis prediction model and an interpretative tool. The sepsis prediction model is an ensemble of multiple machine learning models and uses transfer learning to predict sepsis. The interpretative tool provides information on how the model works by assigning importance to the input features. Our research was approved by the Ruijin Hospital Ethics Committee (No. 2020 [140]).

Medical device integration hub

We developed a medical device integration hub that can acquire and transmit data from different brands of medical devices. The medical device integration hub consists of customized device connection lines, a hub, and an integrated data receiver. The identification module containing encoding is inserted into each medical device, enabling the hub to identify the type of online device and collect data automatically according to the communication protocol. The integrated data receiver receives and translates the raw data and uploads them to the integration server through the local area network. The medical device integration hub has the following functions:

  • Device online services: Detecting device connections and starting the data reading program corresponding to each device.

  • Decoding: Parsing raw data into structured data for further processing.

  • Storage: Storing parsed data in native memory.

  • Remote settings: Supporting remote system setup and sending system status.

  • Uploading: Uploading received data to the specified database.

Details of the data extraction can be found in Appendix I.
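The hub functions listed above can be pictured with a minimal sketch. Everything concrete here is an assumption made for illustration: the identification codes (`MON`, `VNT`), the frame formats, and the `Reading` and `Hub` names are hypothetical and are not taken from the actual device protocols. Only the 72 h retention window is a figure the paper reports for the deployed hub. Timestamps are assumed to arrive in non-decreasing order.

```python
import json
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List


@dataclass
class Reading:
    device_type: str
    variable: str
    value: float
    timestamp: float  # seconds, assumed non-decreasing


def parse_monitor(raw: bytes, ts: float) -> Reading:
    # Hypothetical monitor frame, e.g. b"HR=82"
    name, value = raw.decode("ascii").split("=")
    return Reading("monitor", name, float(value), ts)


def parse_ventilator(raw: bytes, ts: float) -> Reading:
    # Hypothetical ventilator frame, e.g. b'{"var": "PEEP", "val": 5.0}'
    msg = json.loads(raw)
    return Reading("ventilator", msg["var"], float(msg["val"]), ts)


# Device online service: the identification code selects the reader.
PARSERS: Dict[str, Callable[[bytes, float], Reading]] = {
    "MON": parse_monitor,
    "VNT": parse_ventilator,
}


class Hub:
    def __init__(self, retention_s: float = 72 * 3600):
        self.retention_s = retention_s
        self.buffer: Deque[Reading] = deque()

    def on_frame(self, id_code: str, raw: bytes, ts: float) -> Reading:
        # Decoding: raw bytes become a structured reading.
        reading = PARSERS[id_code](raw, ts)
        # Storage: keep the reading and drop anything past the retention window.
        self.buffer.append(reading)
        while self.buffer and ts - self.buffer[0].timestamp > self.retention_s:
            self.buffer.popleft()
        return reading

    def snapshot(self) -> List[dict]:
        # Uploading: plain dicts that a database client could send onward.
        return [vars(r) for r in self.buffer]
```

Keeping the parsers in a lookup table means that supporting another device brand only requires registering one more function, which is one plausible way to handle differing communication protocols.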

The framework of a data-integrating system

As shown in Figure 1, the system includes a physical server with the PostgreSQL database to store the sepsis warning data, and the webserver deploys the system’s user access portal. The architecture can be divided into the following parts.

Figure 1 System Deployment Framework.

The web release system of the sepsis prediction system (SEPRES) applies browser/server architecture.

  • Device side: The medical device integration hub transmits the device and HIS data to the data integration system through the local area network.

  • Data management side: Heterogeneous data are integrated into the data integration system. The interface data, service data, and model predictions are stored and managed by the Structured Query Language (SQL) server, while the parts needed for the sepsis early warning module are sent to the PostgreSQL database. The Message Queuing Telemetry Transport (MQTT) server sends real-time data from the data integration system to the browser.

  • Data server side: The web server responds to the browser’s request and calls the sepsis early warning module. Data preprocessing and model inference are then executed, and the predictions are stored in the PostgreSQL database. The data server side includes some related services (real-time calculation of the SOFA score, determination of suspected infection, data statistics, data charts, and historical data query).

  • Application side: The user’s request is passed to the web server in this layer, and the processing results are displayed in the system. JavaScript is used for dynamic HTML page development, and an AJAX interface is used for data interaction with the web server. Spring MVC is used to build full-featured MVC modules for web applications, combined with Node.js to create maintainable templates. Users can access the system anytime and anywhere through a browser on devices such as PCs and mobile terminals.

System deployment

Figure 2 shows the medical device integration hub installed at Ruijin Hospital. The hub was placed at the bedside, receiving data from multiple devices via different interfaces shown at the bottom of the figure, storing the last 72 h of data in native memory, and transmitting data with a time delay of less than 10 s. Interfaces distributed on the two sides of the hub include two universal network interfaces, four USB interfaces for the mouse, keyboard, and U disk, two HDMI and one VGA for extended display, and eight or 16 USB and Ethernet multiplexing interfaces for medical devices. The hub can integrate data from the monitor, ventilator, infusion pump, and dialysis machine. The processed data are then transmitted to the data integration system.

Figure 2 Medical device integration hub

Sepsis prediction model

Our goal was to develop a sepsis prediction model that can run in real-time in hospitals. To avoid insufficient data in the specific hospital for training, we first trained the models in the open-source database MIMIC and then retrained them in private hospital databases using transfer learning techniques to improve the performance. The final sepsis prediction model was obtained by integrating multiple transferred models using ensemble learning techniques.

Data acquisition

Data sources and inclusion criteria

Our study used the MIMIC-III database (version 1·4)18 and the private Ruijin Hospital historical (RJ) database. MIMIC encompasses approximately 40,000 patients admitted to the ICU at Beth Israel Deaconess Medical Center in Boston from 2001 to 2012. Two tasks were established: inference on the MIMIC dataset by models trained on the MIMIC dataset and inference on the RJ dataset by models trained on the MIMIC and RJ datasets. The first task was to facilitate comparison with other articles, and the second was to apply the models clinically in Ruijin Hospital.

Patients who met all the following criteria were included in the case group:

  1. At least 14 years old.

  2. Sepsis onset at least 5 h after admission to the ICU.

  3. Sepsis onset is the first instance since admission to the hospital.

Patients who met all the following criteria were included in the control group:

  1. At least 14 years old.

  2. Patients who stayed in the ICU for at least 5 h and have not had sepsis at this time.

  3. Patients without ICD-9 codes for sepsis (785·52, 995·91, and 995·92).

  4. SOFA score changes of no more than 1 point in an arbitrary continuous 72 h in the ICU stay.

The third criterion was excluded from the RJ database because ICD-9 codes were not recorded.

Sepsis-label definitions

Patients were followed throughout their stay in the ICU until discharge or development of sepsis according to the definition of the Third International Consensus for sepsis (Sepsis-3).2 Specifically, if the timestamps of antibiotics ($t_{\mathrm{abx}}$) and blood cultures ($t_{\mathrm{culture}}$) satisfy $t_{\mathrm{abx}} - 24\,\mathrm{h} \le t_{\mathrm{culture}} \le t_{\mathrm{abx}} + 72\,\mathrm{h}$, the earlier of $t_{\mathrm{abx}}$ and $t_{\mathrm{culture}}$ is defined as the timestamp of suspected infection ($t_{\mathrm{sus}}$). The SOFA score was evaluated hourly within the time window $[t_{\mathrm{sus}} - 48\,\mathrm{h},\ t_{\mathrm{sus}} + 24\,\mathrm{h}]$. The first hour in which the SOFA score is two or more points higher than the lowest prior score is defined as the onset of sepsis ($t_{\mathrm{onset}}$).
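The labeling rule above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: times are expressed as hours since ICU admission, SOFA scores are supplied as a dictionary keyed by whole hours, and the "lowest prior score" is taken to mean the lowest score seen earlier within the evaluation window (the text does not say whether scores before the window count). The function names are hypothetical.

```python
from typing import Dict, Optional


def suspected_infection_time(t_abx: float, t_culture: float) -> Optional[float]:
    """Return t_sus (hours), or None if the two events are too far apart."""
    if t_abx - 24 <= t_culture <= t_abx + 72:
        return min(t_abx, t_culture)
    return None


def sepsis_onset(sofa_by_hour: Dict[int, int], t_sus: float) -> Optional[int]:
    """Return the first hour in [t_sus - 48, t_sus + 24] whose SOFA score is
    at least 2 points above the lowest earlier score in that window."""
    lowest = None
    for hour in sorted(sofa_by_hour):
        if hour < t_sus - 48 or hour > t_sus + 24:
            continue
        score = sofa_by_hour[hour]
        if lowest is not None and score - lowest >= 2:
            return hour
        # Assumption: the baseline is tracked only inside the window.
        lowest = score if lowest is None else min(lowest, score)
    return None
```

For example, with antibiotics at hour 10 and a blood culture at hour 5, the suspected infection time is hour 5; a SOFA trajectory of 2, 2, 3, 4 over hours 0 to 3 then yields an onset at hour 3.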

Feature extraction

We extracted 78 and 63 patient variables from the MIMIC and RJ databases, respectively. After data cleaning, we converted these variables into features (the maximum, average, median, and minimum) at hourly intervals, and missing data were padded with the nearest preceding value or a preset default value. After filtering, we obtained the MIMIC dataset with 1057 positive and 5834 negative episodes and the RJ dataset with 115 positive and 239 negative episodes. We used a 5-h time window from the episodes to predict sepsis. These two datasets were divided into training, validation, and test sets. See Appendix II for further details.
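A rough sketch of the hourly feature construction described above follows. It makes two assumptions that the text leaves open: an hour with no measurement inherits the aggregates of the nearest preceding hour, and the preset default is used only when no earlier hour has data. The function names and the sample format (time in hours since admission, value) are illustrative.

```python
import math
from statistics import mean, median
from typing import Dict, List, Tuple

STATS = ("max", "mean", "median", "min")


def hourly_features(
    samples: List[Tuple[float, float]], n_hours: int, default: float
) -> List[Dict[str, float]]:
    """Aggregate (time_h, value) samples into one feature row per hour."""
    buckets: List[List[float]] = [[] for _ in range(n_hours)]
    for t, v in samples:
        h = int(math.floor(t))
        if 0 <= h < n_hours:
            buckets[h].append(v)

    rows: List[Dict[str, float]] = []
    last = None
    for values in buckets:
        if values:
            last = {
                "max": max(values),
                "mean": mean(values),
                "median": median(values),
                "min": min(values),
            }
            rows.append(dict(last))
        elif last is not None:
            rows.append(dict(last))  # pad with the nearest preceding hour
        else:
            rows.append({k: default for k in STATS})  # nothing earlier: default
    return rows


def window_before(rows: List[Dict[str, float]], end_hour: int, width: int = 5):
    """Return the `width` hourly rows that end just before `end_hour`."""
    return rows[max(0, end_hour - width):end_hour]
```

With heart-rate samples of 80 and 90 in the first hour and 100 in the third, the first row has a mean of 85, the second row repeats the first, and the fourth row repeats the third.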

Prediction model

Machine learning model

Multiple models were tested on the MIMIC dataset, including support vector machine (SVM), multilayer perceptron (MLP), gradient boosting machine (GBM), and long short-term memory (LSTM). For GBM, we used XGBoost22 and LightGBM23 as implementations. A detailed introduction to these models can be found in Appendix V.

Training method

Some redundant features were removed to accelerate SVM and MLP training in the tasks. Data were standardized (i.e., each feature’s value range was linearly scaled between 0 and 1) before training to eliminate magnitude differences between features. The hyperparameters and structure of each model were tuned according to performance on the validation set. See Appendix VI for further details.
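The scaling step described above, in which each feature is mapped linearly onto the range 0 to 1, can be sketched as follows. The class name `UnitRangeScaler` is hypothetical. The sketch fits the per-feature minimum and maximum on the training set only and reuses them for later data, which is common practice. Values outside the training range then fall outside [0, 1]; whether to clip them is a design choice the text does not specify.

```python
from typing import List, Sequence


class UnitRangeScaler:
    """Linearly rescale each feature using the training-set min and max."""

    def fit(self, rows: Sequence[Sequence[float]]) -> "UnitRangeScaler":
        columns = list(zip(*rows))
        self.lo = [min(c) for c in columns]
        self.hi = [max(c) for c in columns]
        return self

    def transform(self, rows: Sequence[Sequence[float]]) -> List[List[float]]:
        scaled = []
        for row in rows:
            scaled.append([
                # A constant feature carries no information; map it to 0.
                (x - lo) / (hi - lo) if hi > lo else 0.0
                for x, lo, hi in zip(row, self.lo, self.hi)
            ])
        return scaled
```

For instance, after fitting on heart rates of 60 and 100 and temperatures of 36.0 and 40.0, a new row of (80, 38.0) is transformed to (0.5, 0.5).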

Transfer learning and ensemble learning

To ensure sufficient patient cases, we first trained LightGBM and MLP models on MIMIC and later transferred them to the RJ dataset in Task 2. These models were selected based on their performance in Task 1 and served as representatives of traditional machine learning models and neural network models, respectively. They had to be retrained rather than reused from Task 1 because some variables were not available in the RJ database. Because of the population differences between the MIMIC and RJ databases, the data were standardized separately, in the same way as in Task 1. Additionally, similar features (the maximum, average, median, and minimum values of variables with low recording frequency at the same hour) were reduced to one feature to lower the dimension and help the transfer. The parameters of each model were shared as initial parameters and tuned again during training on the RJ database. Finally, we integrated the MLP and LightGBM models by averaging their outputs to make the results more robust and accurate. This result was used as the final output of the sepsis prediction model.
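The two ideas in this paragraph, reusing pretrained parameters as the starting point for further training and averaging model outputs, can be illustrated with a deliberately small stand-in. The sketch below uses a one-feature logistic regression trained by gradient descent instead of LightGBM or an MLP, and the toy "source" and "target" data are invented; it shows the mechanism only and does not reproduce the authors' models.

```python
import math
from typing import List, Optional, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    w: Optional[Sequence[float]] = None,
    b: float = 0.0,
    lr: float = 0.5,
    epochs: int = 200,
) -> Tuple[List[float], float]:
    """Batch gradient descent on the logistic loss.

    Passing `w` and `b` warm-starts training from existing parameters,
    which is the transfer step in this toy setting.
    """
    w = list(w) if w is not None else [0.0] * len(X[0])
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def ensemble_average(probabilities: Sequence[float]) -> float:
    """Average the probabilities produced by several models for one sample."""
    return sum(probabilities) / len(probabilities)


# Pretrain on a (toy) source dataset, then fine-tune on a (toy) target dataset.
w_src, b_src = train([[0.0], [1.0]], [0, 1])
w_tgt, b_tgt = train([[0.2], [0.8]], [0, 1], w=w_src, b=b_src, epochs=50)
```

Averaging is the simplest way to combine models: it needs no extra training data, and an error made by one model is damped when the other disagrees.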

Role of the funding source

The funders of the study had no role in the study design, data collection, data analysis, data interpretation, or writing of the report.

Results

Sepsis prediction model

We evaluated our models based on accuracy, the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity of the test set.
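For reference, the four metrics can be computed as in the sketch below. The data in the usage note are invented for illustration and are unrelated to the results in this paper. The AUC is computed through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
from typing import Dict, Sequence


def threshold_metrics(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a fixed decision threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))
```

With labels (1, 1, 0, 0) and scores (0.9, 0.4, 0.5, 0.1), three of the four positive-negative pairs are ranked correctly, so the AUC is 0.75, whereas at a threshold of 0.5 the sensitivity and specificity are both 0.5. The AUC is threshold-free; the other three metrics depend on the chosen threshold.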

Performance on the MIMIC-III dataset

In Task 1, the AUCs of XGBoost and LightGBM were the highest among these sepsis prediction models, followed by MLP and LSTM, and SVM performed the worst. Appendix VII shows the full performance of the five models. Although the GBM structure is relatively simple, it outperforms artificial neural network models. We compared our models with other models that were trained on the same open-source database, MIMIC, using the Sepsis-3 criteria and that reported prediction results within 5 h before the onset of sepsis, including InSight,8 AISE,9 MGP-TCN, DTW-KNN,10 MLA,11 DSPA,12 and MGP-AttTCN.16 Table 1 shows that our models generally outperform the others. However, it should be highlighted that although these models all use the MIMIC-III database, the training and test sets still differ because of differences in the details of case extraction and sepsis criteria; therefore, the comparison is not as standardized as most machine learning comparisons based on benchmark data.

Table 1 The results of different models on the MIMIC-III dataset

Performance on the RJ dataset

In Task 2, after transfer learning and ensemble learning, the final performance of the sepsis prediction model is shown in Table 2. The overall performance was similar to that of LightGBM or MLP transferred models, and the average value of AUC was slightly higher than that of LightGBM or MLP. Detailed results of transfer learning can be found in Appendix VIII.

Table 2 The results of ensemble model in Task 2

Feature interpretability

We used Shapley additive explanation (SHAP)24 analysis to explore the importance of the features. For LightGBM models in SEPRES, antibiotics, respiratory rate, total positive end-expiratory pressure level, fibrinogen level, temperature, net balance, and age were important in most models. For MLP models, age, respiratory rate, ventilation, heart rate, antibiotics, and temperature were important in most models. Some of these features (antibiotics, respiratory rate, temperature, ventilation, and heart rate) were related to the definition of Sepsis-3 or SIRS, while there is also literature arguing for an association between some of these features (respiratory rate,25 fibrinogen,26 net balance,27 and age28) and the severity or mortality of sepsis. Detailed results are provided in Appendix IX.
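To make the idea behind SHAP concrete, the sketch below computes exact Shapley values by brute force for a tiny model. It is not the SHAP library, which relies on much more efficient model-specific algorithms. It also assumes that an "absent" feature is replaced by a single baseline value, which is one common convention among several. The function name and the toy model are hypothetical.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(
    predict: Callable[[List[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley attribution of predict(x) - predict(baseline).

    The cost grows as 2**n in the number of features, so this is only
    practical for a handful of features.
    """
    n = len(x)

    def value(present: frozenset) -> float:
        # Features outside `present` are replaced by their baseline value.
        z = [x[i] if i in present else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi
```

For a linear model such as `2*z[0] + 3*z[1] - z[2]` evaluated at (1, 1, 1) against a zero baseline, the attributions equal the coefficients (2, 3, -1), and they sum to the difference between the prediction and the baseline prediction. That additivity is what allows per-feature contributions to be displayed next to each prediction.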

SEPRES

Model inference in SEPRES

The detailed steps of model inference are as follows:

  1. Obtain real-time features of patients using SQL query statements.

  2. The features were standardized by calling the scaler obtained in the training set.

  3. Call the trained model to get the prediction results.

  4. Call the interpretive tool to get the importance of the features based on the prediction results.

  5. Output and store prediction results and interpretations in a standard format.
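The five steps above can be sketched end to end as follows. The sketch uses Python's built-in `sqlite3` as a stand-in for the PostgreSQL database mentioned earlier, and the table names, column names, and the `scale`, `predict`, and `explain` callables are all hypothetical placeholders for the fitted scaler, trained model, and interpretive tool.

```python
import json
import sqlite3
from typing import Callable, Dict, List


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical tables; the real SEPRES schema is not described here.
    conn.execute(
        "CREATE TABLE features (patient_id INTEGER, name TEXT, value REAL)"
    )
    conn.execute(
        "CREATE TABLE predictions ("
        "patient_id INTEGER, probability REAL, importance TEXT)"
    )


def run_inference(
    conn: sqlite3.Connection,
    patient_id: int,
    scale: Callable[[List[float]], List[float]],
    predict: Callable[[List[float]], float],
    explain: Callable[[List[float]], Dict[str, float]],
) -> float:
    # Step 1: fetch the patient's current features with an SQL query.
    rows = conn.execute(
        "SELECT value FROM features WHERE patient_id = ? ORDER BY name",
        (patient_id,),
    ).fetchall()
    raw = [row[0] for row in rows]
    # Step 2: apply the scaler fitted on the training set.
    x = scale(raw)
    # Step 3: call the trained model.
    probability = predict(x)
    # Step 4: call the interpretive tool for feature importance.
    importance = explain(x)
    # Step 5: store the prediction and its interpretation in a standard format.
    conn.execute(
        "INSERT INTO predictions (patient_id, probability, importance) "
        "VALUES (?, ?, ?)",
        (patient_id, probability, json.dumps(importance)),
    )
    conn.commit()
    return probability
```

Ordering the query by feature name fixes the column order, so the scaler and model always receive features in the same positions they saw during training.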

System operation

SEPRES provides predictions and explanations for every patient in the ICU every hour, including the risk of sepsis onset in the next 5 h, the influence of features on the predictions calculated by SHAP, and SOFA predictions. It has been operating at Ruijin Hospital for several months, providing hourly early warning services for over 100 patients in the ICU.

The PC terminal for the user interface is located in the ICU common room. Figure 3 presents an example of the display board for all patients in the ICU, including predictions of sepsis onset and SOFA changes, where high and low risks are shown by red and blue bars, respectively. Figure 4 shows the data details for a specific patient, which can be used to review the patient’s current and past status.

Figure 3 The user interface example.

Each subplot shows the change in SOFA score and sepsis-onset prediction for a patient. It is subtitled with patient information, and the upper right corner shows the maximum and minimum Sequential Organ Failure Assessment scores and sepsis-onset predictions within 24 h. The original figure has been translated and the patient identifying information has been removed.

Figure 4 Historical data review for individual patients.

The title is patient information, below the title are filter criteria, on the left side are optional data types, and the rest is the table and chart for the selected data. The original figure has been translated and the patient identifying information has been removed.

Real-time performance of the sepsis prediction model in the ICU of the Ruijin Hospital

We extracted a total of 67 ICU stays from February 2021 to June 2021 from the system. Each stay was labeled by the change in SOFA score and the doctor’s examination for infection (based on antibiotics or blood culture), and 40 of these stays were labeled as having at least one sepsis onset at a threshold of 0·5. Data from the control group and near onset of sepsis in the case group were included in the analysis. The statistical results of the predictions are presented in Table 3.

Table 3 The results of real-time data

Case studies

We discussed several of these cases, including true-positive, true-negative, false-positive, and false-negative cases. Figure 5 illustrates the model predictions for positive cases near the onset of sepsis or negative cases over a random period (see Appendix X for more details).

Figure 5 Some illustrative examples of the predictions.

Each subplot shows the confidence index (CI) of multiple models (Y-axis) at the target time (X-axis). (i) The patient’s condition deteriorated in the early morning, with multiple organ dysfunction, and the patient was diagnosed with sepsis at 12:00 AM. Our model prediction exceeded the warning threshold of 0·7 for the prediction at 9:00 AM. (ii) Despite the high Sequential Organ Failure Assessment (SOFA) score (7·0), there was no evidence of ΔSOFA ≥2 within 72 h. Consistently, the predictions were all lower than the threshold. (iii) Although the patient’s SOFA score was stable at 6·0, our model made incorrect predictions. (iv) The SOFA score increased from 6·0 to 9·0 at 06:00 PM. In combination with evidence of infection, the patient was diagnosed with sepsis. However, the prediction was below the warning threshold.

Our model can detect sepsis early in most cases, although there are a small number of false-negative and false-positive cases. We propose that a possible reason for the false negatives is that our models tend to give lower predictions when the collected data are limited. In the true-positive cases, the early prediction of sepsis occurrence by our model guided medical practitioners to pay closer attention to the patient, leading to early diagnosis of sepsis and more efficient management of ICU patients.
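The case categories discussed here can be expressed as a simple rule over a stay's hourly predictions. The sketch assumes that a stay counts as predicted positive when any hourly probability reaches the warning threshold of 0·7; this matches the stated rule that false-negative stays had all probabilities below the threshold, but the exact rule for positive predictions is an assumption. The function names are hypothetical.

```python
from typing import Optional, Sequence

WARNING_THRESHOLD = 0.7


def first_alert_hour(
    hourly_probs: Sequence[float], threshold: float = WARNING_THRESHOLD
) -> Optional[int]:
    """Index of the first hourly prediction at or above the threshold."""
    for hour, p in enumerate(hourly_probs):
        if p >= threshold:
            return hour
    return None


def classify_stay(
    hourly_probs: Sequence[float],
    had_sepsis: bool,
    threshold: float = WARNING_THRESHOLD,
) -> str:
    """Label one ICU stay by comparing alerts with the clinical outcome."""
    alerted = first_alert_hour(hourly_probs, threshold) is not None
    if had_sepsis:
        return "true-positive" if alerted else "false-negative"
    return "false-positive" if alerted else "true-negative"
```

For example, a septic stay with hourly probabilities of 0.2, 0.75, and 0.9 is a true positive with its first alert at index 1, whereas a septic stay whose probabilities never exceed 0.69 is a false negative.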

Discussion

Machine learning methods have been considered promising for early warning of sepsis in the ICU.6-17 Early diagnosis and timely management of septic patients can effectively improve prognosis.29 However, sepsis may not be diagnosed in time in clinical practice because of shift changes among doctors and day-night rotations of the medical staff. Therefore, an accurate and efficient early prediction system for sepsis at the bedside is important.

In this study, we established an ICU bedside sepsis early warning system, SEPRES, to conduct real-time sepsis prediction for patients in the ICU through a data integration system. In contrast to most machine-learning sepsis prediction studies, which are confined to open-source databases, SEPRES has been deployed in the ICU of Ruijin Hospital, where it performs real-time inference and analysis by integrating IntelliVue Information Center, ventilator, Philips ICCA system, LIS, and HIS data. Additionally, the system can display the patient’s historical data in the user interface to help doctors intuitively follow changes in the patient’s condition. Although SEPRES cannot provide a definitive basis for a therapeutic regimen, the predicted probability of sepsis occurrence allows us to pay more attention to specific patients. Furthermore, weight analysis of medical factors can provide insights into the use of therapeutic regimens.

To limit the effect of insufficient data and of inhomogeneous data distributions across medical centers on the training and inference of machine learning models, we used transfer learning to improve performance at the specific medical center. In particular, MIMIC-III mainly enrolled white patients (40996 of 58976 hospital admissions), whereas the private dataset of Ruijin Hospital is mainly composed of Chinese patients, and the two differ significantly in certain features of the sepsis prediction model (see Appendix XI). Transfer learning improved the prediction AUC of the LightGBM models from 0·8613−0·8913 to 0·8964−0·9348 on the historical data of Ruijin Hospital.

The interpretive tool may help medical practitioners identify risk factors. In the SHAP analysis of the models loaded in SEPRES, we paid special attention to the insights offered by the importance of net balance. As shown in Supplementary Figure 2, a positive net balance indicates a higher risk of sepsis. Because net balance is nursing data that are difficult to collect, 4079 of 6891 episodes in the MIMIC dataset have no balance data for colloid and crystalloid solutions, whereas only 25 of 329 episodes in the RJ dataset had no net balance data. Consequently, most machine learning models based on MIMIC datasets have neither included net balance as a feature nor analyzed it as an important factor for sepsis prediction inference. Our SHAP analysis showed that a negative net balance tended to decrease the predicted probability of sepsis. Indeed, positive cumulative fluid balance has been reported to be an independent predictor of ICU mortality.27 Furthermore, Lin et al. have shown that patients with an early positive fluid balance have an increased risk of developing venous thromboembolism.30 The weight of net balance in our SHAP analysis further emphasizes the importance of careful fluid management in critically ill patients. Therefore, we argue that including net balance in the prediction model may improve not only model performance but also clinical management in the ICU.

Our model has certain limitations. First, we enrolled as negative controls only patients who were non-septic during their entire ICU stay. Such a control group may be too clean to support a model that predicts the onset period of sepsis, which may cause false-positive cases. Second, as we observed in consecutive case studies, patients diagnosed with sepsis shortly after being transferred to the ICU were difficult for the model to predict. That is, a short period of data recording may cause false-negative cases. Finally, our model incorporates variables such as antibiotics and mechanical ventilation, so its predictions are influenced by the subjective behavior of the doctor.

These limitations will be addressed in future work through diverse methods, including fine-grained labeling, inclusion of data collection from the ICU, and data augmentation.

Moreover, this workflow applies to disease warnings other than sepsis in the ICU, such as disseminated intravascular coagulation and acute kidney injury, with the help of the data integration system to collect the necessary features and data for model construction.

Contributors

WL, YT, LL, QC, RL, and Lin C conceived and designed the study. YT, LL, QC, RL, DC, HQ, and YH acquired the data. WL, Lin C, and Lai C implemented quality control of data and algorithms. WL, QC, RL, and Lai C had full access to and verified all data in the study. QC developed, trained, and applied machine-learning models. Lai C developed a data integration system. YT, LL, RL, DC, and HQ performed the consecutive case studies. QC and RL prepared the first draft of the manuscript. WL, LL, and YT revised the manuscript. All authors contributed to the preparation of the manuscript.

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Declaration of Interest

We report no competing interests.

Data Sharing

The MIMIC-III database can be accessed at https://physionet.org/content/mimiciii/1.4/ after becoming a member of Physionet (https://physionet.org/). The RJ database used in this study is not publicly available. The code used to develop the model in this manuscript is available from the corresponding author upon reasonable request.

Ethics Committee Approval

Our study was approved by the Ruijin Hospital Ethics Committee [ethics committee reference number: (2020) Linlunshen No. (140)].

Research in context

Evidence before this study

We searched PubMed from inception to September 30, 2021, without language restrictions, using two sets of keywords: “machine learning” or “deep learning” or “artificial intelligence” and “sepsis”; and “critical care” or “ICU” or “sepsis” and “data integration system” or “data acquisition system” or “integrated.” Previous studies either built prediction models based on different sepsis definitions and datasets or built data acquisition systems to integrate some types of data. However, most did not combine the two to produce real-time predictions or describe the whole process in detail.

Added value of this study

We developed a complete sepsis prediction system for the ICU, including a set of procedures that can be applied in practice. The machine learning models achieved high AUC scores in the two databases, and we interpreted the predictions. In addition, we examined our system through consecutive case studies. Moreover, this workflow is applicable to warnings for diseases other than sepsis.

Implications of all the available evidence

Our system can be applied to display patients’ conditions in real time, identify patients more likely to suffer from sepsis, help medical practitioners focus on them, and support future research.

Acknowledgments

The authors acknowledge Shanghai Electric Group Co., Ltd. Central Academe for their support during the development of the data integration system.

Posted November 25, 2021.