TY - JOUR
T1 - Development and validation of a machine learning model for predicting illness trajectory and hospital resource utilization of COVID-19 hospitalized patients – a nationwide study
JF - medRxiv
DO - 10.1101/2020.09.04.20185645
SP - 2020.09.04.20185645
AU - Michael Roimi
AU - Rom Gutman
AU - Jonathan Somer
AU - Asaf Ben Arie
AU - Ido Calman
AU - Yaron Bar-Lavie
AU - Udi Gelbshtein
AU - Sigal Liverant-Taub
AU - Arnona Ziv
AU - Danny Eytan
AU - Malka Gorfine
AU - Uri Shalit
Y1 - 2020/01/01
UR - http://medrxiv.org/content/early/2020/09/22/2020.09.04.20185645.abstract
N2 - Background: The spread of COVID-19 has led to a severe strain on hospital capacity in many countries. There is a need for a model to help planners assess expected COVID-19 hospital resource utilization. Methods: Retrospective nationwide cohort study following the day-by-day clinical status of all hospitalized COVID-19 patients in Israel from March 1st to May 2nd, 2020. Patient clinical course was modelled with a machine learning approach based on a set of multistate Cox regression-based models with adjustments for right censoring, recurrent events, competing events, left truncation, and time-dependent covariates. The model predicts the patient’s entire disease course in terms of clinical states, from which we derive the patient’s hospital length-of-stay, length-of-stay in critical state, the risk of in-hospital mortality, and total and critical care hospital-bed utilization. Accuracy was assessed over eight cross-validation cohorts of size 330, using the per-day mean absolute error (MAE) of predicted hospital utilization averaged over 64 days, and the area under the receiver operating characteristic curve (AUROC) for individual risk of critical illness and in-hospital mortality, assessed on the first day of hospitalization. We present predicted hospital utilization under hypothetical incoming patient scenarios. Findings: During the study period, 2,703 confirmed COVID-19 patients were hospitalized in Israel.
The per-day MAEs for total and critical-care hospital-bed utilization were 4·72 ± 1·07 and 1·68 ± 0·40, respectively; the AUROCs for prediction of the probabilities of critical illness and in-hospital mortality were 0·88 ± 0·04 and 0·96 ± 0·04, respectively. We further present the impact of several scenarios of patient influx on healthcare system utilization, and provide an R software package for predicting hospital-bed utilization. Interpretation: We developed a model that, given basic, easily obtained data as input, accurately predicts total and critical care hospital utilization. The model enables evaluating the impact of various patient influx scenarios on hospital utilization and planning hospital resource allocation ahead of time. Funding: The work was funded by the Israeli Ministry of Health. M.G. received support from the U.S.-Israel Binational Science Foundation (BSF, 2016126). Research in context. Evidence before this study: COVID-19 outbreaks are known to lead to severe case loads in hospital systems, stretching resources, partly because of the long hospitalizations needed for some patients. There is a crucial need for tools that help planners assess future hospitalization load, taking into account the specific characteristics and heterogeneity of currently hospitalized COVID-19 patients, as well as the characteristics of incoming patients. We searched PubMed for articles published up to September 9, 2020, containing the words “COVID19” and combinations of “hospital”, “utilization”, “resource”, “capacity” and “predict”. We found 145 studies; several of them included models that predict the future trend of hospitalizations using compartment models (e.g. SIR models) or exponential or logistic models. We discuss two of the more prominent ones, which explicitly model the passage of patients through the ICU.
These models (i) do not take into account individual patient characteristics; (ii) do not consider length-of-stay heterogeneity, despite the fact that bed utilization is in part determined by a long tail of patients requiring significantly longer stays than others; and (iii) do not correct for competing-risks bias. We further searched for studies containing the words “COVID19” and “multistate”, and “COVID19” and “length” and “stay”. Out of 317 papers, we found two that use multistate models, focusing only on patients undergoing ECMO treatment. Added value of this study: We present the first model predicting hospital load based on the individual characteristics of hospitalized patients: age, sex, clinical state, and time already spent in-hospital. We combine this with scenarios for incoming patients, allowing for variations by age, sex and clinical state. The model’s precise predictions are based on a large sample of complete, day-by-day disease trajectories of patients, with full coverage of the entire COVID-19 hospitalized population in Israel up to early May 2020 (n = 2,703). We provide the model, as well as software for fitting such a model to local data, and an anonymized version of the dataset used to create the model. Implications of all the available evidence: Accurate predictions for hospital utilization can be made based on easy-to-obtain patient data: age, sex, and patient clinical state (moderate, severe or critical). The model allows hospital- and regional-level planners to allocate resources in a timely manner, preparing for different patient influx scenarios. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: The work was funded by the Israeli Ministry of Health, as well as by The Gertner Institute for Epidemiology and Health Policy Research. M.G.
received support from the U.S.-Israel Binational Science Foundation (BSF, 2016126). Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: An exemption from institutional review board approval was determined by the Israeli Ministry of Health as part of an active epidemiological investigation, based on use of anonymous data only and no medical intervention. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. An anonymized version of the data used for this study will be made available in the near future.
ER -