PT - JOURNAL ARTICLE
AU - Davide, Ferrari
AU - Jovana, Milic
AU - Roberto, Tonelli
AU - Francesco, Ghinelli
AU - Marianna, Meschiari
AU - Sara, Volpi
AU - Matteo, Faltoni
AU - Giacomo, Franceschi
AU - Vittorio, Iadisernia
AU - Dina, Yaacoub
AU - Giacomo, Ciusa
AU - Erica, Bacca
AU - Carlotta, Rogati
AU - Marco, Tutone
AU - Giulia, Burastero
AU - Alessandro, Raimondi
AU - Marianna, Menozzi
AU - Erica, Franceschini
AU - Gianluca, Cuomo
AU - Luca, Corradi
AU - Gabriella, Orlando
AU - Antonella, Santoro
AU - Margherita, Di Gaetano
AU - Cinzia, Puzzolante
AU - Federica, Carli
AU - Andrea, Bedini
AU - Riccardo, Fantini
AU - Luca, Tabbì
AU - Ivana, Castaniere
AU - Stefano, Busani
AU - Enrico, Clini
AU - Massimo, Girardis
AU - Mario, Sarti
AU - Andrea, Cossarizza
AU - Cristina, Mussini
AU - Federica, Mandreoli
AU - Paolo, Missier
AU - Giovanni, Guaraldi
TI - Machine learning in predicting respiratory failure in patients with COVID-19 pneumonia - challenges, strengths, and opportunities in a global health emergency
AID - 10.1101/2020.05.30.20107888
DP - 2020 Jan 01
TA - medRxiv
PG - 2020.05.30.20107888
4099 - http://medrxiv.org/content/early/2020/06/02/2020.05.30.20107888.short
4100 - http://medrxiv.org/content/early/2020/06/02/2020.05.30.20107888.full
AB - Aims: The aim of this study was to estimate a 48-hour prediction of moderate to severe respiratory failure, requiring mechanical ventilation, in hospitalized patients with COVID-19 pneumonia. Methods: This was an observational study that comprised consecutive patients with COVID-19 pneumonia admitted to hospital from 21 February to 6 April 2020. The patients’ medical history, demographic, epidemiologic and clinical data were collected in an electronic patient chart. The dataset was used to train predictive models using an established machine learning framework leveraging a hybrid approach where clinical expertise is applied alongside a data-driven analysis.
The study outcome was the onset of moderate to severe respiratory failure, defined as a PaO2/FiO2 ratio <150 mmHg in at least one of two consecutive arterial blood gas analyses in the following 48 hours. Shapley Additive exPlanations values were used to quantify the positive or negative impact of each variable included in each model on the predicted outcome. Results: A total of 198 patients contributed 1068 usable observations, which allowed three predictive models to be built, based respectively on 31 signs-and-symptoms variables, 39 laboratory biomarker variables, and 91 variables as a composition of the two. A fourth “boosted mixed model”, comprising 20 variables selected from model 3, achieved the best predictive performance (AUC = 0.84) without worsening the false negative (FN) rate. Its clinical performance was illustrated in a narrative case report. Conclusion: This study developed a machine learning model with 84% prediction accuracy, which can assist clinicians in the decision-making process and contribute to developing new analytics to improve care at high technology readiness levels.
Competing Interest Statement: The authors have declared no competing interest. Funding Statement: No dedicated funding has yet been obtained for this work. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Comitato Etico Area Vasta Emilia Nord, Via del Pozzo 71, 41124 Modena, Italy. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. Data are currently held in a private repository.
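
The outcome definition in the abstract (PaO2/FiO2 ratio <150 mmHg in at least one of two consecutive arterial blood gas analyses in the following 48 hours) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the tuple layout, the choice to express FiO2 as a fraction between 0.21 and 1.0, and the decision to return False when fewer than two analyses fall inside the window are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

PF_THRESHOLD_MMHG = 150.0      # threshold stated in the abstract
HORIZON = timedelta(hours=48)  # prediction window stated in the abstract

# Hypothetical record layout: (time of analysis, PaO2 in mmHg, FiO2 as a fraction)
ArterialBloodGas = Tuple[datetime, float, float]


def pf_ratio(pao2_mmhg: float, fio2_fraction: float) -> float:
    """Return the PaO2/FiO2 ratio in mmHg (assumes FiO2 is a fraction, not a percent)."""
    if not 0.21 <= fio2_fraction <= 1.0:
        raise ValueError("FiO2 is expected as a fraction between 0.21 and 1.0")
    return pao2_mmhg / fio2_fraction


def respiratory_failure_within_48h(
    observation_time: datetime, abgs: List[ArterialBloodGas]
) -> bool:
    """Label one observation: True if any pair of consecutive analyses in the
    48 hours after observation_time contains a PaO2/FiO2 ratio below 150 mmHg."""
    in_window = sorted(
        (t, pf_ratio(pao2, fio2))
        for t, pao2, fio2 in abgs
        if observation_time < t <= observation_time + HORIZON
    )
    ratios = [ratio for _, ratio in in_window]
    # Assumption: with fewer than two analyses in the window there is no
    # consecutive pair, so the observation is labeled negative.
    return any(
        min(first, second) < PF_THRESHOLD_MMHG
        for first, second in zip(ratios, ratios[1:])
    )
```

Under these assumptions, an observation followed by analyses with ratios of 160 and 140 mmHg inside the window would be labeled positive, while an analysis taken 60 hours later would be ignored.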