RT Journal Article SR Electronic T1 Development of a data-driven COVID-19 prognostication tool to inform triage and step-down care for hospitalised patients in Hong Kong: A population-based cohort study JF medRxiv FD Cold Spring Harbor Laboratory Press SP 2020.07.13.20152348 DO 10.1101/2020.07.13.20152348 A1 Eva L.H. Tsui A1 Carrie S.M. Lui A1 Pauline P.S. Woo A1 Alan T.L. Cheung A1 Peggo K.W. Lam A1 Van T.W. Tang A1 C.F. Yiu A1 C.H. Wan A1 Libby H.Y. Lee YR 2020 UL http://medrxiv.org/content/early/2020/07/20/2020.07.13.20152348.abstract AB Background This is the first study on prognostication in an entire cohort of laboratory-confirmed COVID-19 patients in the city of Hong Kong. A prognostic tool is essential to the contingency response for the next wave of the outbreak. This study aims to develop prognostic models that predict COVID-19 patients’ clinical outcome on day 1 and day 5 of hospital admission. Methods We did a retrospective analysis of a complete cohort of 1,037 laboratory-confirmed COVID-19 patients in Hong Kong as of 30 April 2020, who were admitted to 16 public hospitals, with their data sourced from an integrated electronic health records system. The data covered demographic information, chronic disease history, presenting symptoms, the worst clinical condition status, biomarker readings and Ct values of PCR tests on Day-1 and Day-5 of admission. The study subjects were randomly split into training and testing datasets in an 8:2 ratio. An Extreme Gradient Boosting (XGBoost) model was used to classify the training data into three disease severity groups on Day-1 and Day-5. Results The 1,037 patients had a mean age of 37.8 years (SD ±17.8), and 53.8% of them were male. They were grouped into three disease outcomes: 4.8% critical/serious, 46.8% stable and 48.4% satisfactory. Under the full models, 30 indicators on Day-1 and Day-5 were used to predict the patients’ disease outcome, achieving accuracy rates of 92.3% and 99.5%, respectively. 
To balance practical application against predictive accuracy, the full models were reduced to simpler models with seven common predictors: the worst clinical condition status (4-level), age group, and five biomarkers, namely CRP, LDH, platelet, neutrophil/lymphocyte ratio and albumin/globulin ratio. For the Day-1 model, the accuracy rate, macro-/micro-averaged sensitivity and macro-/micro-averaged specificity were 91.3%, 84.9%/91.3% and 96.0%/95.7%, respectively, compared with 94.2%, 95.9%/94.2% and 97.8%/97.1% for the Day-5 model. Conclusions Both the Day-1 and Day-5 models can accurately predict disease severity. Relevant clinical management could be planned according to the patients’ predicted outcome. The model has been transformed into a simple online calculator to provide a convenient clinical reference tool at the point of care, with the aim of informing clinical decisions on triage and step-down care. Competing Interest Statement The authors have declared no competing interest. Funding Statement This study received no external funding. Author Declarations I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Research Ethics Committee (Kowloon Central / Kowloon East). All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. 
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. The de-identified datasets generated and analysed during the current study are not publicly available, to protect patient privacy, as their disclosure at a granular level may entail the risk of subject re-identification. Aggregate data are available from the corresponding author on reasonable request. The calculator tool, which is based on the results of this study's prognostic models, will be available for online public access after the study is published. Abbreviations: A, albumin; A/G ratio, albumin-globulin ratio; AI, Artificial Intelligence; AII, airborne infection isolation; ALP, alkaline phosphatase; ALT, alanine aminotransferase; AST, aspartate aminotransferase; CMS, Clinical Management System; COVID-19, Coronavirus Disease 2019; CRP, C-reactive protein; ECMO, extracorporeal membrane oxygenation; eNID, Electronic Notification of Infectious Disease; G, globulin; HA, Hospital Authority; HK, Hong Kong; ICU, intensive care unit; ILI, influenza-like illness; LDH, lactate dehydrogenase; MPV, mean platelet volume; N/L ratio, neutrophil-lymphocyte ratio; NDORS, Notifiable Diseases and Outbreak Reporting System; PCT, procalcitonin; RT-PCR, reverse-transcription polymerase chain reaction; SARS, Severe Acute Respiratory Syndrome; SD, standard deviation; WBC, white blood cell count; WHO, World Health Organization; XGBoost, Extreme Gradient Boosting.
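
The macro- and micro-averaged sensitivity and specificity quoted in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code, which the record does not include; the function name `averaged_sens_spec` is hypothetical, and the ten labels below are invented toy data rather than study data. Only the three outcome group names come from the abstract.

```python
from collections import Counter

# The three outcome groups named in the abstract.
CLASSES = ["critical/serious", "stable", "satisfactory"]


def averaged_sens_spec(y_true, y_pred, classes=CLASSES):
    """Macro- and micro-averaged sensitivity and specificity (one-vs-rest).

    Hypothetical helper for illustration. Assumes every class occurs at
    least once in y_true, otherwise a per-class ratio divides by zero.
    """
    n = len(y_true)
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    sens, spec = [], []
    tp_sum = fn_sum = fp_sum = tn_sum = 0
    for c in classes:
        tp = pairs[(c, c)]
        fn = sum(v for (t, p), v in pairs.items() if t == c and p != c)
        fp = sum(v for (t, p), v in pairs.items() if t != c and p == c)
        tn = n - tp - fn - fp
        sens.append(tp / (tp + fn))  # per-class sensitivity (recall)
        spec.append(tn / (tn + fp))  # per-class specificity
        tp_sum, fn_sum = tp_sum + tp, fn_sum + fn
        fp_sum, tn_sum = fp_sum + fp, tn_sum + tn
    return {
        # Macro: unweighted mean of the per-class values.
        "macro_sensitivity": sum(sens) / len(classes),
        "macro_specificity": sum(spec) / len(classes),
        # Micro: pool the counts over classes, then take one ratio.
        "micro_sensitivity": tp_sum / (tp_sum + fn_sum),
        "micro_specificity": tn_sum / (tn_sum + fp_sum),
        "accuracy": tp_sum / n,
    }


# Invented toy labels (NOT study data): 2 critical/serious, 4 stable,
# 4 satisfactory, with two misclassifications.
y_true = ["critical/serious"] * 2 + ["stable"] * 4 + ["satisfactory"] * 4
y_pred = (["critical/serious", "stable"]
          + ["stable"] * 3 + ["satisfactory"]
          + ["satisfactory"] * 4)

metrics = averaged_sens_spec(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a single-label multi-class setting such as this one, pooling the counts makes the micro-averaged sensitivity equal to the overall accuracy, which agrees with the abstract's reduced-model figures (91.3% on Day-1 and 94.2% on Day-5 for both quantities). The macro average weights each outcome group equally, so it is more sensitive to performance on the small critical/serious group.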