PT - JOURNAL ARTICLE
AU - Sarah Kefayati
AU - Fred Roberts
AU - Sayali Pethe
AU - Xuan Liu
AU - Hu Huang
AU - Vishrawas Gopalakrishnan
AU - Piyush Madan
AU - Jianying Hu
AU - Prithwish Chakraborty
AU - Raman Srinivasan
AU - Ajay Deshpande
AU - Gretchen Jackson
TI - On Machine Learning-Based Short-Term Adjustment of Epidemiological Projections of COVID-19 in US
AID - 10.1101/2020.09.11.20180521
DP - 2020 Jan 01
TA - medRxiv
PG - 2020.09.11.20180521
4099 - http://medrxiv.org/content/early/2020/09/13/2020.09.11.20180521.short
4100 - http://medrxiv.org/content/early/2020/09/13/2020.09.11.20180521.full
AB - Epidemiological models have provided valuable information on the outlook of the COVID-19 pandemic and the relative impact of different mitigation scenarios. However, more accurate near-term forecasts are often needed for planning and staffing. We present our early results from a systemic analysis of short-term adjustment of epidemiological modeling of the COVID-19 pandemic in the US during March-April 2020. Our analysis examines the importance of various types of features for short-term adjustment of the predictions. In addition, we explore the potential of data augmentation to address the data limitations of an emerging pandemic. Following published literature, we employ data augmentation via clustering of regions and evaluate a number of clustering strategies to identify early patterns from the data. In our early analysis, we used CovidActNow as the underlying epidemiological model and found that the most impactful features for the one-day prediction horizon are population density, workers in commuting flow, number of deaths on the day prior to the prediction date, and the autoregressive features of new COVID-19 cases from the three dates preceding the prediction date.
Interestingly, we also found that counties clustered with New York County yielded the best-performing model, with a maximum of R2 = 0.90 and a minimum of R2 = 0.85 for the state-based and COVID-based clustering strategies, respectively. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: No external funding was received. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The IRB exemption decision for this study was issued by the Western Institutional Review Board as follows: "We determined this study is exempt from IRB review because it does not meet the definition of human subject research as defined in 45 CFR 46.102. Specifically, this project involves analysis of data from publicly available datasets and deidentified private datasets. The research activities do not involve human subjects, because the activities do not involve interaction or intervention with the subjects. Additionally, the investigator will not be able to readily ascertain the identity of any of the human subjects whose data is used in this project." All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. The data used in model building are all publicly available except the comorbidity data, which is IBM proprietary.