PT - JOURNAL ARTICLE
AU - Emily L. Aiken
AU - Sarah F. McGough
AU - Maimuna S. Majumder
AU - Gal Wachtel
AU - Andre T. Nguyen
AU - Cecile Viboud
AU - Mauricio Santillana
TI - Real-time Estimation of Disease Activity in Emerging Outbreaks using Internet Search Information
AID - 10.1101/19010470
DP - 2019 Jan 01
TA - medRxiv
PG - 19010470
4099 - http://medrxiv.org/content/early/2019/11/02/19010470.short
4100 - http://medrxiv.org/content/early/2019/11/02/19010470.full
AB - Understanding the behavior of emerging disease outbreaks in, or ahead of, real time could help healthcare officials better design interventions to mitigate impacts on affected populations. Most healthcare-based disease surveillance systems, however, have significant inherent reporting delays due to data collection, aggregation, and distribution processes. Recent work has shown that machine learning methods leveraging a combination of traditionally collected epidemiological information and novel Internet-based data sources, such as disease-related Internet search activity, can produce meaningful “nowcasts” of disease incidence ahead of healthcare-based estimates, with most successful case studies focusing on endemic and seasonal diseases such as influenza and dengue. Here, we apply similar computational methods to emerging outbreaks in geographic regions where no historical presence of the disease of interest has been observed. By combining the limited historical epidemiological data available with disease-related Internet search activity, we retrospectively estimate disease activity in five recent outbreaks weeks ahead of traditional surveillance methods. We find that the proposed computational methods frequently provide useful real-time incidence estimates that can help fill temporal data gaps resulting from surveillance reporting delays.
However, the proposed methods are limited by issues of sample bias and skew in search query volumes, perhaps as a result of media coverage.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This study was funded in part by the Bill and Melinda Gates Foundation (OPP 1195154).
Author Declarations:
All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript. Yes
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes
The data used in the study is publicly available. https://github.com/emilylaiken/outbreak-nowcasting/