Abstract
Understanding the past, current, and future dynamics of dengue epidemics is challenging yet increasingly important. To date, many techniques across statistics, mathematics, and machine learning have provided us with quantitative tools for studying dengue epidemics. Here, using data from provinces in northern Peru from 2010 to 2021, we present a new interdisciplinary pipeline that draws on new and existing techniques to provide a comprehensive understanding and robust prediction of dengue epidemic dynamics.
Wavelet analyses can unveil spatiotemporal patterns in epidemic dynamics across annual and multi-annual time periods. Here, these included climatic forcing and greater spatial similarity in large outbreak years. Space-varying epidemic drivers included climatic influences and shorter pairwise distances driving greater epidemic similarity in more northerly coastal provinces. Then, using a Bayesian model, we can probabilistically quantify the timing, structure, and intensity of such climatic influences on Dengue Incidence Rates (DIRs), while simultaneously considering other influences. Recognising that a single model is generally sub-optimal for any forecasting task, we demonstrate how to form trained and untrained probabilistic ensembles for forecasting dengue cases in settings reflective of real-world conditions. We introduce a suite of climate-informed and covariate-free deep learning approaches that leverage big data and foundational time series, temporal convolutional networks, and conformal inference. We complement these modern techniques with statistically principled training and assessment of ensemble frameworks, while explicitly considering strong benchmark models, computational costs, public health priorities, and data availability limitations. In doing so, we show how ensemble frameworks consistently outperform individual models across space and time, and produce sharp and accurate forecasts with robust, reliable descriptions of uncertainty. We report interpretable classification metrics for detection of outbreaks to communicate our outputs with the wider public and public health authorities.
Looking forward, whether the objective is to understand and/or to predict epidemic dynamics, our modelling pipeline can be used in any dengue setting to robustly inform the decision-making and planning of public health authorities.
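As an illustration of the untrained probabilistic ensembles mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes each model issues forecasts as a set of predictive quantiles, combines them level by level with a median (one common untrained choice), and scores the result with the pinball (quantile) loss. The function names, the three sets of forecast values, and the observed count are all hypothetical.

```python
from statistics import mean, median


def pinball_loss(y, q_pred, tau):
    """Pinball (quantile) loss for one observation y at quantile level tau."""
    diff = y - q_pred
    return tau * diff if diff >= 0 else (tau - 1) * diff


def untrained_ensemble(model_quantiles, combine=median):
    """Combine per-model quantile forecasts level by level, with no fitted weights.

    model_quantiles: list of dicts mapping quantile level -> predicted value.
    All models are assumed to report the same quantile levels.
    """
    levels = sorted(model_quantiles[0])
    return {tau: combine([m[tau] for m in model_quantiles]) for tau in levels}


# Hypothetical quantile forecasts of monthly dengue cases from three models.
forecasts = [
    {0.025: 10.0, 0.5: 40.0, 0.975: 90.0},
    {0.025: 20.0, 0.5: 55.0, 0.975: 140.0},
    {0.025: 5.0, 0.5: 35.0, 0.975: 70.0},
]

ensemble = untrained_ensemble(forecasts)

# Hypothetical observed case count, used to score the ensemble forecast.
observed = 60.0
score = mean(pinball_loss(observed, q, tau) for tau, q in ensemble.items())
```

A trained ensemble would instead estimate model weights from past forecast performance; passing `combine=mean` gives the equal-weight average as another untrained alternative.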
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
C.M. is supported by a studentship from the UK's Engineering and Physical Sciences Research Council. M.U.G.K. acknowledges funding from The Rockefeller Foundation (PC-2022-POP-005), Google.org, the Oxford Martin School Programmes in Pandemic Genomics & Digital Pandemic Preparedness, European Union's Horizon Europe programme projects MOOD (No. 874850) and E4Warning (No. 101086640), the John Fell Fund, a Branco Weiss Fellowship and Wellcome Trust grants 225288/Z/22/Z, 226052/Z/22/Z & 228186/Z/23/Z, United Kingdom Research and Innovation (No. APP8583) and the Medical Research Foundation (MRF-RG-ICCH-2022-100069). The contents of this publication are the sole responsibility of the authors and do not necessarily reflect the views of the European Commission or the other funders. C.A.D. is supported by the UK National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Emerging and Zoonotic Infections in partnership with Public Health England (PHE), Grant Number: HPRU200907. The funders had no role in the study design or analysis.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The dengue incidence surveillance data are publicly available from the National Centre for Epidemiology, Disease Prevention and Control (Peru CDC) in Peru's Ministry of Health. We sourced these from https://www.dge.gob.pe/salasituacional.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability Statement
The dengue incidence surveillance data are publicly available from the National Centre for Epidemiology, Disease Prevention and Control (Peru CDC) in Peru’s Ministry of Health. We sourced these from https://www.dge.gob.pe/salasituacional. Mid-year population estimates are made available by the National Institute of Statistics and Information of Peru at https://www.inei.gob.pe/media/MenuRecursivo/indices_tematicos/proy_04.xls. The WorldClim monthly historical climate data are available at https://www.worldclim.org/. SPI-6 data from the European Drought Observatory are available at https://jeodpp.jrc.ec.europa.eu/. The El Niño indices, ONI and ICEN, are available from the NOAA (https://origin.cpc.ncep.noaa.gov/) and the Geophysical Institute of Peru (http://met.igp.gob.pe), respectively. All code and data used in our analysis are available at https://github.com/cathalmills/peru_dengue_province/.