Spatio-temporal clusters and patterns of spread of dengue, chikungunya, and Zika in Colombia

Background Colombia has one of the highest burdens of arboviruses in South America. The country was in a state of hyperendemicity between 2014 and 2016, with co-circulation of several Aedes-borne viruses, including a syndemic of dengue, chikungunya, and Zika in 2015. Methodology/Principal findings We analyzed the cases of dengue, chikungunya, and Zika notified in Colombia from January 2014 to December 2018 by municipality and week. The trajectory and velocity of spread was studied using trend surface analysis, and spatio-temporal high-risk clusters for each disease in separate and for the three diseases simultaneously (multivariate) were identified using Kulldorff’s scan statistics. During the study period, there were 366,628, 77,345 and 74,793 cases of dengue, chikungunya, and Zika, respectively, in Colombia. The spread patterns for chikungunya and Zika were similar, although Zika’s spread was accelerated. Both chikungunya and Zika mainly spread from the regions on the Atlantic coast and the south-west to the rest of the country. We identified 21, 16, and 13 spatio-temporal clusters of dengue, chikungunya and Zika, respectively, and, from the multivariate analysis, 20 spatio-temporal clusters, among which 7 were simultaneous for the three diseases. For all disease-specific analyses and the multivariate analysis, the most-likely cluster was identified in the south-western region of Colombia, including the Valle del Cauca department. Conclusions/Significance The results further our understanding of emerging Aedes-borne diseases in Colombia by providing useful evidence on their potential site of entry and spread trajectory within the country, and identifying spatio-temporal disease-specific and multivariate high-risk clusters of dengue, chikungunya, and Zika, information that can be used to target interventions.


6
110 diagnosis plus at least one serological positive immunoglobulin M test (IgM) or an epidemiological 111 link to a confirmed case 14 days prior to symptom onset [31,36,37]. Population data was obtained 112 from the National Administrative Department of Statistics of Colombia (DANE, 113 https://www.dane.gov.co/).

115
A trend surface analysis was performed to estimate the front wave velocity for chikungunya 116 and Zika, the two Aedes-diseases that emerged in Colombia during the study period. Dengue was not 117 considered for this analysis as the disease was already endemic in the country. Trend surface analysis 118 has been used to examine diffusion processes in two dimensions: time and space, using polynomial 119 regression. A continuous surface is estimated with the order of the model capturing the general 120 direction and speed of the emerging or front wave of an infectious disease [38][39][40].

121
For this analysis, we separately identified the first notification of a chikungunya and Zika case 122 for each municipality and then used the centroids of the municipalities, calculated in meters using 123 QGIS software [41]. The response variable was time in weeks from the first chikungunya or Zika 124 case notified for each centroid (X and Y coordinates) in a given municipality. The continuous surface 125 of time to notification was estimated by regressing it against the X and Y coordinates in meters.
126 Parameters were estimated using least squares regression, and if a simple 2-D plane through the points 127 is insufficient to model the data, high-order polynomials are often used to capture local scale trends 128 [42]. The best-fit model was selected using Bayesian information criterion (BIC) and the model with 129 polynomial terms of order five provided the best fit for the registration date.

130
The rate of change was obtained by taking the partial derivatives with respect to X and Y, for 131 the best-fit linear model, shown below as order five polynomial (1).

134
(2) ∂f(t|x,y) / ∂x = β 1 + 2β 3 X + 3β 5 X 2 + 4β 7 X 3 + 5β 9 X 4 + β 11 Y Equations (2) and (3) provide expressions for a slope vector at a given location (X,Y). The 137 vectors can be converted to express the magnitude and direction of rate of change (in days per km) 138 by finding the inner product of the vector, where magnitude ||xy|| = √(x 2 + y 2 ) and the direction θ = 139 tan -1 (y/x). The rate we were primarily interested in was velocity (in km per week), which was obtained 140 by inverting the final magnitude of the slope.
141 Scan statistics

142
To identify spatio-temporal clusters we used Kulldorff's scan statistics [43]. We used a 143 discrete Poisson model approach including the total number of cases per municipality offset by the 144 population at-risk (municipal population). Cluster detection was performed to identify statistically 145 significant space-time high-risk clusters of each disease separately and then performed for the three 146 diseases simultaneously, using the multivariate scan analysis.

147
Kulldorff's scan statistics determine the presence of clusters using a cylinder that moves 148 across space and time under predefined spatial and temporal scanning windows. The clusters are 149 identified by observing a higher risk within the cylinder compared to the risk outside of the cluster.
150 The relative risk (RR) is calculated as follows: is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 20, 2022. ; https://doi.org/10.1101/2022.03.17.22272536 doi: medRxiv preprint 156 being P the total population in the study area and p the population within the cluster [43].

157
Clusters were ordered based on the likelihood ratio; clusters with the higher maximum 158 likelihoods were more-likely, i.e. with stronger evidence of fitting the definition of a cluster. The 159 likelihood function for the Poisson model is proportional to: 161 where I() is set to 1 when there are more cases than expected, and 0 otherwise. For the multivariate 162 scan analysis, the likelihood ratio of the cluster is the sum of the likelihood ratio for each disease, 163 calculated using equation 6 [43,44].

164
Each space-time cluster had its own start and end dates, which were used to calculate the 165 duration in weeks of the cluster, the number of accumulated cases observed in this period, the 166 population within the cluster, and its relative risk.

167
The spatial scanning window was based on the centroid of each municipality, with the 168 maximum size set to 150 km of radius and 20% of the total population at risk. Colombia's population 169 is highly concentrated in a few municipalities, with only 5 municipalities accounting for nearly 30% is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint 183 Results

184
During the study period, from January 2014 to December 2018, there were 366,628, 77,345 185 and 74,793 cases of dengue, chikungunya, and Zika, respectively, captured by the national is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 20, 2022. ; https://doi.org/10.1101/2022.03.17.22272536 doi: medRxiv preprint 204 months, following a southern pattern of dispersal from Barranquilla and a radiating pattern of 205 dispersal from Tuluá (Fig 2A). Since the first notification of chikungunya until December 2018, it 206 was identified in 875 different municipalities in Colombia with an average speed of 27 km/week, 207 ranging from 1 to 397 km/week.

222
We identified 21, 16 and 13 spatio-temporal clusters of dengue, chikungunya and Zika, 223 respectively (Table 1). When we consider the median values for each disease, Zika case clusters 224 included more municipalities, were shorter in duration, and had higher relative risks compared to 225 chikungunya and dengue.

226
. CC-BY 4.0 International license It is made available under a perpetuity.
is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The spatial distribution of the detected clusters was similar for all diseases (Fig 3A-C). The 232 most likely clusters for each disease were identified in the same region, the Valle del Cauca 233 department, including its capital Cali, and were also the largest clusters in terms of population (S1  is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint

247
From the multivariate analysis, 20 spatio-temporal clusters were identified. Among those, 7 248 were significant for the three diseases, 5 Fig 4A-B), and was also detected in the Valle del Cauca department. All 254 clusters significant for all three diseases were detected in 2015 ( Fig 4C). In general, these clusters 255 lasted longer than other clusters ( Fig 4D) and presented higher relative risks for Zika compared to 256 dengue and chikungunya (Fig 4E-G). is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint 279 with findings from another study using a different methodology [50]. We also observed that, 280 compared to chikungunya, Zika's clusters were quicker and had higher relative risks, indicating a 281 more explosive epidemic. This is possibly a result of an increased demand of people seeking medical 282 care as a consequence of being more concerned about Zika due to the association of the virus with 283 congenital malformations and of neurological complications. Another possible explanation is of Zika 284 virus being more transmissible than chikungunya, supported by evidence that Ae. Aegypti is more 285 efficient for transmitting the Zika virus than the chikungunya virus, even if co-infected [51,52].

286
For the disease-specific analyses and the multivariate analysis, the most likely cluster was 287 consistently identified in the south-western region of Colombia and also presented elevated duration is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 20, 2022. ; https://doi.org/10.1101/2022.03.17.22272536 doi: medRxiv preprint 292 identified clusters of Aedes-borne diseases in the south-western region of Colombia [29,30]. This 293 region presents a favorable environment and suitable climate for the Aedes mosquitoes and 294 historically has had the highest concentration of dengue cases in the country [34,54]. Factors such as 295 environmental, cultural (e.g. water storage inside the houses) and social and material deprivation, in 296 addition to a high population mobility due to trade with bordering countries and other cities have 297 favored the introduction and dissemination of arboviruses in this region [34, [55][56][57].

298
In is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 20, 2022. ; https://doi.org/10.1101/2022.03.17.22272536 doi: medRxiv preprint The main limitations of this study are related to the quality and timeliness of the surveillance 317 data. We used official case counts that are from a passive surveillance system, meaning our study 318 population included only patients who sought health care. Underreporting is a known limitation when 319 working with surveillance data and is also an important challenge with Colombia's surveillance 320 system [62,63]. During a syndemic of arboviral diseases that share similar symptoms such as dengue, 321 chikungunya, and Zika, misclassification likely occurred as only a small proportion of cases are 322 laboratory confirmed (27.2%, 4.4% and 2.7% of cases of dengue, chikungunya, and Zika, 323 respectively, in Colombia in 2017), although differential diagnosis algorithms are used [63][64][65][66].
324 Earlier introductions of chikungunya may not have been captured by the surveillance system. In the 325 case of Zika, its introduction was expected after the Brazilian epidemic, however, given the mild and 326 generic nature of symptoms and the high proportion of asymptomatic persons, some cases may not 327 have been captured by the surveillance system, especially in non Aedes-endemic areas [20]. Sporadic 328 geographically dispersed cases were recorded in various parts of Colombia, which increased the 329 uncertainty associated with the front wave analysis. These cases, such as those in southeastern 330 Colombia, increased uncertainty in direction and speed estimates, which are also related to edge 331 effects. Edge effects occurred along the boundary of the study area, which in this study were 332 constructed by using fewer data points and are therefore less stable. One limitation of scan statistics 333 is that the method uses circular scans to detect clusters. This could result in low-risk municipalities 334 being considered as part of a cluster if surrounded by high risk municipalities. We counterbalanced 335 this by restricting the clusters' size and verifying the relative risk of the municipalities forming  In this study, we applied two methods to examine the emergence and spatio-temporal clusters 338 of Aedes-borne diseases in Colombia. The detection of simultaneous clusters using the multivariate 339 scan statistics analysis highlights the key hotspots areas for dengue, chikungunya, and Zika, and that 340 should be prioritized for interventions to reduce the burden of these diseases. Additionally, these areas . CC-BY 4.0 International license It is made available under a perpetuity.
is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 20, 2022. ; https://doi.org/10.1101/2022.03.17.22272536 doi: medRxiv preprint 341 may also be at higher risk for other emerging Aedes-borne diseases. It is important that the 342 surveillance in these locations is strengthened to be able to early detect the circulating viruses as well 343 as unusual increase in the number of cases. Both methods are simple and provide helpful insight into 344 the trajectory of arboviral diseases in Colombia region, which can be used to inform targeted 345 interventions, such as enhanced surveillance activities and prevention activities. The methods have 346 the potential to be also applied for other emerging infectious diseases or variants of concern for 347 COVID-19.

350
The authors would like to thank the National Institute of Health of Colombia for making the 351 diseases' surveillance data publicly available. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted March 20, 2022. ; https://doi.org/10.1101/2022.03.17.22272536 doi: medRxiv preprint