Mapping Climate Change’s Impact on Cholera Infection Risk in Bangladesh
=========================================================================

* Sophia E. Kruger
* Paul A. Lorah
* Kenichi W. Okamoto

## Abstract

Several studies have investigated how *Vibrio cholerae* infection risk changes with increased rainfall, temperature, and water pH levels for coastal Bangladesh, which experiences seasonal surges in cholera infections associated with heavy rainfall events. While coastal environmental conditions are understood to influence *V. cholerae* propagation within brackish waters and transmission to and within human populations, it remains unknown how changing climate regimes impact the risk for cholera infection throughout Bangladesh. To address this, we developed a random forest species distribution model to predict the occurrence probability of cholera incidence within Bangladesh for 2015 and 2050. Using R, we trained the random forest model on cholera incidence data and spatial environmental raster data at a resolution of approximately 250 m. The fitted model was then used to predict occurrence probabilities from environmental data for the training data year (2015) and for 2050. We interfaced R with ArcGIS to develop risk maps for cholera infection for the years 2015 and 2050, proxying infection risk with cholera occurrence probability predicted by the model. The best-fitting model predicted cholera occurrence given elevation and distance to water. We find that although high-risk cells cluster predominantly along the coastline in 2015, by 2050 high-risk areas expand from the coast to inland Bangladesh, with all but the northwestern district of Rangpur seeing increased clusters around surface water. Mapping the geographic distribution of cholera infections given projected environmental conditions provides a valuable tool for guiding proactive public health policy tailored to areas most at risk of future disease outbreaks.
## Introduction

Cholera, a waterborne bacterial disease that causes severe diarrhea and dehydration in humans, remains a significant threat to global health. Despite proposed efforts to reduce global cholera mortality by 90% by 2030 (World Health Organization, 2017), researchers estimate that between 1.3 million and 4 million cholera cases occur annually, with an estimated 21,000 to 143,000 deaths (Ali et al., 2015). The etiological agent *Vibrio cholerae* resides in coastal brackish water and riverine habitats and is typically seeded along coastlines (Escobar et al., 2015). Among many proposed hosts, vectors, and reservoirs of infection, zooplankton remain the largest known environmental reservoir of *V. cholerae* (Vezzulli et al., 2010). Consumption of seafood or water contaminated with an infective dose of free-floating *V. cholerae* or *V. cholerae*-harboring zooplankton causes human infections, while infection may also occur through fecal-oral transmission between human hosts. Such transmission pathways are influenced by environmental conditions in waterbodies that favor bacterial growth (Lemaitre et al., 2019). Changes to such waterbodies influence the epidemiology and ecology of *V. cholerae* by altering bacterial reproduction, transmission, and exposure risks. Climatic conditions, such as rainfall and sea surface temperature, thus drive epidemiological risk, with warmer, wetter environments increasing the likelihood of disease transmission and infection (Christaki et al., 2020). Salinity, pH, and sea surface temperature have been shown to encourage bacterial growth and hence *V. cholerae* infection risk and endemicity in the Bay of Bengal (Islam et al., 2020). However, future climate conditions can also promote increased infection risk in inland populations.
Heavy rainfall events (e.g., those associated with the El Niño–Southern Oscillation and summer monsoons) increase cholera infection risk by damaging sanitation systems and contaminating water sources with sewer spillage (Lemaitre et al., 2019; Moore et al., 2017; Koelle et al., 2005). Surface water contaminated with brackish coastal waters may also serve as sources of infection after flooding events (Khan et al., 2017). Cholera infection risk may also increase in periods of drought, during which reliance on scarce water sources increases the likelihood of contamination with *V. cholerae*, especially if human hygiene practices take place in waters also used for drinking (Hashizume et al., 2008).

Curbing widespread infection, mortality, and social disruption requires characterizing the epidemiological risk for cholera, which in turn depends on how regional weather, land-use practices, and climate conditions influence cholera epidemiology and ecology. Risk mapping, a method of associating risk values with explicit geographic areas, has become an effective tool not only for visualizing the spatial distribution of disease burden (i.e., risk) but also for guiding public health policy to reduce that burden (Peterson, 2014; Leta et al., 2018). One approach to estimating risk across a landscape is to use non-mechanistic correlative models that predict infection risk given disease incidence data (e.g., disease presence/absence) and environmental covariates. Predicting risk under future environmental and climate scenarios is essential for disease surveillance and is a powerful tool in guiding proactive public health policy for areas most at risk of future disease outbreaks. Such a strategy is particularly critical in endemic areas, as pandemic strains of *V. cholerae* almost invariably emerge from endemic areas that seed epidemics abroad (Azman et al., 2020; Christaki et al., 2020).
Several studies have sought to predict risk for cholera infection given climate and weather differences via risk mapping (Xu et al., 2013; Baker-Austin et al., 2013; Escobar et al., 2015; Khan et al., 2017; Lessler et al., 2018; Azman et al., 2020). Most risk-mapping studies restrict their analyses to present climatic conditions or limit climate projections to coastal settings only. To our knowledge, no study to date integrates long-term climate projections into risk mapping, especially for inland populations of endemic countries notoriously affected by climate change. Such an analysis is critical to sustaining effective regional public health strategies over the medium and long term. Here we construct risk maps for cholera infection for Bangladesh under current and future climate scenarios. We combine spatial environmental variables associated with human cholera infection with cholera incidence data from a detailed country-wide serosurvey study (Azman et al., 2020), and employ a fitted random forest model to predict the risk of infection across Bangladesh at a fine spatial resolution. Below, we characterize our analyses in greater detail.

## Materials and Methods

### (a) Cholera occurrence data

As disease presence data, we used a serosurvey dataset described in Azman et al. (2020; anonymized data are publicly available at [https://github.com/HopkinsIDD/Bangladesh-Cholera-Serosurvey](https://github.com/HopkinsIDD/Bangladesh-Cholera-Serosurvey)) that identifies cholera prevalence within Bangladesh for 2015. Of the 2930 surveyed individuals, the 639 predicted positive cases constituted our model’s presence data while the 2291 predicted negative cases constituted absence data. The approximate coordinate location of each surveyed individual was also used by our model to extract values from our spatial covariates.
Notably, multiple presence or background points may exist at the same coordinate location as serum samples were often taken from multiple individuals within the same household.

### (b) Spatial environmental data

To develop our model, we considered 13 spatial variables known to correlate with *V. cholerae* occurrence and case incidence and for which data were available for 2015 and 2050 (Table 1). Given our interest in predicting risk for the entirety of Bangladesh, we restricted our variables to those with values available for each cell in the extent used. Moreover, as *V. cholerae* can be found in semiaquatic and seasonally aquatic settings (Colwell, 1996; Islam et al., 2020), we excluded environmental variables describing aquatic environments only. All raster datasets were projected to the World Geodetic System 84 (WGS 84) projection, resampled to a 0.00214° (approximately 250 m × 250 m) resolution, and cropped to the extent of our study area using the ‘raster’ package version 3.4-13 in R (R Core Team, 2021; Hijmans and van Etten, 2012; see also Supplementary Material).

[Table 1.](http://medrxiv.org/content/early/2022/06/14/2022.06.09.22276227/T1) Description of the environmental variables considered in model building. Included, where available, are descriptions of each variable’s data (for 2015 and 2050), data sources, and references to known associations with cholera incidence.

### (c) Statistical analyses

We constructed a predictive model estimating cholera incidence in each 250 m × 250 m raster cell in 2015 as a function of the spatial correlates using the presence-absence algorithm of the ‘randomForest’ package version 4.6-14 in R (Liaw and Wiener, 2002). Briefly, the random forest (RF) algorithm uses bootstrap aggregation and resampling to create an ensemble of weakly correlated decision trees that together classify each datapoint (Breiman, 2001; Muschelli and Varadhan, 2014).
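To make the idea of bootstrap aggregation concrete, the following is a minimal sketch in Python rather than the authors' R code, and it is not the algorithm implemented in the ‘randomForest’ package. It assumes single-split trees ("stumps"), one randomly chosen feature per tree, and a made-up toy dataset in which presence occurs at low elevation close to water; the function names `fit_stump`, `fit_forest`, and `predict_proba` are hypothetical.

```python
import random

def fit_stump(rows, labels, feature):
    """Pick the threshold on one feature that misclassifies the fewest points."""
    best = None
    for t in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= t]
        right = [y for r, y in zip(rows, labels) if r[feature] > t]
        # Majority label on each side (ties go to presence; an empty side predicts absence).
        l_lab = int(2 * sum(left) >= len(left))
        r_lab = int(2 * sum(right) >= len(right)) if right else 0
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    return (feature,) + best[1:]

def fit_forest(rows, labels, n_trees=200, seed=0):
    """Fit each stump to a bootstrap resample, using one randomly chosen feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feature = rng.randrange(n_features)         # random feature keeps trees weakly correlated
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest

def predict_proba(forest, row):
    """Occurrence probability = share of trees voting 'presence'."""
    votes = sum(l_lab if row[f] <= t else r_lab for f, t, l_lab, r_lab in forest)
    return votes / len(forest)

# Toy data (assumed, not from the study): (elevation in m, distance to water in m).
rows = [(e, d) for e in (1, 5, 20, 60) for d in (100, 400, 2000, 6000)]
labels = [1 if e <= 5 and d <= 400 else 0 for e, d in rows]

forest = fit_forest(rows, labels)
print(predict_proba(forest, (1, 100)), predict_proba(forest, (60, 6000)))
```

The averaged vote across trees plays the role of the occurrence probability used as a risk proxy in this study; a full random forest grows deeper trees and samples candidate features at every split.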
To select the best-fitting model, we performed a stepwise model selection procedure using the variable importance measures from the RF model calibrated and evaluated with all covariates included. From this model, we selected the highest contributing variable to first create a univariate RF model using 80% of each sample group (i.e., presence and absence) as training data for model calibration and the remaining 20% for model evaluation. We ran the univariate models for 1000 iterations, computing the area under the curve (AUC) statistic from the receiver operating characteristic (ROC) curve generated for each run to create a 95% confidence interval of the AUC. From here, covariates were added individually to this model if the AUC confidence interval generated for the new model over 1000 iterations indicated improved predictive ability over the univariate model. For each iteration of the RF model, we used the algorithm’s default settings in R to perform a supervised classification.

Once the relevant variables were identified, we ran the best-fitting RF model for 2015 a total of 1000 times, training and evaluating the model of each iteration with the same 80%/20% split of presence-absence data, respectively. With each iteration, the model fitted to the 2015 data predicted cholera occurrence probabilities for 2015 and 2050 for each 250 m × 250 m cell. From these predictions, we constructed a mean, 2.5%-, and 97.5%-quantile rasterized map for each year by determining the mean, 2.5%-quantile, and 97.5%-quantile values for each cell within Bangladesh.

Using the ‘arcgisbinding’ package in R, we interfaced ArcGIS Pro version 2.6.3 with R to transfer the raster maps generated in R to ArcGIS to ensure our rasters were of the appropriate resolution and extent (ESRI, 2019; ESRI, 2020). All code used in the analysis is publicly available on GitHub (github.com/sophiakruger/cholera_risk) and released under the GNU General Public License v3 (Stallman, 2007).
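The repeated 80%/20% evaluation and the per-cell summaries described above can be sketched as follows. This is an illustrative Python sketch, not the authors' R code: it assumes a simple nearest-neighbour scorer on a single covariate as a stand-in for the random forest, a synthetic elevation dataset, and hypothetical helper names (`auc`, `quantile`, `stratified_split`, `auc_interval`, `cell_summaries`).

```python
import random

def auc(scores, labels):
    """Rank-based AUC: chance that a random presence outscores a random absence (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def quantile(values, q):
    """Quantile by linear interpolation between order statistics."""
    xs = sorted(values)
    h = (len(xs) - 1) * q
    lo = int(h)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])

def stratified_split(labels, train_frac, rng):
    """Split indices so presences and absences are each divided train_frac / (1 - train_frac)."""
    train, test = [], []
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = int(round(train_frac * len(idx)))
        train += idx[:cut]
        test += idx[cut:]
    return train, test

def knn_score(train_x, train_y, x, k=5):
    """Stand-in model (assumption): share of presences among the k nearest training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

def auc_interval(x, y, n_iter=200, seed=1):
    """Repeat the 80/20 split n_iter times; return (2.5% quantile, mean, 97.5% quantile) of the AUC."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_iter):
        train, test = stratified_split(y, 0.8, rng)
        tx, ty = [x[i] for i in train], [y[i] for i in train]
        scores = [knn_score(tx, ty, x[i]) for i in test]
        aucs.append(auc(scores, [y[i] for i in test]))
    return quantile(aucs, 0.025), sum(aucs) / len(aucs), quantile(aucs, 0.975)

def cell_summaries(stack):
    """stack[i][c] = predicted probability for cell c in iteration i; summarise each cell."""
    cells = list(zip(*stack))
    means = [sum(c) / len(c) for c in cells]
    return means, [quantile(c, 0.025) for c in cells], [quantile(c, 0.975) for c in cells]

# Synthetic data (assumed): presence is more likely at low elevation, with some label noise.
elevation = list(range(1, 101))
presence = [(1 if e <= 30 else 0) ^ (e % 10 == 0) for e in elevation]

print(auc_interval(elevation, presence))
```

Comparing the interval returned for a candidate model against that of the current model mirrors the stepwise criterion described above, while `cell_summaries` corresponds to the mean and quantile maps built from the 1000 prediction rasters.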
## Results

### (a) Drivers of cholera infection risk

Our random forest classification model including all predictors (“the full model”) showed elevation as the most prominent predictor (*see* supplementary materials, Table S1). Thus, we began our stepwise model selection by sequentially adding other predictors to a univariate model with elevation as the sole predictor. The random forest classification model including elevation and distance to water as predictors increased the model’s predictive power compared to the full model and outperformed all other predictors that were added to the univariate model (Table 2). Model performance invariably declined when additional variables were added one at a time to the bivariate model (results not shown, *see* supplementary materials, Table S2). Generally, we find cholera infection risk increased with lower elevation and a shorter distance to the nearest surface water body (supplementary materials, figures S1 and S2).

[Table 2.](http://medrxiv.org/content/early/2022/06/14/2022.06.09.22276227/T2) Summary of the top performing models obtained from our stepwise model comparison process. The area-under-the-curve (AUC) statistic identifies the bivariate model as the best-fitting model. The AUC confidence interval for the full model, univariate elevation model, and univariate distance-to-water model reflects the change of interval extremes (2.5% and 97.5% quantiles) from a null model (where AUC=0.50). The AUC confidence interval for the bivariate model reflects the increase in AUC from the mean AUC of the univariate elevation model.

![Figure 1.](https://www.medrxiv.org/content/medrxiv/early/2022/06/14/2022.06.09.22276227/F1.medium.gif)

Figure 1.
Change in occurrence probability (a proxy for infection risk) from 2015 to 2050 according to the bivariate random forest model with elevation and distance to water as predictors. The (a) 2.5% quantile, (b) mean, and (c) 97.5% quantile predicted values are shown with the districts of Rangpur (A), Rajshahi (B), Dhaka (C) with the capital of Dhaka starred, Sylhet (D), Khulna (E), Barisal (F), and Chittagong (G). Supplementary figures S3 and S4 contain the underlying occurrence probabilities for each year.

![Figure 2.](https://www.medrxiv.org/content/medrxiv/early/2022/06/14/2022.06.09.22276227/F2.medium.gif)

Figure 2. Cells with an average occurrence probability of 0.50 or greater for (a) 2015 and (b) 2050 according to the bivariate random forest model with elevation and distance to water as predictors. The districts of Rangpur (A), Rajshahi (B), Dhaka (C) with the capital of Dhaka starred, Sylhet (D), Khulna (E), Barisal (F), and Chittagong (G) are highlighted. Supplementary figures S3 and S4 contain the underlying occurrence probabilities for the entirety of Bangladesh for each year.

### (b) Spatial predictions of cholera infection risk

We find that the distribution of cholera infection risk changes over time, with coastal and inland Bangladesh projected to experience an increase in cholera infection occurrence probability from 2015 to 2050 (Fig. 1(a)-1(c)). Even under the most conservative estimate for 2050, we find risk increases along tributaries, running inland from the coast (Fig. 1(a)). In 2015, cells with an average occurrence probability of 0.50 or greater cluster tightly along the coast of the Khulna and Barisal districts and are more widely distributed inland, though many follow the Padma River north into the district of Dhaka (Fig. 2(a)).
Yet by 2050, clusters of cells with an occurrence probability of 0.50 or greater are predicted to increase inland in the districts of Khulna, Barisal, Chittagong, Rajshahi, Dhaka, and Sylhet (Fig. 2(b)). Notably, while cells with an occurrence probability of 0.50 or greater cluster around major river systems along district boundaries in 2015, by 2050 these risk clusters expand inland latitudinally (Fig. 2(b)).

## Discussion

In this study, we predicted how changing climatic and land-use patterns can alter the risk for cholera infection at very fine spatial scales for the entirety of Bangladesh between the years 2015 and 2050. Using a species distribution modelling approach, we found areas with low elevation and shorter distances to surface water to be at highest risk. Areas at low elevations have greater potential for inundation from future rainfall events, which may compromise sanitation systems and increase risk for the spread of waterborne pathogens. Moreover, projected increases in coastal vulnerability to *V. cholerae* (Escobar et al., 2015) and more frequent heavy rainfall events will likely increase the presence of *V. cholerae* in surface waters at these elevations (Kirby et al., 2016). Low elevation areas are also likely at greater risk for infection than those of higher elevation given human settlement patterns on low-lying arable land, along rivers and other surface water. To the extent that high population density correlates with increased risk for infection, whether through increased contact with positive cases, sanitation system strain, or under-development and poverty, these areas therefore exhibit greater potential for human-to-human cholera spread (Borroto and Martinez-Piedra, 2000; Siddique et al., 1992; Root, 1997; Penrose et al., 2010).
We find that although high-risk cells (designated as having cholera case occurrence probabilities of 0.50 or higher) cluster predominantly along the coastline in 2015, by 2050 high-risk areas expand from the coast to inland Bangladesh, with all but the northwestern district of Rangpur seeing increased clusters around surface water. The overall increased risk for infection in inland Bangladesh indicates that coastal vulnerability to infection translates to increased inland infection risk. This is worrying given the predicted doubling of ENSO events in the future, which will only increase coastal cholera incidence (Cai et al., 2014; Escobar et al., 2015). While other studies have sought to develop risk maps for cholera infection and ecological presence under current climatic conditions for the entirety of Bangladesh and for future climatic conditions in strictly coastal and marine areas, our study expanded the spatial scope of predictions under a future climate scenario to include inland Bangladesh, where approximately 70% of the population lives (Ahmad, 2019). Given ongoing efforts to reduce global cholera mortality by 90% by 2030, our study offers valuable insight into projected high-risk areas in need of continued, if not additional, public health intervention measures to reduce the burden of disease in the coming decades.

Even in the presence of infrastructural and public health advances, predictive risk mapping studies for cholera infection risk will continue to be essential in reducing the disease burden. This is because such predictions characterize a baseline set of expectations about the distribution of infection risk if future conditions resemble current circumstances. Moreover, novel cholera strains are expected to continue to arise in Bengali waters, due in part to cholera biology in the environmental reservoir. For instance, while bacteriophage niche adaptation has allowed bacteriophages to prey on *V. 
cholerae* infecting zooplankton in fresh and estuary water, coevolution enables *V. cholerae* to resist bacteriophage predation (Silva-Valenzuela and Camilli, 2019; Angermeyer et al., 2018). Additionally, phages can facilitate the evolution of specific toxigenic *V. cholerae* biotypes through horizontal transfer of genes associated with virulence or enhanced environmental fitness (Faruque and Mekalanos, 2012). This suggests that aquatic interactions between bacteriophages and strains of *V. cholerae* can select not only for more environmentally persistent strains, but also for more virulent strains with the capacity to seed epidemics.

Climate change is likely to affect not only the distribution of waterborne diseases inland, but also socioeconomic conditions and infrastructural integrity. Thus, further modelling studies should seek to include covariates of the latter in combination with climatic variables to predict infection risk. Such models should also consider the potential for climate-associated human migration inland from vulnerable coastal regions to influence inland risk. In developing our model, we initially found the distance from each grid cell to the coast of Bangladesh to be an important variable in predicting cholera infection occurrence, with closer distances experiencing higher cholera occurrence probabilities. However, the lack of coastline projections for 2050 prevented us from including that variable in our model. Therefore, there exists a need for accurate coastline data under future climate scenarios to support robust predictive studies into disease occurrence. Remote sensing data could fill this need. Alongside accurate sociological variable data, which are difficult to project decades into the future, such data could in turn be useful in training models that seek to consider the interplay between human hosts and their environment on the risk for cholera infection.
As with our study, to generate valid risk predictions future models must also rely on robust case incidence data that reflect actual disease prevalence. Risk predictions from correlative models may also improve with added model complexity, but potentially at the expense of explanatory power. In future cholera infection risk forecasting studies, researchers should consider the use of hierarchical spatial models or neural networks in spatial distribution modelling that have been shown to generate robust predictions in emerging infectious disease studies (Métras et al., 2015; Redding et al., 2017; Asadgol et al., 2019; Deneu et al., 2021). There is also a need for mechanistic models of transmission. Species distribution models (SDMs), like that of this study, represent a key first step in developing such models, but may not include the effect of climate-sensitive ecological processes on model predictions (Cuddington et al., 2013). Therefore, in the context of global change, modelling the spatial distribution of risk for cholera infection is best done using process-based models that will use our model’s infection probabilities, consider the correlative components of our model, and incorporate the ecological mechanisms influencing the distribution of cholera and human transmission. Nevertheless, our study holds importance in providing robust inland climate-associated cholera infection risk predictions that can inform preventive Bengali public health strategies.

## Data Availability

All code used in the analysis is publicly available on GitHub (github.com/sophiakruger/cholera_risk) and released under the GNU General Public License v3 (Stallman, 2007). This paper uses publicly available data sets (obtained from [https://github.com/HopkinsIDD/Bangladesh-Cholera-Serosurvey/tree/master/data](https://github.com/HopkinsIDD/Bangladesh-Cholera-Serosurvey/tree/master/data)). These data are de-identified and intended for public use.
## Supplementary Materials

### S1. Spatial Data Manipulation in R and ArcGIS

To interface the R programming language with ArcGIS Pro, a Geographic Information Systems (GIS) software, we used the ‘arcgisbinding’ package in R to facilitate loading ArcGIS raster layers into R and exporting raster layers from R to ArcGIS (ESRI, 2019). All raster data available from source as TIFF files were uploaded into ArcGIS Pro, projected to WGS84, resampled to a resolution of 250 m × 250 m grid cells, and cropped to the rectangular extent surrounding the country of Bangladesh. We used the administrative boundary level 0 provided by the GADM spatial database (v. 3.6) as the extent for our study area (western edge 88.01057°E, eastern edge 92.67366°E, southern edge 20.74111°N, northern edge 26.63407°N). Image service layers, provided by ESRI’s Living Atlas Portal, were imported into ArcGIS through the portal, then projected, resampled, and cropped with the same procedure as described for all TIFF raster files. Of the covariates used for 2015, the average precipitation and temperature raster layers were products of additional data manipulation that occurred in R and ArcGIS Pro. For 2050, the precipitation, temperature, and elevation rasters required additional manipulation. Below we detail additional steps taken to include these data as spatial covariates in our model.

#### S1.1 Average Precipitation and Temperature Rasters (2015 and 2050)

The average precipitation and maximum and minimum temperature rasters were created from average monthly climate data (as NetCDF files) provided by TerraClimate (Abatzoglou et al., 2018). The NetCDF files were converted in R to GeoTIFF files (see “file-coversion.R” in repository), then exported to ArcGIS for projecting, resampling, and cropping using the ModelBuilder functionality.
Using the raster calculator tool in ModelBuilder, the raster bands for October 2015 through January 2016 were averaged to estimate the average monthly precipitation accumulation (mm), maximum temperature (°C), and minimum temperature (°C) experienced in Bangladesh during the survey period of the Azman data. The months were selected with the assumption that the climatic conditions during the survey period would have influenced measured incidence. To consider the possibility that cholera incidence is a lagging indicator predicted by rainfall and temperatures of the monsoon season, we also created separate monsoon temperature and precipitation covariates for 2015 using the same methodology to average the monthly values for June through September. This process of averaging the precipitation and temperature data in ArcGIS was also replicated in R (see “2015-raster-manipulation.R” in repository) to ensure that estimated averages were consistent across platforms. For 2050, the average precipitation and maximum and minimum temperature rasters from WorldClim (v2.1) were created following the same procedure as described for the TerraClimate files, though these layers were downloaded from source as TIFF files and thus did not need conversion (see “2050-raster-manipulation.R” for the averaging process in R).

#### S1.2 Elevation Raster (2050)

The 90-meter coastal elevation layer from Kulp and Strauss (2018) at Climate Central was projected, resampled, and cropped to Bangladesh in ArcGIS. The raster layer was then exported to R wherein missing values were imputed from the 2015 elevation layer using a generalized linear model in R (see “2050-raster-manipulation.R” in repository). Once all layers were manipulated using the ModelBuilder functionality in ArcGIS, all raster layers were exported for use in R.
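The imputation step in S1.2 can be illustrated with a small Python sketch. The source does not state the family or link of the generalized linear model, so this example assumes the simplest case, a Gaussian model with identity link (ordinary least squares) with 2015 elevation as the only predictor; the function names and the four-cell example values are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope (a Gaussian GLM with identity link, assumed here)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def impute_from_baseline(elev_2015, elev_2050):
    """Fill missing 2050 cells (None) with predictions from the 2015 elevation of the same cell."""
    observed = [(a, b) for a, b in zip(elev_2015, elev_2050) if b is not None]
    intercept, slope = fit_line([a for a, _ in observed], [b for _, b in observed])
    return [b if b is not None else intercept + slope * a
            for a, b in zip(elev_2015, elev_2050)]

# Hypothetical flattened rasters: one value per cell, None marks a missing 2050 cell.
elev_2015 = [1.0, 2.0, 3.0, 4.0]
elev_2050 = [0.5, 1.5, None, 3.5]
print(impute_from_baseline(elev_2015, elev_2050))
```

Observed 2050 cells are left untouched; only the missing cells receive fitted values, so the imputed layer stays consistent with the projected data wherever projections exist.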
To avoid memory overload issues in using R solely to project, resample, and crop global raster layers, our methodology relies on ArcGIS’s ModelBuilder to do the same and export all raster layers to R for analysis. Nonetheless, the code written in R follows the procedure streamlined in ArcGIS (e.g., projecting, resampling, and cropping) to encourage reproducibility and to also ensure that all raster layers, once imported into R, align to the desired raster cell size, global projection, and study area extent (see “2015-raster-manipulation.R” and “2050-raster-manipulation.R” in repository).

[Table S1.](http://medrxiv.org/content/early/2022/06/14/2022.06.09.22276227/T3) Confidence interval (95%) of the variable importance (mean decrease in Gini) for each covariate included in our full model. A higher mean decrease in Gini coefficient reflects a variable’s greater importance to the random forest model. The label (M) marks temperature and precipitation variables derived from data taken during the monsoon period.

[Table S2.](http://medrxiv.org/content/early/2022/06/14/2022.06.09.22276227/T4) Change in AUC confidence interval for each model obtained in our stepwise model selection process.

![Figure S1.](https://www.medrxiv.org/content/medrxiv/early/2022/06/14/2022.06.09.22276227/F3.medium.gif)

Figure S1. Response curve of the 2015 occurrence probabilities generated by our bivariate random forest model against the elevation values for 2015.

![Figure S2.](https://www.medrxiv.org/content/medrxiv/early/2022/06/14/2022.06.09.22276227/F4.medium.gif)

Figure S2. Response curve of the 2015 occurrence probabilities generated by our bivariate random forest model against the distance to water values for 2015.
![Figure S3.](https://www.medrxiv.org/content/medrxiv/early/2022/06/14/2022.06.09.22276227/F5.medium.gif)

Figure S3. Mean and quantile (2.5% and 97.5%) risk maps predicting cholera infection risk for 2015 from predictions constructed by our best-fitting random forest model. Risk values range from 0 to 1 with 1 representing the highest risk for cholera infection in the specified geographic area.

![Figure S4.](https://www.medrxiv.org/content/medrxiv/early/2022/06/14/2022.06.09.22276227/F6.medium.gif)

Figure S4. Mean and quantile (2.5% and 97.5%) risk maps predicting cholera infection risk for 2050 from predictions constructed by our best-fitting random forest model. Risk values range from 0 to 1 with 1 representing the highest risk for cholera infection in the specified geographic area.

## Acknowledgements

This research was made possible in part by a Sustainability Scholars grant awarded by the Undergraduate Research Opportunities Program at the University of St. Thomas. The authors acknowledge the Minnesota Supercomputing Institute (MSI) at the University of Minnesota for providing resources that contributed to the research results reported within this paper. URL: [http://www.msi.umn.edu](http://www.msi.umn.edu). The authors also thank Charlie Frye for his helpful recommendations regarding covariate data.

## Footnotes

* sophia.kruger{at}stthomas.edu, palorah{at}stthomas.edu
* Received June 9, 2022.
* Revision received June 9, 2022.
* Accepted June 14, 2022.
* © 2022, Posted by Cold Spring Harbor Laboratory. This pre-print is available under a Creative Commons License (Attribution 4.0 International), CC BY 4.0, as described at [http://creativecommons.org/licenses/by/4.0/](http://creativecommons.org/licenses/by/4.0/)

## References

1. Abatzoglou, John T., Solomon Z.
Dobrowski, Sean A. Parks, and Katherine C. Hegewisch. “TerraClimate, a high-resolution global dataset of monthly climate and climatic water balance from 1958–2015.” Scientific Data 5, no. 1 (2018): 1–12.
2. Ahmad, H. (2019). Bangladesh coastal zone management status and future trends. Journal of Coastal Zone Management, 22(1), 1–7.
3. Ali, M., Emch, M., Donnay, J. P., Yunus, M., & Sack, R. B. (2002). Identifying environmental risk factors for endemic cholera: a raster GIS approach. Health & Place, 8(3), 201–210.
4. Ali, M., Nelson, A. R., Lopez, A. L., & Sack, D. A. (2015). Updated global burden of cholera in endemic countries. PLoS Neglected Tropical Diseases, 9(6), e0003832.
5. Angermeyer, A., Das, M. M., Singh, D. V., & Seed, K. D. (2018). Analysis of 19 highly conserved Vibrio cholerae bacteriophages isolated from environmental and patient sources over a twelve-year period. Viruses, 10(6), 299.
6. Asadgol, Z., Mohammadi, H., Kermani, M., Badirzadeh, A., & Gholami, M. (2019). The effect of climate change on cholera disease: The road ahead using artificial neural network. PLoS One, 14(11), e0224813.
7. Azman, A. S., Lauer, S. A., Bhuiyan, T. R., Luquero, F. J., Leung, D. T., Hegde, S. T., … & Gurley, E. S. (2020). Vibrio cholerae O1 transmission in Bangladesh: insights from a nationally representative serosurvey. The Lancet Microbe, 1(8), e336–e343.
8. Baker-Austin, C., Trinanes, J. A., Taylor, N. G., Hartnell, R., Siitonen, A., & Martinez-Urtaza, J. (2013). Emerging Vibrio risk at high latitudes in response to ocean warming. Nature Climate Change, 3(1), 73–77.
9. Borroto, R.
J., & Martinez-Piedra, R. (2000). Geographical patterns of cholera in Mexico, 1991–1996. International Journal of Epidemiology, 29(4), 764–772.
10. Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
11. Cai, W., Borlace, S., Lengaigne, M., Van Rensch, P., Collins, M., Vecchi, G., … & Jin, F. F. (2014). Increasing frequency of extreme El Niño events due to greenhouse warming. Nature Climate Change, 4(2), 111–116.
12. Christaki, E., Dimitriou, P., Pantavou, K., & Nikolopoulos, G. K. (2020). The Impact of Climate Change on Cholera: A Review on the Global Status and Future Challenges. Atmosphere, 11(5), 449.
13. Colwell, R. R. (1996). Global climate and infectious disease: the cholera paradigm. Science, 274(5295), 2025–2031.
14. Cuddington, K., Fortin, M. J., Gerber, L. R., Hastings, A., Liebhold, A., O’Connor, M., & Ray, C. (2013). Process-based models are required to manage ecological systems in a changing world. Ecosphere, 4(2), 1–12.
15. Danielson, Jeffrey J., and Dean B. Gesch. Global multi-resolution terrain elevation data 2010 (GMTED2010). US Department of the Interior, US Geological Survey, 2011.
16. Deneu, B., Servajean, M., Bonnet, P., Botella, C., Munoz, F., & Joly, A. (2021). Convolutional neural networks improve species distribution modelling by capturing the spatial structure of the environment. PLoS Computational Biology, 17(4), e1008856.
17. Environmental Systems Research Institute (ESRI) (2020). ArcGIS Pro (Version 2.6.3). ESRI Inc. [https://www.esri.com/en-us/arcgis/products/arcgis-pro/overview](https://www.esri.com/en-us/arcgis/products/arcgis-pro/overview).
18. ESA. Land Cover CCI Product User Guide Version 2. Tech. Rep. (2017). Available at: [maps.elie.ucl.ac.be/CCI/viewer/download/ESACCI-LC-Ph2-PUGv2_2.0.pdf](http://maps.elie.ucl.ac.be/CCI/viewer/download/ESACCI-LC-Ph2-PUGv2_2.0.pdf).
19. Escobar, L. E., & Craft, M. E. (2016). Advances and limitations of disease biogeography using ecological niche modeling. Frontiers in Microbiology, 7, 1174.
20. Escobar, L. E., Ryan, S. J., Stewart-Ibarra, A. M., Finkelstein, J. L., King, C. A., Qiao, H., & Polhemus, M. E. (2015). A global map of suitability for coastal Vibrio cholerae under current and future climate conditions. Acta Tropica, 149, 202–211.
21. ESRI (2019). arcgisbinding: Bindings for ArcGIS. R package version 1.0.1.237. [http://esri.com/](http://esri.com/).
22. Faruque, S. M., & Mekalanos, J. J. (2012). Phage-bacterial interactions in the evolution of toxigenic Vibrio cholerae. Virulence, 3(7), 556–565.
23. Hashizume, M., Armstrong, B., Hajat, S., Wagatsuma, Y., Faruque, A.
S., Hayashi, T., & Sack, D. A. (2008). The effect of rainfall on the incidence of cholera in Bangladesh. Epidemiology, 19(1), 103–110.
24. Hijmans RJ, van Etten J. 2012 Raster: geographic analysis and modeling with raster data. R package version 20–12.
25. Islam, M. S., Zaman, M. H., Islam, M. S., Ahmed, N., & Clemens, J. D. (2020). Environmental reservoirs of Vibrio cholerae. Vaccine, 38, A52–A62.
26. Khan, R., Anwar, R., Akanda, S., McDonald, M. D., Huq, A., Jutla, A., & Colwell, R. (2017). Assessment of risk of cholera in Haiti following Hurricane Matthew. The American Journal of Tropical Medicine and Hygiene, 97(3), 896–903.
27. Kirby, J. M., Mainuddin, M., Mpelasoka, F., Ahmad, M. D., Palash, W., Quadir, M. E., … & Hossain, M. M. (2016). The impact of climate change on regional water balances in Bangladesh. Climatic Change, 135(3), 481–491.
28. Koelle, K., Rodó, X., Pascual, M., Yunus, M., & Mostafa, G. (2005). Refractory periods and climate forcing in cholera dynamics. Nature, 436(7051), 696–700.
29. Kulp, Scott A., and Benjamin H. Strauss. “CoastalDEM: a global coastal digital elevation model improved from SRTM using a neural network.” Remote Sensing of Environment 206 (2018): 231–239.
30. Lemaitre, J., Pasetto, D., Perez-Saez, J., Sciarra, C., Wamala, J. F., & Rinaldo, A. (2019). Rainfall as a driver of epidemic cholera: comparative model assessments of the effect of intra-seasonal precipitation events. Acta Tropica, 190, 235–243.
31. Lessler, J., Moore, S. M., Luquero, F. J., McKay, H. S., Grais, R., Henkens, M., … & Azman, A. S. (2018). Mapping the burden of cholera in sub-Saharan Africa and implications for control: an analysis of data across geographical scales. The Lancet, 391(10133), 1908–1915.
32. Leta, S., Beyene, T. J., De Clercq, E. M., Amenu, K., Kraemer, M. U., & Revie, C. W. (2018). Global risk mapping for major diseases transmitted by Aedes aegypti and Aedes albopictus. International Journal of Infectious Diseases, 67, 25–35.
33. Liaw, A. and Wiener, M. (2002). Classification and Regression by randomForest. R News 2(3), 18–22.
34. Luque Fernandez, M. A., Schomaker, M., Mason, P. R., Fesselet, J. F., Baudot, Y., Boulle, A., & Maes, P. (2012). Elevation and cholera: an epidemiological spatial analysis of the cholera epidemic in Harare, Zimbabwe, 2008–2009. BMC Public Health, 12(1), 1–8.
35. McDowell, R. W., Noble, A., Pletnyakov, P., & Mosley, L. M. (2020).
Global database of diffuse riverine nitrogen and phosphorus loads and yields. Geoscience Data Journal.
36. Métras, R., Jewell, C., Porphyre, T., Thompson, P. N., Pfeiffer, D. U., Collins, L. M., & White, R. G. (2015). Risk factors associated with Rift Valley fever epidemics in South Africa in 2008–11. Scientific Reports, 5(1), 1–7.
37. Moore, S. M., Azman, A. S., Zaitchik, B. F., Mintz, E. D., Brunkard, J., Legros, D., … & Lessler, J. (2017). El Niño and the shifting geography of cholera in Africa. Proceedings of the National Academy of Sciences, 114(17), 4436–4441.
38. Muschelli, J., Betz, J., & Varadhan, R. (2014). Binomial regression in R. In Handbook of Statistics (Vol. 32, pp. 257–308). Elsevier.
39. Osei, F. B., & Duker, A. A. (2008). Spatial and demographic patterns of cholera in Ashanti region-Ghana. International Journal of Health Geographics, 7(1), 1–10.
40. Osei, F. B., Duker, A. A., Augustijn, E. W., & Stein, A. (2010). Spatial dependency of cholera prevalence on potential cholera reservoirs in an urban area, Kumasi, Ghana. International Journal of Applied Earth Observation and Geoinformation, 12(5), 331–339.
41. Penrose, K., Castro, M. C. D., Werema, J., & Ryan, E. T. (2010). Informal urban settlements and cholera risk in Dar es Salaam, Tanzania. PLoS Neglected Tropical Diseases, 4(3), e631.
42. Peterson, A. T. (2014). Mapping disease transmission risk: enriching models using biogeography and ecology. JHU Press.
43. Phillips, S. J., M. Dudík, and R. E. Schapire. “Maxent software for modeling species niches and distributions (Version 3.4.1).
2018.” Available in: [http://biodiversityinformatics.amnh.org/open_source/maxent/](http://biodiversityinformatics.amnh.org/open_source/maxent/) Accessed on (2021): 5–21.
44. R Core Team (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL [https://www.R-project.org/](https://www.R-project.org/).
45. Redding, D. W., Tiedt, S., Lo Iacono, G., Bett, B., & Jones, K. E. (2017). Spatial, seasonal and climatic predictive models of Rift Valley fever disease across Africa. Philosophical Transactions of the Royal Society B: Biological Sciences, 372(1725), 20160165.
46. Root, G. (1997). Population density and spatial differentials in child mortality in Zimbabwe. Social Science & Medicine, 44(3), 413–421.
47. Siddique, A. K., Zaman, K., Baqui, A. H., Akram, K., Mutsuddy, P., Eusof, A., … & Sack, R. B. (1992). Cholera epidemics in Bangladesh: 1985–1991. Journal of Diarrhoeal Diseases Research, 79–86.
48. Silva-Valenzuela, C. A., & Camilli, A. (2019). Niche adaptation limits bacteriophage predation of Vibrio cholerae in a nutrient-poor aquatic environment. Proceedings of the National Academy of Sciences, 116(5), 1627–1632.
49. Stallman, Richard. (2007). GNU General Public License v3. URL [http://www.gnu.org/licenses/gpl.html](http://www.gnu.org/licenses/gpl.html).
50. Stoltzfus, J. D., Carter, J. Y., Akpinar-Elci, M., Matu, M., Kimotho, V., Giganti, M. J., … & Elci, O. C. (2014). Interaction between climatic, environmental, and demographic factors on cholera outbreaks in Kenya. Infectious Diseases of Poverty, 3(1), 1–9.
51. Vezzulli, L., Pruzzo, C., Huq, A., & Colwell, R. R. (2010). Environmental reservoirs of Vibrio cholerae and their role in cholera. Environmental Microbiology Reports, 2(1), 27–33.
52. World Health Organization. (2017). Ending cholera a global roadmap to 2030. In Ending cholera a global roadmap to 2030 (pp. 32–32).
53. Wu, J., Yunus, M., Ali, M., Escamilla, V., & Emch, M. (2018). Influences of heatwave, rainfall, and tree cover on cholera in Bangladesh. Environment International, 120, 304–311.
54. Xu, M., Cao, C., Wang, D., Kan, B., Jia, H., Xu, Y., & Li, X. (2013). District prediction of cholera risk in China based on environmental factors. Chinese Science Bulletin, 58(23), 2798–2804.

## References

1. Abatzoglou, John T., Solomon Z. Dobrowski, Sean A. Parks, and Katherine C. Hegewisch. “TerraClimate, a high-resolution global dataset of monthly climate and climatic water balance from 1958–2015.” Scientific Data 5, no. 1 (2018): 1–12.
2. CMIP6 Downscaled Monthly Climate Projections: 2041–2060. WorldClim v2.1.
3. ESRI (2019). arcgisbinding: Bindings for ArcGIS. R package version 1.0.1.237. [http://esri.com/](http://esri.com/).
4. GADM. (2021). Database of Global Administrative Areas. [https://gadm.org/data.html](https://gadm.org/data.html).
5. Kulp, Scott A., and Benjamin H. Strauss. “CoastalDEM: a global coastal digital elevation model improved from SRTM using a neural network.” Remote Sensing of Environment 206 (2018): 231–239.