The closer to the European Union headquarters, the higher the risk of COVID-19? Cautions regarding ecological studies of COVID-19


Abstract
Several ecological studies of coronavirus disease 2019 (COVID-19) have reported correlations between group-level aggregated exposures and COVID-19 outcomes. While some of these studies may help generate new hypotheses about COVID-19, results from this type of study should be interpreted with caution. To illustrate how ecological studies and their results can be biased, we conducted an ecological study of COVID-19 outcomes and the distance to Brussels using European country-level data. We found that the distance was negatively correlated with COVID-19 outcomes; every 100 km away from Brussels was associated with an approximately 6% to 17% reduction (all P<0.01) in COVID-19 cases and deaths in Europe.
Without caution, such results could be interpreted as showing that the closer a country is to the European Union headquarters, the higher its risk of COVID-19. However, these results are more likely to reflect differences between European countries in the timing of, and the response to, the outbreak, among other factors, rather than the 'effect' of the distance to Brussels itself. Associations observed at the group level do not necessarily reflect individual-level associations, a problem known as the ecological fallacy. Given the public concern over COVID-19, ecological studies should be conducted and interpreted with great caution so that their results are not misunderstood.
This preprint (which was not certified by peer review) was posted April 29, 2020. The copyright holder is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license. medRxiv preprint doi: https://doi.org/10.1101/2020.04.23.20077008

Introduction
As of 22 April 2020, the coronavirus disease 2019 (COVID-19) pandemic had reportedly caused more than 2.4 million cases and 169,006 deaths worldwide [1]. Many studies have been conducted to understand this 'unknown' new disease. Some have found that regionally aggregated exposures (e.g., Bacillus Calmette-Guerin vaccination, air quality index) were correlated with regionally aggregated COVID-19 outcomes (i.e., numbers of cases and deaths) [2][3][4]. Studies of this type, which use group-level rather than individual-level data, are typically called ecological studies [5]. While some ecological studies may help generate new hypotheses about COVID-19, their results should be interpreted with caution, because ecological studies are by nature sometimes more vulnerable to bias than studies using individual-level data.
To illustrate how ecological studies and their results can be biased, we present an ecological study of the relationship between COVID-19 outcomes in European countries and the distance to Brussels, where the European Union (EU) headquarters are located.

Methods
We obtained daily numbers of COVID-19 cases and deaths for European countries from the European Centre for Disease Prevention and Control (https://www.ecdc.europa.eu/en/publications-data/downloadtodays-data-geographic-distribution-covid-19-cases-worldwide). To limit the influence of small case counts on the results, we included only countries with at least 2,000 cumulative cases (up to and including 22 April 2020). A total of 26 countries were included (Figure 1). We studied three COVID-19 outcomes: the number of confirmed cases per million people, the number of confirmed deaths per million people, and the case fatality rate, defined as the proportion of deaths among confirmed cases. For each country, the distance to Brussels was defined as the direct distance (as the crow flies) from its capital to Brussels, and was measured using an online tool (https://www.freemaptools.com/how-far-is-it-between.htm).
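The outcome definitions and the straight-line distance described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study measured distances with an online tool, whereas the sketch uses the haversine formula with a mean Earth radius of 6371 km; the function names, the capital coordinates, and the case counts are hypothetical and do not come from the study data.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def outcomes(cases, deaths, population):
    """Return (cases per million, deaths per million, case fatality rate)."""
    cases_per_million = cases / population * 1_000_000
    deaths_per_million = deaths / population * 1_000_000
    case_fatality_rate = deaths / cases  # proportion of deaths among confirmed cases
    return cases_per_million, deaths_per_million, case_fatality_rate


# Approximate capital coordinates (assumed values, not taken from the study).
BRUSSELS = (50.8503, 4.3517)
PARIS = (48.8566, 2.3522)
print(round(great_circle_km(*BRUSSELS, *PARIS)), "km between Paris and Brussels")

# Made-up counts for a hypothetical country, for illustration only.
print(outcomes(cases=10_000, deaths=500, population=5_000_000))
```

Any other great-circle method would serve equally well here; small differences in the distance measure are unlikely to matter at the scale of hundreds of kilometres.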
We investigated the correlations between the distance and each of the three outcomes (log-transformed) using the Pearson correlation coefficient. We estimated the 'effect' of the distance on COVID-19 outcomes using a log-linear regression model.
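A minimal sketch of this analysis follows. It assumes that the log-linear model regresses the natural logarithm of each outcome on distance in kilometres by ordinary least squares, which the text does not spell out. Under that assumption a slope b per km translates into a percentage change of (exp(100b) - 1) x 100 per 100 km. The data values and function names are invented for illustration, and significance testing is omitted.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_log_linear(distances_km, outcome_values):
    """Least-squares fit of log(outcome) = intercept + slope * distance."""
    log_y = [math.log(y) for y in outcome_values]
    mx, my = mean(distances_km), mean(log_y)
    sxy = sum((x - mx) * (y - my) for x, y in zip(distances_km, log_y))
    sxx = sum((x - mx) ** 2 for x in distances_km)
    slope = sxy / sxx
    return my - slope * mx, slope


def percent_change_per_100km(slope_per_km):
    """Convert a log-scale slope per km into a percentage change per 100 km."""
    return (math.exp(100 * slope_per_km) - 1) * 100


# Made-up values for six hypothetical countries (not the study data).
distances_km = [200, 400, 800, 1200, 1600, 2100]
cases_per_million = [3200, 2500, 1900, 900, 700, 300]

r = pearson(distances_km, [math.log(y) for y in cases_per_million])
intercept, slope = fit_log_linear(distances_km, cases_per_million)
print(f"r = {r:.2f}, change per 100 km = {percent_change_per_100km(slope):.1f}%")
```

Read this way, a 6% reduction per 100 km would correspond to a slope of ln(0.94)/100 per km on the log scale.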

Results
All three COVID-19 outcomes were negatively correlated with the distance to Brussels (Figure 1); the correlation coefficients ranged from -0.5 to -0.7, and all P-values were less than 0.01.

Discussion
In our ecological study, the distance of European countries to Brussels was negatively correlated with their COVID-19 outcomes, and all the correlations were statistically significant. Interpreted without caution, such results could be taken to mean that the closer a country is to the EU headquarters, the higher its risk of COVID-19, a conclusion with limited plausibility.
Ecological bias, or ecological fallacy, is the major limitation of ecological studies in making causal inferences. This bias is usually interpreted as the failure of the observed association at the group level to reflect the biological effect at the individual level [5]. A strong individual-level effect of an exposure on an outcome can produce an effect at the group level; for example, a population with more smokers tends to have more lung cancers than a comparable population with fewer smokers. The reverse, however, does not always hold.
One source of ecological bias is that group-level measures do not necessarily reflect measures at the individual level, because the latter are not measured; after all, health conditions occur at the individual level.
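The gap between group-level and individual-level associations can be made concrete with a toy dataset. In the sketch below, which uses invented numbers and hypothetical group labels rather than anything from the study, exposure and outcome rise together within every group (correlation +1), yet the group averages move in opposite directions (correlation -1). An analysis that sees only the group averages would infer the wrong sign.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical individuals as (exposure, outcome) pairs in three groups.
groups = {
    "A": [(1, 11), (2, 12), (3, 13)],
    "B": [(4, 6), (5, 7), (6, 8)],
    "C": [(7, 1), (8, 2), (9, 3)],
}


def within_group_correlations(groups):
    """Individual-level correlation computed separately inside each group."""
    return {
        name: pearson([x for x, _ in people], [y for _, y in people])
        for name, people in groups.items()
    }


def between_group_correlation(groups):
    """Group-level (ecological) correlation computed from group means only."""
    mean_x = [mean([x for x, _ in people]) for people in groups.values()]
    mean_y = [mean([y for _, y in people]) for people in groups.values()]
    return pearson(mean_x, mean_y)


print("within groups:", within_group_correlations(groups))
print("between group means:", between_group_correlation(groups))
```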
Even when the purpose of an ecological study is to make inferences at the group level rather than at the individual level, caution is still needed. Confounding between groups and effect modification by group can also bias ecological associations. In our example, there are considerable differences between European countries in the timing of the outbreak, the response to it, testing capacity, healthcare systems, population structure, etc., that could confound the association at the country level. Our observed ecological association is likely to reflect the impact of these differences on COVID-19 outcomes in the studied countries, rather than the 'effect' of the distance to Brussels itself.
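A small simulation can show how such confounding produces a distance 'effect' out of nothing. The sketch below is purely hypothetical: it generates regions in which the outcome depends only on how long the outbreak has been running, while distance merely happens to track that timing. All parameter values and names are assumptions chosen for illustration. The crude correlation between distance and the log outcome is strongly negative, but it largely disappears once timing is regressed out of both variables.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def residuals(ys, xs):
    """Residuals from a least-squares regression of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


def simulate(n, seed=0):
    """Simulate n hypothetical regions; distance has no effect on the outcome."""
    rng = random.Random(seed)
    days, distance, log_outcome = [], [], []
    for _ in range(n):
        t = rng.uniform(0, 40)                  # days since local outbreak onset
        d = 2000 - 30 * t + rng.gauss(0, 200)   # distance tracks timing only
        y = 2 + 0.1 * t + rng.gauss(0, 0.5)     # log outcome depends on timing only
        days.append(t)
        distance.append(d)
        log_outcome.append(y)
    return days, distance, log_outcome


days, distance, log_outcome = simulate(2000)
crude = pearson(distance, log_outcome)
adjusted = pearson(residuals(distance, days), residuals(log_outcome, days))
print(f"crude r = {crude:.2f}, adjusted for timing r = {adjusted:.2f}")
```

In real country-level data the relevant confounders are rarely all measured, so an adjustment like this one is not available in practice; the simulation only shows the mechanism.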
Nevertheless, ecological studies do have some value, especially in generating hypotheses. Some discoveries about the causes of cancer can be attributed to hypotheses generated by international comparisons of cancer incidence [6]. The current ecological studies of COVID-19 might inspire further, more in-depth studies that provide evidence to support, or falsify, the relevant hypotheses.
Understanding the differences in health conditions between populations is the driving force behind the development of epidemiology; however, modern epidemiology tends to focus more on the individual level.
Comparing group-level data could bring the public-health orientation back to epidemiology, but simply conducting ecological studies would be of little help [5,6].
Given the public concern over COVID-19, any 'novel' or 'striking' finding from ecological studies has the potential to become an eye-catching headline, attract considerable media coverage, and be easily misinterpreted. Researchers should exercise caution when conducting ecological studies and discussing their results.