Is the spread of COVID-19 across countries influenced by environmental, economic and social factors?

The SARS-CoV-2 virus, which emerged from Wuhan, China, is spreading all over the world in an unprecedented manner, causing millions of infections and thousands of deaths. However, the spread of the disease across countries and regions is not even. Why are some countries and regions more affected than others? We employ simple statistical methods to investigate any linkage between the severity of the disease and the environmental, economic and social factors of countries. The estimation results indicate that the number of confirmed cases of Coronavirus infection is higher in countries with lower yearly average temperatures, higher economic openness, and stronger political democracy. However, the findings of this analysis should be interpreted carefully, keeping in mind that statistical relations do not necessarily imply causation. Only clinical experiments with medical expertise can confirm how the virus behaves in the environment.


Introduction
The ongoing global Coronavirus pandemic started all of a sudden and has posed a serious threat to normal human life. The novel virus, later named SARS-CoV-2, belongs to the highly pathogenic Coronavirus family and causes a severe acute respiratory disease, COVID-19, which in some cases leads to death [1].
The speed of human-to-human transmission of the pathogen is unprecedented. It has spread to most countries of the world within 4 months of its first known transmission to humans, and the numbers of infected people and of deaths caused by the virus are still increasing every day. The first human case of the present outbreak of Coronavirus was identified in Wuhan, the capital of Hubei province of China, in December 2019 [1,2]. Though the virus is spreading to many countries, the speed and severity of the spread vary from country to country. Europe is more badly affected than Asia, Africa and South America. Though South Korea, Thailand, Japan and Singapore were affected immediately after China, widespread infections were observed in Italy and Spain before those countries. Despite having political and economic connections and national borders with China, Myanmar and North Korea report very few cases of infection. Other European countries and the USA were also hit within a very short period.
Nevertheless, other Asian, African and South American countries remained less affected as of the beginning of April 2020. The uneven transmission across countries raises the question: what factors influence the spread of the pathogen? Is the virus attacking countries with lower temperature, or higher precipitation, or higher openness?
medRxiv preprint (which was not peer-reviewed), doi: https://doi.org/10.1101/2020.04.08.20058164. The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Biological explanations can answer many questions related to the transmission mechanism of a pathogen. Apart from biological properties, it is suspected that a few environmental, social and economic elements can act as catalysts for the spread of the novel Coronavirus. First, connection with China is likely to have been an important factor in the rapid transmission of the disease at the initial stage of the pandemic. As the virus continues to spread to many countries, any connection with badly affected countries may be equally responsible for transmission. In other words, the level of international connection of a country probably plays a vital role in country-to-country transmission of the virus. International trade, or economic openness, can be treated as a good measure of a country's international connection. Thus, economic openness is likely to play an important role in the spread of the disease. If a country is more open, the virus will find it easier to cross national borders through human-to-human transmission. Once the virus is in, the rate of spread within a country will depend on several factors, some of which are difficult to identify, though some can be pointed out by educated guess. Population density, level of urbanization, social cohesiveness and weather conditions can be identified as primary influences on human-to-human transmission of a pathogen. The viability of infectious viruses has been found to be markedly affected by environmental factors such as temperature and humidity [3].
Social, economic and environmental changes around the world are also believed to increase the occurrence of infectious diseases, especially in developing countries [4]. In the case of the SARS-CoV-2 pandemic, many countries responded immediately and are imposing partial or full lockdowns. Measures taken by governments and the response of citizens at the initial stage of the outbreak in any locality are likely to have a significant impact on the spread [5]. Furthermore, published numbers of confirmed cases are not completely reliable for several reasons. There is some evidence that fear of economic penalties sometimes leads authorities to under-report epidemics [4]. Autocratic rulers are reluctant to reveal the exact or approximate number of cases, as it may adversely affect their rule. Moreover, social cohesiveness is likely to be higher in free societies, leading to the possibility of more infections from a highly pathogenic disease. Though democratization does not necessarily mean social cohesiveness or integrity, it can be used as a proxy to assess how much the citizens of a society are involved in social activities. If social activity is higher, the chance of infection is higher. Thus, the level of democratization is a possible driver of the number of cases of infection.
The underlying factors influencing the present spread of SARS-CoV-2 across countries are not yet well known. Researchers around the world are working to better understand the virus, which may save lives.
The present study is motivated by the aim of finding any possible linkage between the uneven worldwide spread of COVID-19 and the environmental, social and economic characteristics of countries by applying statistical methods. Though a statistical relationship does not mean causation, it is at least indicative of a true relationship. We use the least squares method to investigate whether the transmission of COVID-19 across countries is influenced by a few environmental, social and economic variables.

Trends of COVID-19 Spread
On 12 January 2020 the World Health Organization (WHO) announced that a novel Coronavirus had been infecting clusters of people in Wuhan, China since December 2019.
Fluctuations in the trends of daily new cases are probably strongly influenced by government interventions. China reached its peak on 13 February 2020 with 15141 ...
The biological and transmission mechanisms of SARS-CoV-2 are not yet well known [8] and are under extensive investigation by researchers throughout the world. A very recent study claims that high temperature and high relative humidity significantly reduce the reproductive number, R0, of the Coronavirus and consequently lessen the transmission of the disease [9]. Human-to-human transmission of SARS Coronavirus is likely to be ...
According to Guo et al (2020), human-to-human transmission of SARS-CoV-2 occurs mainly through close contact with COVID-19 patients or incubation carriers. In a groundbreaking work published on 30 January 2020, Riou and Althaus (2020) anticipated that the spread of the novel virus would depend on two key properties.
First, the reproduction number, R0 (the average number of secondary cases generated by an infectious index case), whose value is estimated at 2.2, greater than the threshold level of 1. Second, the dispersion parameter of secondary cases, k, whose value governs super-spreading: the lower the value of k, the more the epidemic is driven by a few super-spreading events. A high value of k would lead to steady growth of the epidemic.
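The roles of R0 and k can be illustrated with a small branching-process simulation. This is only a sketch under stated assumptions: secondary cases are drawn from a negative binomial distribution with mean R0 and dispersion k (generated here as a gamma-Poisson mixture), and the function names, the number of runs and the generation limit are hypothetical choices made for illustration, not taken from Riou and Althaus (2020).

```python
import math
import random

def sample_poisson(rate, rng):
    """Draw one Poisson(rate) count using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def sample_secondary_cases(r0, k, rng):
    """Negative binomial draw with mean r0 and dispersion k (gamma-Poisson mixture)."""
    # Each case gets its own infectiousness: gamma with shape k and scale r0/k (mean r0).
    individual_rate = rng.gammavariate(k, r0 / k)
    return sample_poisson(individual_rate, rng)

def simulate_outbreak(r0, k, generations, rng, cap=5000):
    """Return the number of cases per generation, starting from one index case."""
    sizes = [1]
    for _ in range(generations):
        new_cases = sum(sample_secondary_cases(r0, k, rng) for _ in range(sizes[-1]))
        sizes.append(min(new_cases, cap))  # cap only bounds the run time
        if new_cases == 0:
            break  # the transmission chain has died out
    return sizes

def extinction_share(r0, k, runs=500, generations=6, seed=1):
    """Share of simulated outbreaks that die out within the given generations."""
    rng = random.Random(seed)
    died_out = sum(
        simulate_outbreak(r0, k, generations, rng)[-1] == 0 for _ in range(runs)
    )
    return died_out / runs

if __name__ == "__main__":
    for k in (0.1, 10.0):
        print(f"R0=2.2, k={k}: share dying out = {extinction_share(2.2, k):.2f}")
```

In this sketch, a small k concentrates transmission in a few individuals, so many simulated chains die out early while a few grow explosively; a large k gives most cases a similar number of secondary cases and a steadier growth path.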
Though the outbreak appears to have started from a single or multiple zoonotic transmissions in Wuhan [10], the spread outside China is clearly due to human-to-human transmission. Human-to-human transmission may be direct or indirect. Direct human-to-human transmission occurs through close contact with a patient. However, the virus can survive in the environment for some time and can infect a susceptible person who comes into contact with it. This indirect human-to-human transmission is likely to depend on many factors, including pathogenic characteristics and environmental factors.
This study aims to explore the effects of a few environmental, social and economic characteristics of countries on the country-wise number of infection cases per one million people. We hypothesize that the number of infection cases per one million people in a country depends on the yearly average temperature, yearly average precipitation, economic openness, level of democracy, and population density of the country.
How long the virus can stay alive in the environment without any receptor will depend ...

Modeling Approach
The transmission mechanism of a pathogen is mainly characterized by biological properties. The total number of cases of an infectious disease within a country at a point in time is determined by many biological and non-biological factors. Thus, we can presume that the total number of confirmed cases across countries is generated by a stochastic process. However, the number observed at a point in time is basically a realization of the process through which the data are generated [11]. The realized value is a random selection from a range of possible values that could be generated by the interaction of all factors. Though it is not easy to reveal the true characteristics of the underlying data generation process, the realized or observed values can be used to draw inferences about the process. For a stochastic process, an important determinant of the number of cases on a day is its own past values, implying that $Y_t$, the number of cases on a day, depends on the past values $Y_{t-1}$, $Y_{t-2}$, $Y_{t-3}$, etc. Hence, an autoregressive model may be suitable to approximate the data generation process of the cases of infection on a day. Nevertheless, our aim in this study is not to find the data generation process of the total number of cases of a country on a specific day. Rather, our objective is to examine the data for any possible linkage between the country-level severity of infection and a few selected environmental, economic and social variables. For this purpose, we first simply regress the total number of cases of infection per one million people by country, as reported on a recent day (03 April 2020), on our selected explanatory variables. Specifically, we estimate the following regression:
$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5 + \varepsilon \qquad (1)$$
where $Y$ is the total number of confirmed cases of infection per one million people in a country on a day (03 April 2020), $X_1$ is the yearly average temperature of countries, $X_2$ is the yearly average precipitation of countries, $X_3$ is openness measured by international trade as a percentage of GDP of countries, $X_4$ is the democracy index of countries, $X_5$ is the population density of countries in 2018, and $\varepsilon$ is the error term.
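A least squares fit of a regression with an intercept and five explanatory variables can be sketched as follows. This is a minimal illustration, not the authors' code: the data are synthetic, the coefficient values are made up so that the fit can be checked, the units of the five columns are arbitrary, and the helper names `solve_linear_system` and `fit_ols` are hypothetical.

```python
import random

def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - known) / aug[i][i]
    return solution

def fit_ols(y, x_rows):
    """Return [alpha, beta_1, ..., beta_k] minimising the sum of squared residuals."""
    design = [[1.0] + list(row) for row in x_rows]  # prepend the intercept column
    k = len(design[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Synthetic "countries": one row of five explanatory variables each (made-up values).
rng = random.Random(0)
x_rows = [
    [
        rng.uniform(-5, 30),     # X1: yearly average temperature
        rng.uniform(0.1, 3.0),   # X2: yearly average precipitation
        rng.uniform(20, 200),    # X3: trade as a percentage of GDP
        rng.uniform(1, 10),      # X4: democracy index
        rng.uniform(0.002, 1.5), # X5: population density
    ]
    for _ in range(60)
]

# Made-up coefficients (alpha, beta_1..beta_5); noise-free so the fit is exact.
true_coefficients = [100.0, -3.0, 0.0, 0.5, 12.0, 0.0]
y = [
    true_coefficients[0] + sum(b * x for b, x in zip(true_coefficients[1:], row))
    for row in x_rows
]

estimates = fit_ols(y, x_rows)

if __name__ == "__main__":
    for name, value in zip(["alpha", "b1", "b2", "b3", "b4", "b5"], estimates):
        print(f"{name}: {value:.4f}")
```

With real data the outcome would contain noise, and the standard errors and t-ratios of the coefficients would be needed to judge significance; a statistics package would normally be used for that step.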
We also presume that the total number of cases of infection in a country on a specific day depends, apart from environmental, social and economic conditions, on the previous ...
where $\hat{e}$ is the residual series. Following the properties of the Classical Linear Regression Model, we can deduce that $\hat{e}$ from the estimated model (3) is uncorrelated with the X's and can be used as a proxy variable for the lagged number of cases in model (2). Consequently, model (2) is converted to the following estimable one:
$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5 + \gamma \hat{e} + \varepsilon \qquad (5)$$
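The two-step idea, first obtaining residuals from a regression of lagged cases on the X's (model (3)) and then including those residuals as an extra regressor (model (5)), can be sketched as below. This is an illustrative reading of the procedure under assumptions: the data are synthetic, only three explanatory variables are used, and the names `fit_ols`, `residuals`, `y_lagged` and `y_today` are hypothetical.

```python
import random

def fit_ols(y, x_rows):
    """Least squares fit with an intercept; returns [alpha, beta_1, ..., beta_k]."""
    design = [[1.0] + list(row) for row in x_rows]
    k = len(design[0])
    # Augmented normal equations [X'X | X'y], solved by Gaussian elimination.
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(design, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    coef = [0.0] * k
    for i in range(k - 1, -1, -1):
        known = sum(aug[i][j] * coef[j] for j in range(i + 1, k))
        coef[i] = (aug[i][k] - known) / aug[i][i]
    return coef

def residuals(y, x_rows, coef):
    """Observed values minus fitted values for a model estimated by fit_ols."""
    return [
        yi - (coef[0] + sum(b * x for b, x in zip(coef[1:], row)))
        for yi, row in zip(y, x_rows)
    ]

# Synthetic data: temperature, openness, democracy for 80 hypothetical countries.
rng = random.Random(42)
x_rows = [[rng.uniform(-5, 30), rng.uniform(20, 200), rng.uniform(1, 10)] for _ in range(80)]
y_lagged = [60 - 2.0 * t + 0.3 * o + 8.0 * d + rng.gauss(0, 20) for t, o, d in x_rows]
y_today = [1.3 * yl + rng.gauss(0, 10) for yl in y_lagged]

# Step 1: regress lagged cases on the X's and keep the residuals.
step1 = fit_ols(y_lagged, x_rows)
e_hat = residuals(y_lagged, x_rows, step1)

# Step 2: regress today's cases on the X's plus the step-1 residuals.
x_with_resid = [row + [e] for row, e in zip(x_rows, e_hat)]
model5 = fit_ols(y_today, x_with_resid)

# For comparison: regress today's cases on the X's plus lagged cases directly.
x_with_lag = [row + [yl] for row, yl in zip(x_rows, y_lagged)]
direct = fit_ols(y_today, x_with_lag)

if __name__ == "__main__":
    print("coefficient on residual:", round(model5[-1], 6))
    print("coefficient on lagged cases:", round(direct[-1], 6))
```

Because the lagged series equals its fitted values plus the residuals, the coefficient on the residuals in the second step matches the coefficient on lagged cases in the direct regression, while the residuals are, by construction, uncorrelated with the X's.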

Estimation Results and Discussion
We apply the least squares method to model (1) and find that precipitation and population density have no significant effect on the number of infection cases per one million people (Y). Those variables are then excluded, and the model is re-estimated. In the re-estimated model, average temperature, openness and democracy appear highly significant. We experimented with lags 1 to 9 to estimate model (5). In each case we first estimate model (3) and obtain the residuals, then estimate model (5) using those residuals. As the number of infections is still increasing day by day in almost all countries, the estimated model fits better with lower lags. We drop density and precipitation from these experiments as they frequently appear insignificant. In all estimated models, yearly average temperature, openness and democracy appear highly significant.
Estimation results of three regression models with no lag, lag 1 and lag 9 are presented in Table-.

The positive sign of openness suggests that the more a country is involved in international trade, i.e. exports and imports, the more vulnerable it is to Coronavirus infection. This finding is plausible, as more international trade brings more interaction with people from overseas and paves the way for viral infection.
In the same manner, the positive sign of the democracy index indicates that more democratic countries are more affected by the disease. This may be because the democracy index reflects social cohesiveness or social integration.
People are likely to be more involved in social activities and festivals in a more democratic country than in a country ruled by an autocrat. Moreover, autocratic rulers are generally reluctant to publish actual fatality figures and try to understate them in any crisis.
The estimated models are not free from misspecification. As the diagnostic tests indicate, the estimated models suffer from heteroscedasticity and violate the normality assumption. However, the models are free from autocorrelation (except model 1), as indicated by the Breusch-Godfrey LM test statistic, and from high multicollinearity, as indicated by the high adjusted R-squared ($\bar{R}^2$) together with significant t-ratios for all explanatory variables. Consequently, the estimated coefficients are unbiased, though they do not possess minimum variance.
Despite some deficiencies, the estimated models suggest that COVID-19 is likely to spread more in countries with lower yearly average temperature, higher international trade and a higher level of democracy.
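The statement that heteroscedasticity leaves least squares coefficients unbiased but not minimum-variance can be sketched in matrix notation. The symbols below ($X$ for the matrix of regressors, $\beta$ for the coefficient vector, $\varepsilon$ for the error vector and $\Omega$ for its covariance matrix) are introduced here for illustration only, and the sketch assumes $E[\varepsilon \mid X] = 0$.

```latex
\begin{aligned}
\hat{\beta} &= (X'X)^{-1}X'Y = \beta + (X'X)^{-1}X'\varepsilon, \\
E[\hat{\beta} \mid X] &= \beta \quad \text{if } E[\varepsilon \mid X] = 0, \\
\operatorname{Var}(\hat{\beta} \mid X) &= (X'X)^{-1} X' \Omega X (X'X)^{-1},
\qquad \Omega = \operatorname{Var}(\varepsilon \mid X).
\end{aligned}
```

When $\Omega = \sigma^2 I$ the last line collapses to the familiar $\sigma^2 (X'X)^{-1}$. Under heteroscedasticity it does not, so the conventional standard errors and t-ratios can be misleading, which is why heteroscedasticity-robust standard errors are commonly reported in such cases.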

Conclusion
In this study we employ the least squares method and provide statistical evidence that the yearly average temperature, economic openness and level of political democracy of countries have a significant effect on the spread of SARS-CoV-2 across countries. Analysis of data on 163 infected countries, as of 03 April 2020, reveals that countries with higher average temperature, lower international trade and weaker democracy