Skip to main content
medRxiv
  • Home
  • About
  • Submit
  • ALERTS / RSS
Advanced Search

The Effect of GDP and Distance on Timing of COVID-19 Spread in Chinese Provinces in 2020

Alice Kuan, Mingxin Chen, David Bishai
doi: https://doi.org/10.1101/2020.07.19.20157354
Alice Kuan
1Johns Hopkins University
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Mingxin Chen
2Johns Hopkins Bloomberg School of Public Health
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
David Bishai
2Johns Hopkins Bloomberg School of Public Health
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • For correspondence: dbishai1@jhu.edu
  • Abstract
  • Full Text
  • Info/History
  • Metrics
  • Data/Code
  • Preview PDF
Loading

Abstract

The geographical spread of COVID-19 across China’s provinces provides the opportunity for retrospective analysis on contributors to the timing of the spread. Highly contagious diseases need to be seeded into populations and we hypothesized that greater distance from the epicenter in Wuhan, as well as higher province-level GDP per capita, would delay the time until a province detected COVID-19 cases. To test this hypothesis, we used province-level socioeconomic data such as GDP per capita and percentage of the population aged over 65, distance from the Wuhan epicenter, and health systems capacity in a Cox proportional hazards analysis of the determinants of each province’s time until epidemic start. The start was defined by the number of days it took for each province to reach thresholds of 3, 5, 10, or 100 cases. We controlled for the number of hospital beds and physicians as these could influence the speed of case detection. Surprisingly, none of the explanatory variables had a statistically significant effect on the time it took for each province to get its first cases; the timing of COVID-19 spread appears to have been random with respect to distance, GDP, demography, and the strength of the health system. Looking to other factors, such as travel, policy, and lockdown measures, could provide additional insights on realizing most critical factors in the timing of spread.

Introduction

The global COVID-19 pandemic has transformed the world economically and socially. The original epicenter in Wuhan, China slowly lifted lockdown regulations and allowed its citizens to return to normal life in March. Since it was the first country to face the COVID-19 pandemic, China may hold clues to social and geographical factors that affected the spatial spread of the epidemic. This will have implications for other countries now concerned with limiting cross-border introductions of new cases. Recent literature on the spread of the pandemic has highlighted the crucial role of the capacity of health systems – specifically, a sufficient health labor force and adequate volume of intensive care unit (ICU) hospital beds. Italy was one of the earlier well-known examples where severe health worker and ICU bed shortages led to a spike in cases and deaths (Armocida et al., 2020). In Spain, underinvestment leading to insufficient numbers of health professionals and ICU beds contributed to the country’s struggled COVID19 response (Legido-Quigley, Mateos-García, et al., 2020). Lack of bed capacity was also cited as important lessons from Singapore, Hong Kong, and Japan (Legido-Quigley, Asgari, et al., 2020). In addition, past research on other infectious disease outbreaks have observed significant relationships between population density and disease incidence, as was the case of hand, foot, and mouth disease (HFMD) in China (Hu et al., 2012).

Other articles have also examined the interaction between macroeconomic factors and the spread of infectious diseases. The prevalence of tuberculosis in EU member states in 2010 had an inverse relationship with their GDP levels, accounting for wealth distribution within each country (Semenza et al., 2010). More generally, macroeconomic factors such as national distribution of wealth or GDP per capita, affects the quality of population health (Subramanian et al., 2002). Poor macro socio-economic conditions provide more opportunities and environments in which infectious diseases spread. Economic growth can also alter the prevalence of infectious agents and lead to aging populations, which heightens vulnerability to disease (Ceddia et al., 2013).

Literature about COVID-19 spread in China has found significant correlations between transportation links between Wuhan and other cities and COVID-19 caseloads. One study applied Pearson’s correlation analysis and found daily frequencies of flights, bus trips and train trips was correlated to cumulative COVID-19 cases (Zheng et al., 2020). Another article found an inverse square relationship between other regions distance from Wuhan and their cumulative number of cases, with a correlation coefficient of 0.267 for places in China and -0.197 for the rest of the world (Biswas & Sen, 2020). International studies show that proximity to China seems to have affected the number of days from the first reported cases in China until COVID-19 arrives in another country (Adegboye et al., 2020).

These articles provide helpful insights into associations between several factors and the volume of cases. They also reveal, however, additional opportunities to control for other variables – including macroeconomics or health systems capacity as highlighted above. The volume of health workers or the number of elders within a population could influence the speed of spread, thus serving as confounders. In addition, past studies did not address the timing of each location’s cases. We aim to include these considerations in our paper and re-examine the question of geospatial spread of COVID-19 across provincial borders in China. We apply event history analysis to determine the effects of province-level GDP per capita and distance from Wuhan on the time it took for each Chinese province to reach its first wave of cases, measured as elapsed number of days between January 22, 2020 and the occurrence of various thresholds of 3, 5, 10, or 100 cumulative cases. We control for health systems capacity and population factors, including the number of doctors and hospital beds per 1000 people in each province, proportion of the population aged over 65, and other variables that are delineated below.

Methods

The econometric model included 10 explanatory variables described in Table 1. Data on the number of cases by province were drawn on May 26, 2020 from a GitHub repository managed by the Johns Hopkins Center for Systems Science and Engineering. Data on provincial economic and demographic variables are in the public domain.

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table 1: Variable Descriptions

Our analysis used the Cox proportional hazards model shown below: Embedded Image where h(t) is the hazard of reaching a COVID-19 threshold between time 0 (January 22) and time t, and h0 is a common baseline hazard which is modified by the covariates listed below.

Table 3 in the Appendix provides a correlation matrix that gives more insight into the general relationship between each independent variable at the province level. The matrix shows that higher GDP per capita is associated with a larger proportion of elders, consistent with literature discussing how economic growth could lead to aging populations. Higher GDP is also associated with more healthcare workers, while it correlates with less OOP and THE. This could be because more wealth per capita leads to a healthier population, lessening the need for health spending. Furthermore, a wealthier region could have more economic opportunity, such as in education which allows people to become doctors or nurses.

View this table:
  • View inline
  • View popup
Table 2: Results of Cox Regressions using Different Thresholds as Dependent Variables
View this table:
  • View inline
  • View popup
  • Download powerpoint
Table 3: Correlation Matrix of Independent Variables

We used Stata IC16 to estimate the proportional hazards model. For sensitivity analysis we tested four alternative definitions of the failure time. In each case, we started observations on January 22 and declared failure (i.e. epidemic onset) when a province reached its first 3, 5, 10, or 100 cases. Each independent variable coefficient indicates the effect of the variable on the amount of time a province takes to reach a threshold. Several provinces reported more than 3 or 5 cases on January 22 and could not contribute meaningful observations to the event history analysis with low thresholds. Therefore, repeating the model with higher thresholds such as 100 cases ensured that the results would account for a wider number of provinces.

Results

From Table 2, we see that the coefficients on provincial GDP and distance have no statistically significant effect on the number of days it takes for a province to reach any of the thresholds. This disproves our initial hypotheses that distance and macroeconomic factors would alter the speed of COVID-19 spread. We find that similarly and remarkably, none of the other variables significantly affect the time until the pandemic starts in a province. The F-statistic shows that the joint significance of all explanatory variables together is compatible with the null hypothesis that nothing explains the time to failure.

Figures 1a-d provide Kaplan-Meier curves displaying the probability of survival – that is, cases staying below the threshold – accounting for all provinces over time. Each step represents a failure event, or at least one province reaching or surpassing the threshold. The greater the number of days from January 22, the greater the number of cases and therefore the lower survival probability. We see from Figures 1a-b that the probability decreases rapidly for thresholds of 3 and 5 cases, likely because they are very low minimums to surpass. On the other hand, the probability decreases at a slower pace for thresholds of 10 and 100, particularly for the latter because it took longer for some provinces than others to surpass 100 cases. Figure 1d reflects this by showing more variation across provinces in the timing of events.

Figure 1:
  • Download figure
  • Open in new tab
Figure 1: Kaplan-Meier Curves of Time to Reach Four Different Thresholds

Discussion

Our results are inconsistent with recent literature analyzing the effects of distance on the severity of the COVID-19 epidemic, as well as with older studies showing that macroeconomic and socioeconomic variables play a role in disease spread. We intend to further research in these areas and provide suggestions for future study.

There are a few reasons that our results could differ. Unlike recent studies, our model controls for population variables and socioeconomic factors when assessing distance from Wuhan, as we include GDP, population, and health worker data in our model. In addition, our methodology takes advantage of provincial differences in the timing of each province-level epidemic, rather than examining the magnitude of the count of cases on any given date. This allows our research to bear a unique dependent variable, providing analyses specific to the context of the number of days until epidemic start which, to our knowledge, was not the primary focus of other studies.

Another reason could be that the Chinese New Year festival diaspora on January 31, 2020 was able to seed every province with a sufficient number of infected travelers at approximately the same time. An estimated 5 million people left Wuhan before the start of the travel ban on January 23, and a third of these travelers moved to locations outside Hubei province (Chen et al., 2020). Our results would suggest that these roughly 1.6 million outbound travelers included enough latently infected individuals to seed other provinces with COVID-19. For this scenario to be compatible with our findings, distance from Wuhan would not serve as a factor that would delay the arrival of the New Year’s celebrants, and neither GDP nor health systems capacity had bearing on the containment of the illness. Thus, the provinces appear to have sufficiently equitable economic states so that provincial GDP is an immaterial factor in the timing and initial spread of the epidemic. This could be a finding unique to the country, but there are other considerations as well, such as the timing of the lockdown measures and subsequent changes in crossprovince travel that could serve as other research areas. Future studies on in-country travel data and lockdown policies, while controlling for the variables presented in this paper and other relevant socioeconomic factors, may provide a new angle of observing the COVID-19 spread within different countries.

Data Availability

The data on cumulative COVID-19 cases in Chinese provinces is publicly available from the JHU CSSE Github repository. Data on province-level variables is available from Beijing University.

https://github.com/CSSEGISandData/COVID-19

Acknowledgements

We would like to acknowledge help from PeiPei Chang at Beijing University for gathering province level data, and Katelyn Jisoo Yoon from the World Bank for helpful comments.

Footnotes

  • Methods section updated to clarify model.

  • 1 Number of Provinces indicates how many provinces were included in the analysis for each threshold; in other words, how many provinces had not yet reached the threshold on Jan 22 nd, 2020. There are 31 provinces in total. The number of provinces that had already reached or surpassed the threshold by that day are thus 22, 17, 5, and 1 respectively in order of the columns from left to right.

References

  1. ↵
    Adegboye, O., Adekunle, A., Pak, A., Gayawan, E., Leung, D., Rojas, D., Elfaki, F., McBryde, E., & Eisen, D. (2020). Change in outbreak epicenter and its impact on the importation risks of COVID-19 progression: a modelling study. MedRxiv, 2020.03.17.20036681. https://doi.org/10.1101/2020.03.17.20036681
  2. ↵
    Armocida, B., Formenti, B., Ussai, S., Palestra, F., & Missoni, E. (2020). The Italian health system and the COVID-19 challenge. The Lancet Public Health, 5(5), e253. https://doi.org/10.1016/S24682667(20)30074-8
    OpenUrlCrossRef
  3. ↵
    Biswas, K., & Sen, P. (2020). Space-time dependence of corona virus (COVID-19) outbreak. 1, 5–7. http://arxiv.org/abs/2003.03149
    OpenUrl
  4. ↵
    Ceddia, M. G., Bardsley, N. O., Goodwin, R., Holloway, G. J., Nocella, G., & Stasi, A. (2013). A complex system perspective on the emergence and spread of infectious diseases: Integrating economic and ecological aspects. Ecological Economics, 90, 124–131. https://doi.org/10.1016/j.ecolecon.2013.03.013
    OpenUrl
  5. ↵
    Chen, S., Yang, J., Yang, W., Wang, C., & Bärnighausen, T. (2020). COVID-19 control in China during mass population movements at New Year. In The Lancet (Vol. 395, Issue 10226, pp. 764–766). Lancet Publishing Group. https://doi.org/10.1016/S0140-6736(20)30421-9
    OpenUrl
  6. ↵
    Hu, M., Li, Z., Wang, J., Jia, L., Liao, Y., Lai, S., Guo, Y., Zhao, D., & Yang, W. (2012). Determinants of the incidence of hand, foot and mouth disease in China using geographically weighted regression models. PLoS ONE, 7(6). https://doi.org/10.1371/journal.pone.0038978
  7. ↵
    Legido-Quigley, H., Asgari, N., Teo, Y. Y., Leung, G. M., Oshitani, H., Fukuda, K., Cook, A. R., Hsu, L. Y., Shibuya, K., & Heymann, D. (2020). Are high-performing health systems resilient against the COVID-19 epidemic? The Lancet, 395(10227), 848–850. https://doi.org/10.1016/S01406736(20)30551-1
    OpenUrl
  8. Legido-Quigley, H., Mateos-García, J. T., Campos, V. R., Gea-Sánchez, M., Muntaner, C., & McKee, M. (2020). The resilience of the Spanish health system against the COVID-19 pandemic. The Lancet Public Health, 5(5), e251–e252. https://doi.org/10.1016/S2468-2667(20)30060-8
    OpenUrl
  9. ↵
    Semenza, J. C., Suk, J. E., & Tsolova, S. (2010). Social determinants of infectious diseases: a public health priority. Euro Surveillance?: Bulletin Européen Sur Les Maladies Transmissibles = European Communicable Disease Bulletin, 15(27), 2–4. https://doi.org/10.2807/ese.15.27.19608-en
    OpenUrl
  10. ↵
    Subramanian, S. V., Belli, P., & Kawachi, I. (2002). The Macroeconomic Determinants of Health. Annual Review of Public Health, 23(1), 287–302. https://doi.org/10.1146/annurev.publhealth.23.100901.140540
    OpenUrlCrossRefPubMedWeb of Science
  11. ↵
    Zheng, R., Xu, Y., Wang, W., Ning, G., and Bi, Y. (2020). Spatial transmission of COVID-19 via public and private transportation in China. Travel Medicine and Infectious Disease.
Back to top
PreviousNext
Posted February 09, 2021.
Download PDF
Data/Code
Email

Thank you for your interest in spreading the word about medRxiv.

NOTE: Your email address is requested solely to identify you as the sender of this article.

Enter multiple addresses on separate lines or separate them with commas.
The Effect of GDP and Distance on Timing of COVID-19 Spread in Chinese Provinces in 2020
(Your Name) has forwarded a page to you from medRxiv
(Your Name) thought you would like to see this page from the medRxiv website.
CAPTCHA
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.
Share
The Effect of GDP and Distance on Timing of COVID-19 Spread in Chinese Provinces in 2020
Alice Kuan, Mingxin Chen, David Bishai
medRxiv 2020.07.19.20157354; doi: https://doi.org/10.1101/2020.07.19.20157354
Digg logo Reddit logo Twitter logo CiteULike logo Facebook logo Google logo Mendeley logo
Citation Tools
The Effect of GDP and Distance on Timing of COVID-19 Spread in Chinese Provinces in 2020
Alice Kuan, Mingxin Chen, David Bishai
medRxiv 2020.07.19.20157354; doi: https://doi.org/10.1101/2020.07.19.20157354

Citation Manager Formats

  • BibTeX
  • Bookends
  • EasyBib
  • EndNote (tagged)
  • EndNote 8 (xml)
  • Medlars
  • Mendeley
  • Papers
  • RefWorks Tagged
  • Ref Manager
  • RIS
  • Zotero
  • Tweet Widget
  • Facebook Like
  • Google Plus One

Subject Area

  • Health Economics
Subject Areas
All Articles
  • Addiction Medicine (76)
  • Allergy and Immunology (202)
  • Anesthesia (55)
  • Cardiovascular Medicine (495)
  • Dentistry and Oral Medicine (91)
  • Dermatology (57)
  • Emergency Medicine (170)
  • Endocrinology (including Diabetes Mellitus and Metabolic Disease) (217)
  • Epidemiology (5749)
  • Forensic Medicine (3)
  • Gastroenterology (221)
  • Genetic and Genomic Medicine (883)
  • Geriatric Medicine (89)
  • Health Economics (233)
  • Health Informatics (777)
  • Health Policy (400)
  • Health Systems and Quality Improvement (256)
  • Hematology (105)
  • HIV/AIDS (187)
  • Infectious Diseases (except HIV/AIDS) (6581)
  • Intensive Care and Critical Care Medicine (397)
  • Medical Education (119)
  • Medical Ethics (28)
  • Nephrology (94)
  • Neurology (859)
  • Nursing (45)
  • Nutrition (143)
  • Obstetrics and Gynecology (166)
  • Occupational and Environmental Health (266)
  • Oncology (521)
  • Ophthalmology (168)
  • Orthopedics (44)
  • Otolaryngology (107)
  • Pain Medicine (48)
  • Palliative Medicine (22)
  • Pathology (150)
  • Pediatrics (257)
  • Pharmacology and Therapeutics (147)
  • Primary Care Research (116)
  • Psychiatry and Clinical Psychology (990)
  • Public and Global Health (2264)
  • Radiology and Imaging (380)
  • Rehabilitation Medicine and Physical Therapy (175)
  • Respiratory Medicine (314)
  • Rheumatology (110)
  • Sexual and Reproductive Health (83)
  • Sports Medicine (83)
  • Surgery (118)
  • Toxicology (25)
  • Transplantation (34)
  • Urology (42)