Socioeconomic Disparities in the Effects of Pollution on Spread of Covid-19: Evidence from US Counties

This paper explores disparities in the effect of pollution on confirmed cases of Covid-19 based on counties socioeconomic and demographic characteristics. Using data on all US counties on a daily basis over the year 2020 and applying a rich panel data fixed effect model, we document that: 1) there are discernible social and demographic disparities in the spread of Covid-19. Blacks, low educated, and poorer people are at higher risks of being infected by the new disease. 2) The criteria pollutants including Ozone, CO, PM10, and PM2.5 have the potential to accelerate the outbreak of the novel coronavirus. 3) The disadvantaged population is more vulnerable to the effects of pollution on the spread of coronavirus. Specifically, the effects of pollution on confirmed cases become larger for blacks, low educated, and counties with lower average wages in 2019.


Introduction
The novel coronavirus was observed initially in a small cluster in Wuhan, China in December 2019 and spread around the globe during the following year, and claimed about 1.85 million deaths in 2020 (CNN, 2020). While the outbreak of the virus was unprecedented and fast the factors behind its pace of spread have risen policy-relevant questions. For instance, some studies point to the fact that there are disparities in the outbreak of the Covid-19 across occupations (McClure et al., 2020). Blacks, minorities, males, older individuals, low educated, and poorer individuals are at higher risks of being infected with Covid-19 (Figueroa et al., 2020;Kopel et al., 2020;McClure et al., 2020;Paul et al., 2020;Yang et al., 2020). For instance, Yang et al. (2020) apply a negative binomial regression at a county-level dataset that covers data on Covid-19 cases up to June 13 th and find that counties with a higher density of racial and ethnicity have higher confirmed cases. They show that this link is enhanced for counties with higher segregation between blacks and whites.
On the other end, there are also environmental factors that may affect the spread of Covid-19. Temperature and pollution are among the factors that were related to the Covid-19 through various mechanism channels. For instance, NoghaniBehambari, Salari, et al. (2020) examine the impacts of ambient air on the spread of Covid-19 across US counties. They use panel data fixed effect models and GMM models in a panel of county-by-day and find that an increase of one degree in air temperature is associate with 0.041 more cases per 100,000 population. The results are robust when they include county-by-week fixed effects and also across various subsamples. In another study, Contini & Costabile (2020) evaluates the literature on pollution and Covid-19 and conclude that, although marginally, specific pollutants such as PM10 can explain variations in the outbreak of Covid-19. However, no study has investigated the heterogeneous effects of pollution on the spread of Covid-19 based on demographic and socioeconomic characteristics. This paper aims to fill this gap in the literature. This paper evaluates the effects of demographic and socioeconomic features of counties on the relationship between pollution and the spread of Covid-19 in the US. Using daily panel data across all US counties that cover days in the year 2020 and applying a rich panel data fixed effect model, we find that: 1) Pollution has a small but significant effect on the pace of Covid-19 outbreak. 2) There is heterogeneity in confirmed cases based on demographic characteristics.
Counties with a higher share of blacks, higher share of low educated people, and lower average wages reveal higher rates of confirmed cases. 3) The marginal effects of pollution on the spread of Covid-19 is larger among counties with lower wages and a higher share of minorities.
This paper adds to the literature that investigates the sources of variations in the outbreak of Covid-19 in two ways. First, to the best of our knowledge, this is the first study to evaluate the socioeconomic disparities in the effect of pollution on Covid-19 confirmed cases. Second, we update the findings of the literature on pollution and Covid-19 using data from all US counties on a daily basis that covers all days of the year 2020 while the previous literature exploited from part of this timeframe.
The findings of this paper have important policy implications. The fact that pollution causes a mechanism channel for the spread of pandemic suggests that policymakers should reevaluate the abatement structure during the pandemic to protect public health. The evidence on the racial and demographic disparities also help policymakers design optimal welfare programs during the pandemic to close the health gap among different groups within a society. remix, or adapt this material for any purpose without crediting the original authors. preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021 The rest of the paper is organized as follows. In section 2, we introduce the data sources and discuss the final sample. Section 3 provides the econometric framework. In section 4, we go over the results. Section 5 concludes the paper.

Data Sources
This study implements a wide array of data sources. the daily count of new confirmed cases is extracted from USA-Facts (2020). The daily temperature data is extracted from the Global Summary of the Day data files produced by the National Oceanic and Atmospheric Administration (NOAA). The county population and demographic data are extracted from SEER (2019). The pollution data comes from the Environmental Protection Agency (EPA). The county average wage data is from the Quarterly Census of Employment and Wages (QCEW) and is extracted from replication codes provided by . Finally, the unemployment rate data is extracted from Local Area Unemployment Statistics gathered by the Bureau of Labor Statistics.
First, we show the geographic disparities in Covid-19 and socioeconomic disparities in a series of figures. Figure 1 illustrates the quartiles of total confirmed cases per county population (top panel) and the quartiles of daily average cases (bottom panel) across US counties for the whole year of 2020. The rates of confirmed cases for both outcomes are concentrated mainly in eastern and western states. Figure 2 depicts the geographic distribution of counties based on quartiles of the share of people with low education (top panel) and high education (bottom panel). 4 Figure 3 shows the geographic distribution based on the percentage of whites (top panel) and the percentage of blacks (bottom panel). 4 Throughout the paper we categorize people with less than high school education as low educated. Similarly, we consider people with bachelor and above as high educated. remix, or adapt this material for any purpose without crediting the original authors. preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021. ; https://doi.org/10. 1101/2021 The pollution data reported by EPA has two problems. First, pollutants have different units of measurement. In order to solve this problem and to make the interpretations easy and intuitive, we standardize the pollution data. We subtract the variable from the mean and then divide it by its standard deviation over the sample period. Therefore, all pollutants have a mean of zero and a standard deviation of one. For this reason, we avoid reporting their summary statistics. Second, the distribution of pollution monitors across counties is sporadic. Moreover, not every pollution monitor reports every essential criteria pollutant on a regular basis. To show this fact visually, Figure 4 illustrates the quartiles of Ozone pollution across counties. While the distribution is arbitrary across counties they cover a small fraction of counties. For instance, only 356 counties report Ozone among 3,148 counties covered in the final sample.
A summary statistics of the final sample is reported in Table 1. On average, there have been 17.8 new confirmed cases per 100,000 population and the total inflicted individuals in 2020 within each county add up to 6,588 persons per 100,000 county population. Roughly 9.9 percent of people are black and 13.4 percent are low educated.

Econometric Framework
We start with a cross-sectional data of counties and explore the cross-tabulation between county characteristics in 2019 and the rate of spread of Covid-19 in 2020 using the following OLS model: The main reason to use the characteristics in 2019 is that socioeconomic characteristics have not yet been released for the year 2020. In this specification, is the Covid-19 confirmed cases per 100,000 population of county for the year 2020.
remix, or adapt this material for any purpose without crediting the original authors.
preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021. ; https://doi.org/10. 1101/2021 In the next step, we use a panel of county-by-day data to assess the effect of pollution on the rates of Covid-19 using fixed-effect models of the following form: Where Indexes the county, indexes the state, and indexes day-by-month of observation. is the daily rates of Covid-19 confirmed in the county. is the standardized variable of pollution measures including Ozone, Carmon Monoxide (CO), particulate matters less than 10 (PM10), and particulate matters less than 2.5 (PM2.5). The pollution is measured three days in advance since the literature suggests that the virus has an average incubation period of 3 days (Lauer et al., 2020;Li et al., 2020;Tan et al., 2005). Since temperature is discussed to be one of the causes that accelerate the outbreak we also control for daily temperature in all regressions represented by (NoghaniBehambari, Salari, et al., 2020;Wang et al., 2020;Xie & Zhu, 2020).
The parameter is the county fixed effect. is a set of state by day-month fixed effects. The matrix represents day-by-month fixed effects. In some specifications, we also include a county-specific linear time trend. Finally, is a disturbance term. All standard errors are clustered at the county level. All regressions are weighted using the average of county population in 2020. To assess the socioeconomic disparities in 1 of equation 2, we use an interaction term for each characteristics using the following formulation: Where all parameters follow the same notation as in equations 1 and 2. The coefficient of interest s 1 that shows the effect of pollution on coronavirus cases per population for the group with characteristics represented in compared to the reference group.
remix, or adapt this material for any purpose without crediting the original authors.
preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021. ; https://doi.org/10.1101/2021.01.06.21249303 doi: medRxiv preprint

Main Results
We start by reevaluating the social disparities in confirmed cases of coronavirus. Table 2 shows the results of regressions introduced in equation 1 for total and average daily cases in columns 1 and 2, respectively. If the share of blacks in a county goes up by 10 percent the confirmed cases of corona increase by 1.1 cases per 100,000 population, an increase equivalent to a 6.2 percent change from the mean of daily confirmed cases (column 2, first row). In a similar manner, if the share of people with at least a bachelor's degree goes up by 10 percent in a county then the average daily confirmed cases go down by 16.9 cases per 100,000 population. This change can explain 15.5 percent of the standard deviation of confirmed cases over the year 2020. Overall, counties with a higher share of blacks, low educated people, and lower-income and wages have higher rates of confirmed cases.
Next, we reexamine the effect of pollution on Covid-19. Using equation 2, Table 3 reports the results for models without and with a linear county trend (columns 1 and 2, respectively). Each independent variable is in a separate row and each cell represents a separate regression. Looking at the full specification of column 2, one standard deviation increase in CO, Ozone, PM10, and PM2.5 is associated with an increase in Covid-19 cases by 0.04, 0.29, 0.35, and 0.11 cases per 100,000 population. Although these effects are marginal and economically small they are significant at 1% level and robust to including or excluding county by time linear trend.
Finally, we report the main results of the paper using equation 3 in Table 4 through Table   7. Interestingly, as areas with a higher share of poor people and minorities reveal higher confirmed cases they also are more susceptible to pollution-driven confirmed cases. The interaction term between pollution measures and blacks (Table 4) and low educated (Table 6) are positive implying that the relationship between pollution and the outbreak of the virus is stronger among these remix, or adapt this material for any purpose without crediting the original authors. preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021. ; https://doi.org/10.1101/2021.01.06.21249303 doi: medRxiv preprint people. On the other hand, the interaction term between pollutants and whites (Table 5), high educated (Table 7), and average wages (Table 8) are negative implying the protective effects against the Covid-19 consequence of pollution among counties with a higher share of whites, high educated, and income. For instance, the marginal effect of one standard deviation increase in PM10 on confirmed cases for a 10 percent rise in the share of blacks in a county goes up by 6.49 cases per 100,000 population (column 3, Table 4).

Conclusion
Understanding the racial and social disparities in exposure to a pandemic and specifically, the disparities in the effect of pollution on the outbreak of a pandemic are essential for policymakers to design optimal welfare programs and effective restriction orders. In this paper, we explored this aspect of the outbreak of Covid-19 using daily data across all US counties covering all days of 2020. Applying a rich set of fixed effects that also controls for a linear county by time trend, we documented that 1) there are discernible social and demographic disparities in the spread of Covid-19. Blacks, low educated, and poorer people are at higher risks of being infected by the new disease. 2) The criteria pollutants have the potential to accelerate the outbreak of the virus. Among others, these pollutants include Ozone, CO, PM10, and PM2.5. 3) The disadvantaged population is more vulnerable to the effects of pollution on the spread of coronavirus. Specifically, the effects of pollution on confirmed cases become larger for blacks, low educated, and counties with lower average wages in 2019. Overall, these results suggest that the abatement structures should be strengthened during a pandemic with more weight towards areas with a higher concentration of minorities and poor people. remix, or adapt this material for any purpose without crediting the original authors. preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021.  remix, or adapt this material for any purpose without crediting the original authors. preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021  No Yes Notes. Each cell represents a separate regression. Robust standard errors, clustered on the county, are reported in parentheses. All regressions are weighted by the average county population in 2020.
remix, or adapt this material for any purpose without crediting the original authors. preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021. ; https://doi.org/10.1101/2021.01.06.21249303 doi: medRxiv preprint remix, or adapt this material for any purpose without crediting the original authors. preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021. ; https://doi.org/10.1101/2021.01.06.21249303 doi: medRxiv preprint remix, or adapt this material for any purpose without crediting the original authors. preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021  remix, or adapt this material for any purpose without crediting the original authors. preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021  remix, or adapt this material for any purpose without crediting the original authors. preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021. ; https://doi.org/10.1101/2021.01.06.21249303 doi: medRxiv preprint remix, or adapt this material for any purpose without crediting the original authors. preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021. ; https://doi.org/10.1101/2021.01.06.21249303 doi: medRxiv preprint Quartiles of Average New Cases, 2020 remix, or adapt this material for any purpose without crediting the original authors. preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021. ; https://doi.org/10.1101/2021.01.06.21249303 doi: medRxiv preprint Quartiles of Bachelor or Above Education remix, or adapt this material for any purpose without crediting the original authors. preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021. ; https://doi.org/10.1101/2021.01.06.21249303 doi: medRxiv preprint Quartiles of Share of Blacks remix, or adapt this material for any purpose without crediting the original authors. preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021. ; https://doi.org/10.1101/2021.01.06.21249303 doi: medRxiv preprint Quartiles of Ozone remix, or adapt this material for any purpose without crediting the original authors. preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, The copyright holder has placed this this version posted January 8, 2021. ; https://doi.org/10.1101/2021.01.06.21249303 doi: medRxiv preprint