Racial Disparity in COVID-19 Deaths: Seeking Economic Roots with Census Data

John McLaren

doi:10.1515/bejeap-2020-0371

Publicly Available Published by De Gruyter April 30, 2021

From the journal The B.E. Journal of Economic Analysis & Policy

https://doi.org/10.1515/bejeap-2020-0371

Abstract

This note seeks the socio-economic roots of racial disparities in COVID-19 mortality, using monthly county-level mortality, economic, and demographic data from 3140 counties through December 2020. The county-level approach shows a sharp disparity affecting all minority groups in the sample, peaking in spring or summer 2020 and then dissipating by the end of autumn. The effect disappears for Asian Americans when occupation and other controls are added, but not for other minorities; for them, the racial disparity, as long as it lasts, does not seem to be due to differences in income, poverty rates, education, occupational mix, or even access to healthcare insurance, although in April public transit use explains a large part of it. This is a puzzle, but the rapid change in the disparities over the year shows that they are not immutable – an important message for future pandemics.

Keywords: COVID-19; racial disparity; healthcare insurance; commuting

JEL Classifications: I14; J15

1 Introduction

The higher mortality rates for African Americans and other minority groups in the COVID-19 pandemic have been the subject of much public concern. Centers for Disease Control (2020) and Oppel et al. (2020) survey much of what is known about the phenomenon, and Wood (2020) and McLaren (2020) provide overviews of related work. Gross et al. (2020) find that the mortality rates for African Americans are more than triple the rates for whites after correcting for age, and the rates for Hispanic/Latinos are almost double the rates for non-Hispanic whites. The Navajo Nation has had among the highest mortality rates in the nation.^[1] Analyzing the sources of these disparities, however, is more difficult than establishing their existence. If the disparities can be explained in terms of socioeconomic factors such as income, access to insurance, and occupation, that knowledge can be useful in designing policies to reduce the disparities. A number of authors have called for this analysis of sources as an urgent priority for research. Chowkwanyun and Reed (2020) argue that without an analysis of the sources, the political discourse can gravitate toward ‘biologic explanations’ or explanations based on racial stereotypes, which are harmful in themselves and get in the way of policy solutions.^[2] Wood (2020) summarizes several commentators on the topic, arguing that on COVID-19 racial disparity, “Outrage is warranted. But outrage unaccompanied by analysis is a danger in itself.” Timothy Freeman, pastor at Trinity African Methodist Episcopal Zion Church in Washington, DC, suggested that ‘COVID-19 is affecting black and brown people in disproportionate numbers’, “and not just because we’re black and brown, but because of the social and economic conditions people are forced to live in…” (Johnson 2020).

In this important task of finding the sources of the disparity, a major difficulty is data limitations. As Gross et al. (2020) report, only 28 state health systems break down their COVID-19 data reports by race; the FOIA request by Oppel et al. (2020) yielded race information on fewer than 1000 counties. Even when the case information is broken down by race, confidentiality rules prevent information about the patient’s economic variables from being matched up with the clinical information.^[3] Ideally, one would have a large sample of COVID-19 patients and non-COVID-19 patients with full information about employment, education, occupation, income, and so on for each individual but that is not available.^[4]

This note provides an approach to get around these problems partially, using geographic variation in aggregate variables by county.^[5] The data used are US county aggregates of COVID-19 mortality over time, matched with socioeconomic and demographic data by county. The mortality data are not broken down by race, but in principle one can still test for and measure racial disparity by looking for correlation between the size of a county’s minority population share and the county’s mortality rate. If there were no difference in mortality between a minority group and the rest of the population, the size of the minority share would have no effect on the county’s mortality rate, so this correlation provides indirect evidence of a disparity in mortality rates. The exercise follows two steps. First, confirm the existence of a racial disparity by using a regression to show that minority population shares are correlated with mortality rates. Second, control for a range of county-level socio-economic factors to see if the racial disparity weakens or disappears. If it does, then that stands as evidence that the socioeconomic factor is part of the reason for the racial mortality discrepancy.

We apply this simultaneously to four minorities, as defined by the Census: ‘Black or African American,’ which will be abbreviated here as ‘African American;’ ‘Hispanic/Latino;’ ‘Asian;’ and ‘American Indian and Alaska Native,’ which will be abbreviated here as ‘First Nations.’

To anticipate results, the county-level approach shows a sharp disparity affecting all minority groups in the sample, peaking in spring or summer 2020, but then – surprisingly – dissipating by the end of autumn, so that by the end of the year for the most part a positive disparity no longer shows up. This ‘hump shape’ may be the result of policy or private-behavior responses, but whatever the precise cause, it effectively refutes the ‘biologic explanations’ referred to above, an important lesson for this pandemic and for future ones as well. In addition, the effect disappears for Asian Americans when occupation and other controls are added, but not for other minorities. For these other groups the racial disparity, as long as it lasts, does not seem to be due to differences in income, poverty rates, education, occupational mix, or even access to healthcare insurance, although in April public transit use explains a large part of it.

A number of studies have used a similar approach. Desmet and Wacziarg (2020) also use Census data at the county level to examine correlates of COVID-19. Looking both at cases and deaths, they confirm strong correlations with the minority population share, particularly for African American and Hispanic/Latino shares. The focus of the study is not on race, however, and it does not investigate to what extent racial disparities can be explained by socio-economic factors. In a study complementary to this one, Benitez, Courtemanche, and Yelowitz (2020) examine racial disparities by zip code in early June and confirm that they are strong and robust, and in particular that for both African Americans and Hispanics at least one half of the disparities are driven by factors other than socio-economic controls, which is similar to the findings in this study. The methods and findings are different in a number of ways; looking at counties allows us to look at the whole country at once, and allowing effects to change over time we find an important trend for African Americans. In particular, the ‘hump-shaped’ pattern of racial disparity over time demonstrated in this note requires an approach that allows for time-varying effects over several months, and is a new finding. The finding that the disparities are no longer observed in the county-level data by late fall is also new. In addition, the important but short-lived contribution of public transit to the racial disparity is an original contribution of this note.

2 Data

All data in this note are at the county level, from 50 states.^[6] The ideal would be individual data but that is not publicly available, so the present goal is to work with the finest possible geographic units. Mortality data are available at the level of zip codes, which are typically smaller than counties, but the economic and demographic variables are typically not available at that level.^[7] The data include 3140 counties after counties with missing data have been dropped, covering a total population of 322,179,225. Counties are extremely heterogeneous in size, with a highly skewed distribution. The smallest is Kalawao County, Hawaii, with 75 people, and the largest is Los Angeles County, with more than 10 million. The mean size is 102,605 and the median is 25,732. For this reason, all descriptive statistics reported in Table 1 and all regressions are weighted by population.

Table 1:

Descriptive statistics.

	Population-weighted mean	Population-weighted standard deviation	Population-weighted median
COVID-19 deaths per million by month	95.07	164.22	45.92
African-American share	12.6	12.67	8.16
Hispanic-Latino share	17.82	17.06	11.2
Asian share	5.45	6.41	3.39
First-nations share	0.84	3.27	0.36
Median household income	62,926.2	17,146.8	60,146
Poverty rate	14.07	5.14	14.1
Share with less than high school	12.41	5.41	11.5
High-school graduate share	27.08	5.41	26.7
Share with some college	29.08	4.83	29.3
Share with college degree	31.42	11.04	31.4
No health insurance	9.37	4.41	8.8
Fraction of workers who use a carpool to get to work	9.15	2.01	9.05
Fraction who use public transit to get to work	4.80	10.0	1.60
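The population weighting behind Table 1 can be illustrated with a minimal sketch. The counties and values below are hypothetical (they are not the paper's data), and the weighted-median rule used here, the smallest value at which cumulative weight reaches half the total, is one common convention that the paper may or may not follow exactly.

```python
def weighted_mean(values, weights):
    """Mean of `values` with each entry counted in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half of the total weight."""
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= half:
            return value
    raise ValueError("values and weights must be non-empty")


# Hypothetical counties (illustrative only): transit share in percent, and population.
transit_share = [0.5, 1.5, 3.0, 60.0]
population = [20_000, 100_000, 900_000, 200_000]

mean_share = weighted_mean(transit_share, population)
median_share = weighted_median(transit_share, population)
print(round(mean_share, 2), median_share)
```

With a few very large or very unusual counties, the weighted mean can sit far above the weighted median, which is the kind of skew the text describes for public-transit use.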

2.1 Mortality Data

This note focusses on mortality data, which is less likely than case rates to be distorted by local variations in testing policy. Usafacts.org provides cumulative COVID-19 deaths per day in each county; here, we use cumulative deaths at the end of each month. To find deaths during each month, we take first differences. All mortality figures are scaled as deaths per million residents. Table 1 shows that the population-weighted average mortality over all month-county observations was 95.07 per million. These figures are sharply skewed to the right, with a population-weighted median of 45.92, just under half the mean.
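As a minimal sketch of the first-differencing step, the function below turns end-of-month cumulative deaths into monthly deaths per million. The function name, the input numbers, and the assumption that the cumulative count is zero before the first month are all illustrative and not taken from the paper.

```python
def monthly_deaths_per_million(cumulative_deaths, population):
    """First-difference end-of-month cumulative deaths and scale per million residents.

    Assumes the cumulative count was zero before the first month in the list.
    """
    monthly = []
    previous = 0
    for total in cumulative_deaths:
        monthly.append((total - previous) * 1_000_000 / population)
        previous = total
    return monthly


# Hypothetical county of 50,000 residents with cumulative deaths at four month-ends.
rates = monthly_deaths_per_million([0, 2, 10, 25], population=50_000)
print(rates)
```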

2.2 Demographic and Economic Data

The demographic and economic variables come from the American Community Survey (ACS) of the US Census Bureau. The most recent round of the ACS with complete publicly-available information required for this exercise is the 2018 survey; the five-year data are used here, meaning that all data are an average from the five years of the ACS ending in 2018.

2.2.1 Racial and Ethnic Variables

For each county we have the fraction of population accounted for by each of the four minority groups. Table 1 shows that the African-American and Hispanic/Latino population shares are much larger than the other two, and there is substantial variation in all four across counties.

2.2.2 Income, Education, and Health Insurance

Controls include: median household income in each county; the fraction of adults 25 years of age or older without high-school diploma; with high-school diploma but no further education; with some college but no four-year degree; and with a four-year college degree; and the poverty rate in each county, summarized in rows 6–11 of Table 1. The ACS also asks respondents if they have any healthcare insurance, and the fraction who respond in the negative for each county is included in the data here; the average nationwide stands at 9.37%.

2.2.3 Commuting

The ACS reports detailed information on how workers get to work, from which we extract the fraction of workers who share a vehicle to get to work, and the fraction who use public transit (not including taxis) to do so. Both means of transport involve being inside a vehicle with other people for the length of the ride and could therefore potentially help spread the virus. The public-transit fraction is dramatically skewed. Even the 75th percentile county (weighted, as always, by population) has only a 3.89% share who use public transit, significantly below the mean of 4.80%. The highest share is held by Kings County, NY, at 61.36%.

2.2.4 Occupations and the Ability to Work From Home

The occupational composition of employment potentially matters for transmission of the virus for several reasons. Clearly, occupations that require contact with affected patients may be more vulnerable to infection. More broadly, since physical distancing is a key strategy for preventing spread of the virus, workers who are able to work from home may well have an advantage in staying healthy. Dingel and Neiman (2020) have shown that the ability to work from home varies greatly from one occupation to another. To capture all of these effects, we control for occupations, using the 22 two-digit categories from the Bureau of Labor Statistics, because these are the categories that the Census Bureau releases at the county level. Specifically, the 2013–2018 average fraction of employed adults in each occupational category for each county is interacted with month dummies, omitting category 11 (Management Occupations).

3 Empirical Approach

The basic estimation is quite similar to Desmet and Wacziarg (2020), and is based on the equation:

$$\text{mortality}_{tc} = \beta_0 + \sum_{s,m} \beta_{sm}\, D^S(s,c)\, D^M(m,t) + \sum_k \sum_m \beta_{mk}\, \text{share}_{ck}\, D^M(m,t) + \epsilon_{c,t}, \tag{1}$$

where mortality_tc is COVID-19 deaths per million in county c during month t; D^S(s, c) is a dummy variable taking a value of 1 if c is in state s, and 0 otherwise; D^M(m, t) takes a value of 1 if t = m, and 0 otherwise; share_ck is the share of minority group k in the population of county c; and ϵ_c,t is an error term. The effect of an increase in group k’s share on mortality in month m is β_mk; any positive value for this implies a higher-than-average mortality rate for minority k, since a mere increase in the population share of a given minority would not change the overall mortality rate if all racial and ethnic groups had the same mortality rate.
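To make the share-by-month interaction terms in equation (1) concrete, the following sketch builds them for a single county-month observation. The function name, the column labels, and the county's shares are hypothetical; the eleven months (February to December) follow the rows of the regression tables.

```python
MONTHS = ["Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
GROUPS = ["African-American", "First-Nations", "Asian", "Hispanic-Latino"]


def share_by_month_regressors(month, shares):
    """Return the share-times-month-dummy terms for one county-month observation.

    `shares` maps each group name to its population share (in percent) in the county.
    Only the terms for the observation's own month are non-zero.
    """
    return {
        f"{m}*{group} share": (shares[group] if m == month else 0.0)
        for m in MONTHS
        for group in GROUPS
    }


# Hypothetical county observed in April (shares are illustrative, not from the data).
row = share_by_month_regressors(
    "Apr",
    {"African-American": 12.6, "First-Nations": 0.8, "Asian": 5.5, "Hispanic-Latino": 17.8},
)
nonzero = {name: value for name, value in row.items() if value != 0.0}
print(len(row), nonzero)
```

Each observation therefore carries 44 such regressors, of which only the four belonging to its own month are switched on, so every month gets its own coefficient for every group.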

The purpose of the β_sm terms is to pick up any unobserved state-level effects. An alternative way of doing this is to control for rest-of-state mortality, which gives rise to:

$$\text{mortality}_{tc} = \beta_0 + \sum_s \beta_s\, D^S(s,c) + \sum_m \beta_m\, D^M(m,t) + \beta_R\, \text{ROS}_{tc} + \sum_k \sum_m \beta_{mk}\, \text{share}_{ck}\, D^M(m,t) + \epsilon_{c,t}, \tag{2}$$

where ROS_tc is mortality per million residents during month t in the rest of the state to which county c belongs. Both results will be reported. For all regressions, errors are clustered by state and month following Cameron, Gehlbach and Miller (2011) using Stata command reghdfe by Sergio Correia.
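One plausible way to build the rest-of-state control is a leave-one-out rate: state deaths minus the county's own deaths, divided by state population minus the county's own population. The sketch below assumes that construction; the paper does not spell out the formula, and the function name and numbers are hypothetical.

```python
def rest_of_state_per_million(deaths, population):
    """Leave-one-out mortality per million for each county in one state and month.

    `deaths` and `population` are dicts keyed by county name. A state consisting
    of a single county has no 'rest of state' and would raise ZeroDivisionError.
    """
    total_deaths = sum(deaths.values())
    total_population = sum(population.values())
    return {
        county: (total_deaths - deaths[county]) * 1_000_000
        / (total_population - population[county])
        for county in deaths
    }


# Hypothetical three-county state in a single month.
ros = rest_of_state_per_million(
    deaths={"A": 10, "B": 5, "C": 0},
    population={"A": 100_000, "B": 50_000, "C": 50_000},
)
print(ros)
```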

This regression provides a simple indirect way of backing out differential mortality rates. Consider a thought experiment in which there are two demographic groups, numbered 1 and 2, with a mortality rate of m₁ and m₂ respectively, and assume that both of these rates are constants. Group 1 comprises s_c percent of the population in county c, so the average mortality rate is:

$$\bar{m}_c = \frac{s_c m_1 + (100 - s_c)\, m_2}{100}. \tag{3}$$

If the share s_c varies across counties but the mortality rates m₁ and m₂ do not,^[8] we can estimate ∂m̄_c/∂s_c by regressing m̄_c on s_c as in (1). If we denote the regression coefficient by β, we can write:

$$\frac{\partial \bar{m}_c}{\partial s_c} = \frac{m_1 - m_2}{100} = \beta, \tag{4}$$

and then we can solve (3) and (4) and compute the ratio m₁/m̄_c for group 1:

$$\frac{m_1}{\bar{m}_c} = 1 + \frac{\beta\,(100 - s_c)}{\bar{m}_c}. \tag{5}$$
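For readers who want the intermediate algebra, one way to get from (3) and (4) to (5) is sketched below. This is a reconstruction of the omitted steps, using only the symbols already defined, and is not taken from the original text.

```latex
\begin{aligned}
m_2 &= m_1 - 100\,\beta
  && \text{rearranging (4)} \\
\bar{m}_c &= \frac{s_c m_1 + (100 - s_c)\,(m_1 - 100\,\beta)}{100}
           = m_1 - \beta\,(100 - s_c)
  && \text{substituting into (3)} \\
\frac{m_1}{\bar{m}_c} &= 1 + \frac{\beta\,(100 - s_c)}{\bar{m}_c}
  && \text{solving for } m_1 \text{ and dividing by } \bar{m}_c
\end{aligned}
```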

We can call this value the ‘differential mortality ratio’ and compute it for every county. The expression is easy to understand. If β = 0, so that changes in the minority share have no effect on mortality rates, then m₁/m̄_c = 1, and there is no disparity. On the other hand, if s_c is close to 100, so that group 1 is almost the whole population, then again m₁/m̄_c will be close to 1. A large value for the differential mortality ratio will be seen when the group is a relatively small minority (s_c is small) and yet addition of a few new members of that minority has a sizable effect on aggregate mortality (β is large relative to m̄_c). In our regression application, the implied differential mortality ratio for group k can be computed as:

$$\frac{m_k}{\bar{m}_c} = 1 + \frac{\beta_{mk}\,(100 - s_{kc})}{\bar{m}_c}, \tag{6}$$

where m_k is the mortality rate for group k, m̄_c is the population-average mortality rate, and s_kc is the share of group k in the population of county c.
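The differential mortality ratio can be checked numerically with a small sketch. The group mortality rates and the county share below are made up purely to verify the algebra; the function name is hypothetical.

```python
def differential_mortality_ratio(beta, share, mean_mortality):
    """Equation (6): 1 + beta * (100 - share) / mean_mortality, with share in percent."""
    return 1.0 + beta * (100.0 - share) / mean_mortality


# Made-up group mortality rates (deaths per million), used only to check the algebra.
m1, m2 = 300.0, 100.0
share = 10.0                                                   # group 1 is 10% of the county
mean_mortality = (share * m1 + (100.0 - share) * m2) / 100.0   # equation (3)
beta = (m1 - m2) / 100.0                                       # equation (4)

ratio = differential_mortality_ratio(beta, share, mean_mortality)
print(ratio, m1 / mean_mortality)
```

In this constructed case the formula recovers the true ratio m₁/m̄_c exactly, because the slope β is known rather than estimated; with regression estimates the ratio inherits the sampling error in β.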

After estimating either (1) or (2), subsequent regressions add economic and demographic controls interacted with the month. If these controls reduce the implied differential mortality ratio, then they can be interpreted as part of the cause of that ratio.

4 Results

Results are presented in Tables 2–6. Column (1) of each table shows regression (1).^[9] Column (2) adds all socio-economic controls. Column (3) reports regression (2), and Column (4) adds socio-economic controls. Comparing the minority-share coefficients in Column (2) with those in Column (1), and those in Column (4) with those in Column (3), reveals how much of the racial disparities is explained by socio-economic variables. The two approaches give very similar results, but controlling for rest-of-state mortality provides estimates of slightly smaller racial disparities.

Table 2:

Monthly regression results.

	(1)	(2)	(3)	(4)
Rest-of-state deaths per million			0.6933^***	0.5928^**
			(0.1841)	(0.2000)
Feb^*African-American share	−0.00004209	−0.00001409	−0.01779	−0.3698
	(0.00004148)	(0.00002859)	(0.1766)	(0.2205)
Mar^*African-American share	0.1933	0.1740	0.1468	−0.1745
	(0.1248)	(0.1281)	(0.1794)	(0.2220)
Apr^*African-American share	3.296^***	2.569^***	2.879^***	2.123^***
	(0.4628)	(0.7079)	(0.4627)	(0.6056)
May^*African-American share	3.127^***	3.184^***	3.023^***	2.729^***
	(0.4535)	(0.5049)	(0.3227)	(0.4834)
Jun^*African-American share	2.076^***	1.752^***	2.035^***	1.464^***
	(0.2402)	(0.3592)	(0.2387)	(0.3801)
Jul^*African-American share	1.747^***	1.552^***	1.974^***	1.398^***
	(0.2551)	(0.3802)	(0.3187)	(0.4311)
Aug^*African-American share	1.495^***	1.005^**	2.597^***	1.746^**
	(0.2057)	(0.3891)	(0.4460)	(0.5606)
Sep^*African-American share	1.585^***	1.341^**	2.057^***	1.282^**
	(0.3549)	(0.5124)	(0.2979)	(0.4164)
Oct^*African-American share	0.2562	−0.6789	0.05403	−0.8032
	(0.2731)	(0.5157)	(0.3416)	(0.4776)
Nov^*African-American share	−1.146^***	−2.155^***	−1.710^***	−1.769^***
	(0.3071)	(0.5439)	(0.4301)	(0.4908)
Dec^*African-American share	−2.485^***	−3.339^***	−2.789^***	−2.176^**
	(0.4448)	(0.7142)	(0.5733)	(0.7176)
Feb^*First-Nations share	0.00003976	0.00004266	0.01644	−0.1809
	(0.00009637)	(0.00004835)	(0.3958)	(0.4321)
Mar^*First-Nations share	−0.02033	0.002540	0.04197	−0.1672
	(0.04303)	(0.1007)	(0.3879)	(0.4257)
Apr^*First-Nations share	0.5708	0.2573	0.9950	0.5728
	(0.4170)	(0.7204)	(0.5905)	(0.9168)
May^*First-Nations share	1.554^*	2.812^**	1.393	2.272^*
	(0.7243)	(1.158)	(0.7958)	(1.235)
Jun^*First-Nations share	1.677^**	1.450	1.525^*	1.342
	(0.6592)	(0.9402)	(0.7098)	(1.018)
Jul^*First-Nations share	2.325^***	1.872^*	1.781^***	1.748^**
	(0.5146)	(0.8610)	(0.5111)	(0.7053)
Aug^*First-Nations share	1.156^**	0.1473	0.5809	−0.2239
	(0.4934)	(0.9725)	(0.5529)	(0.8691)
Sep^*First-Nations share	0.9942	0.3094	0.5392	−0.5517
	(0.6926)	(0.9184)	(0.6704)	(0.9306)
Oct^*First-Nations share	1.269	0.2415	1.953^*	0.7766
	(0.9601)	(1.633)	(0.9620)	(1.587)
Nov^*First-Nations share	1.635^**	0.7080	2.483^**	1.709
	(0.6828)	(1.156)	(1.072)	(1.720)
Dec^*First-Nations share	0.4858	−0.8975	0.5041	−0.2640
	(0.7887)	(1.235)	(0.7776)	(1.580)

Columns (1) and (2) have state-by-month interaction fixed effects; (3) and (4) control for rest-of-state mortality instead. Columns (1) and (3) do not have socio-economic controls, but (2) and (4) do. All regressions are weighted by county population and have standard errors clustered by month and state.

Table 3:

Monthly regression results.

	(1)	(2)	(3)	(4)
Feb^*Asian share	0.002385	0.002683	1.244	0.9883
	(0.001440)	(0.002149)	(0.9345)	(0.7072)
Mar^*Asian share	0.8739	0.3770	1.572	1.180
	(0.6798)	(0.4454)	(1.004)	(0.7768)
Apr^*Asian share	9.430^***	−1.478	5.859^**	0.8177
	(2.548)	(1.703)	(2.617)	(1.591)
May^*Asian share	4.468^*	1.131	3.895	1.304
	(2.071)	(1.708)	(2.221)	(1.560)
Jun^*Asian share	1.069	0.1547	1.818	0.8341
	(0.9863)	(1.296)	(1.447)	(1.139)
Jul^*Asian share	−2.457^**	−0.9526	−1.376	−0.4731
	(1.063)	(1.241)	(0.9962)	(1.228)
Aug^*Asian share	−5.308^**	−1.766	−4.206^**	−0.2193
	(1.765)	(1.577)	(1.689)	(1.178)
Sep^*Asian share	−5.617^**	−0.2915	−4.283^**	−0.03223
	(2.076)	(1.702)	(1.760)	(1.276)
Oct^*Asian share	−5.989^**	−0.09798	−6.151^***	−0.4329
	(2.266)	(2.095)	(1.893)	(1.528)
Nov^*Asian share	−9.183^**	1.013	−10.25^***	−1.256
	(2.927)	(2.323)	(2.384)	(2.107)
Dec^*Asian share	−14.35^***	3.668	−15.20^***	−0.7468
	(3.388)	(2.551)	(2.875)	(2.155)
Feb^*Hispanic-Latino share	−0.00009016	−0.0001063	0.1689	−0.2834
	(0.00007703)	(0.00008615)	(0.1986)	(0.4081)
Mar^*Hispanic-Latino share	0.02603	0.002513	0.2213	−0.2262
	(0.1031)	(0.06738)	(0.2135)	(0.4006)
Apr^*Hispanic-Latino share	1.164^**	0.2677	1.132^**	0.2438
	(0.5104)	(0.4551)	(0.3672)	(0.5192)
May^*Hispanic-Latino share	1.005^*	0.4007	1.001^**	0.2878
	(0.5254)	(0.6121)	(0.3317)	(0.5063)
Jun^*Hispanic-Latino share	0.9326^*	0.4112	0.8089^**	0.2443
	(0.4215)	(0.4609)	(0.3425)	(0.4739)
Jul^*Hispanic-Latino share	2.716^***	2.022^***	2.015^***	1.330^*
	(0.5117)	(0.3973)	(0.4680)	(0.7027)
Aug^*Hispanic-Latino share	3.235^***	3.178^**	2.392^***	1.910^*
	(0.8999)	(1.209)	(0.7270)	(1.049)
Sep^*Hispanic-Latino share	1.958^**	1.303	1.572^**	0.4880
	(0.8510)	(0.8669)	(0.6186)	(0.8189)
Oct^*Hispanic-Latino share	0.4141	−0.6144	0.1398	−1.265
	(0.5149)	(0.6643)	(0.4806)	(0.8653)
Nov^*Hispanic-Latino share	0.1241	−0.7776	0.4084	−0.1637
	(0.4894)	(0.8322)	(0.6882)	(0.8309)
Dec^*Hispanic-Latino share	−0.3597	−0.5927	1.493^*	3.112^***
	(0.2291)	(0.7208)	(0.7537)	(0.8409)

Columns (1) and (2) have state-by-month interaction fixed effects; (3) and (4) control for rest-of-state mortality instead. Columns (1) and (3) do not have socio-economic controls, but (2) and (4) do. All regressions are weighted by county population and have standard errors clustered by month and state.

Table 4:

Monthly regression results.

	(2)	(4)
Feb^*carpooling fraction	−3.559 × 10⁻⁶	0.5828^*
	(0.00006447)	(0.2828)
Mar^*carpooling fraction	0.02534	0.6332
	(0.3002)	(0.4203)
Apr^*carpooling fraction	1.649	2.668
	(1.597)	(1.658)
May^*carpooling fraction	1.858	2.203
	(1.639)	(1.613)
Jun^*carpooling fraction	1.808^**	2.200^**
	(0.7595)	(0.8351)
Jul^*carpooling fraction	−1.484	−1.101
	(1.959)	(1.858)
Aug^*carpooling fraction	−1.318	−0.9755
	(0.8848)	(0.8320)
Sep^*carpooling fraction	−2.940	−2.471
	(2.198)	(2.137)
Oct^*carpooling fraction	0.7035	0.4486
	(1.909)	(1.851)
Nov^*carpooling fraction	1.211	−0.4258
	(2.704)	(2.727)
Dec^*carpooling fraction	4.622	2.392
	(4.350)	(4.364)
Feb^*fraction who use public transit	−0.0005251	0.8113
	(0.0007712)	(0.5328)
Mar^*fraction who use public transit	1.589^*	1.686^**
	(0.8002)	(0.7295)
Apr^*fraction who use public transit	23.62^***	18.04^***
	(3.097)	(3.844)
May^*fraction who use public transit	6.033^***	6.749^***
	(1.097)	(1.512)
Jun^*fraction who use public transit	0.7548	2.063^*
	(0.8587)	(0.9522)
Jul^*fraction who use public transit	−1.860^**	−0.8106
	(0.7667)	(0.7558)
Aug^*fraction who use public transit	−1.467	−1.120
	(1.123)	(1.191)
Sep^*fraction who use public transit	−1.713^*	−0.8398
	(0.8657)	(0.9450)
Oct^*fraction who use public transit	0.6744	1.388
	(1.406)	(1.561)
Nov^*fraction who use public transit	3.167^*	3.392^*
	(1.695)	(1.777)
Dec^*fraction who use public transit	2.333	2.833
	(1.631)	(2.883)

Columns (1) and (2) have state-by-month interaction fixed effects; (3) and (4) control for rest-of-state mortality instead. Columns (1) and (3) do not have socio-economic controls, but (2) and (4) do. All regressions are weighted by county population and have standard errors clustered by month and state.

Table 5:

Monthly regression results.

	(2)	(4)
Feb^*uninsured fraction	−0.0001223	−0.6918
	(0.0001112)	(0.5678)
Mar^*uninsured fraction	−0.1884	−0.5366
	(0.1534)	(0.5911)
Apr^*uninsured fraction	0.3908	2.203
	(0.7102)	(1.574)
May^*uninsured fraction	−3.431^**	−2.133
	(1.181)	(1.384)
Jun^*uninsured fraction	−0.2866	−0.9778
	(0.9282)	(0.9734)
Jul^*uninsured fraction	−0.2505	−1.191
	(0.9670)	(0.9412)
Aug^*uninsured fraction	0.6128	0.4874
	(1.706)	(1.451)
Sep^*uninsured fraction	0.8528	1.825
	(1.225)	(1.226)
Oct^*uninsured fraction	1.030	1.414
	(1.643)	(1.499)
Nov^*uninsured fraction	3.428^*	2.911
	(1.780)	(1.793)
Dec^*uninsured fraction	0.9622	−0.2548
	(2.654)	(3.649)

Columns (1) and (2) have state-by-month interaction fixed effects; (3) and (4) control for rest-of-state mortality instead. Columns (1) and (3) do not have socio-economic controls, but (2) and (4) do. All regressions are weighted by county population and have standard errors clustered by month and state.

Table 6:

Monthly regression results.

	(2)	(4)
Feb^*median household income	2.190 × 10⁻⁸	0.0001039
	(8.357 × 10⁻⁸)	(0.0002065)
Mar^*median household income	0.0001953	0.0001078
	(0.0001273)	(0.0002076)
Apr^*median household income	0.001233^*	0.00004581
	(0.0005824)	(0.0009554)
May^*median household income	0.0005620	0.0001949
	(0.0004917)	(0.0007249)
Jun^*median household income	0.0004640	0.0004307
	(0.0003935)	(0.0004592)
Jul^*median household income	0.0002456	0.0002820
	(0.0004952)	(0.0004348)
Aug^*median household income	−0.0001936	0.0001822
	(0.0005401)	(0.0004982)
Sep^*median household income	−0.0001468	0.00007802
	(0.0003893)	(0.0003640)
Oct^*median household income	−0.0003157	−0.0001499
	(0.0005669)	(0.0005353)
Nov^*median household income	−0.002862^***	−0.002568^***
	(0.0007533)	(0.0006982)
Dec^*median household income	−0.002067^*	−0.001637
	(0.0009429)	(0.0009480)
Feb^*poverty rate	0.00001088	0.2616
	(0.00009230)	(0.3269)
Mar^*poverty rate	0.01941	0.2699
	(0.1600)	(0.3643)
Apr^*poverty rate	0.5824	1.110
	(1.674)	(1.725)
May^*poverty rate	−0.5192	−0.4734
	(1.012)	(1.025)
Jun^*poverty rate	1.729	1.791^*
	(0.9780)	(0.8975)
Jul^*poverty rate	0.5653	0.4957
	(0.9572)	(0.9872)
Aug^*poverty rate	1.499	1.656
	(1.912)	(2.107)
Sep^*poverty rate	−0.6656	−0.9524
	(1.105)	(1.149)
Oct^*poverty rate	0.6345	0.4421
	(1.115)	(1.395)
Nov^*poverty rate	−5.606^**	−5.496^**
	(2)	(2.249)
Dec^*poverty rate	−2.993	−3.986
	(2.566)	(3.051)

	(1)	(2)	(3)	(4)
N	34,540	34,540	34,540	34,540
R²	0.3578	0.3911	0.3161	0.3514

Columns (1) and (2) have state-by-month interaction fixed effects; (3) and (4) control for rest-of-state mortality instead. Columns (1) and (3) do not have socio-economic controls, but (2) and (4) do. All regressions are weighted by county population and have standard errors clustered by month and state.

The first half of Table 2 shows the coefficients for the African-American share. They reveal a sharp disparity that rises to a maximum in either April or May, depending on the specification, and then falls sharply, becoming insignificant by October and then changing sign. A similar pattern arises for the First-Nations share (second half of Table 2), where the disparity peaks in July, and the Hispanic-Latino share (second half of Table 3), where it peaks in August. In each of these cases the disparity becomes insignificant or negative following the peak, with the exception of the December values for the Hispanic-Latino share in Columns (3) and (4). For these three minority groups, the pattern is qualitatively the same whether socio-economic controls are in place ((2) and (4)) or not ((1) and (3)). In the case of the Asian share (first half of Table 3), by contrast, the strong positive and negative disparities seen without controls all become insignificant once controls are in place.

The implied differential mortality ratios from (6) can be computed from either base regression. Since (6) implies a different number for each county, the population-weighted median value for each month is of most interest. Focussing on the month that gives the highest implied ratio for each minority group, we obtain for Column (1) (Column (2)) a ratio of 5.77 (5.61) for African Americans; 4.97 (4.04) for First-Nations people; 16.70 (10.76) for Asian-Americans; and 5.38 (4.24) for Hispanic-Latinos. Note that the figure for Asian Americans is anomalous, as the coefficient is significant and positive for only one month, and then only without controls. These figures are slightly on the high side compared to others in the literature. In figures broken down by race and age group released by the government of New York City in June,^[10] the ratio of deaths per million for African-American or Hispanic/Latino relative to whites varied from 1.58 to 5.95; in the case of Asians it was close to 1 for adults. Gross et al. (2020) report age-corrected relative mortality ratios by state as of April 21, 2020 for states whose data is broken down by race, ranging from a ratio of 18 for the African-American-to-white ratio in Wisconsin down to 0.44 for Pennsylvania, averaging to 3.57 for African Americans and 1.88 for Hispanic/Latinos. Price-Haywood et al. (2020) obtained data on 3481 COVID-19 patients at a hospital in Louisiana; 70.6% of deaths were black, versus 31% of the population, implying a differential mortality ratio of 2.28.

Most of the socio-economic controls have little apparent effect on death rates. Carpooling (top half of Table 4), the fraction of local population with no healthcare insurance (Table 5), and median household income and poverty rates (Table 6) show mainly insignificant coefficients with the occasional exception.^[11] The striking exception is the use of public transit, as shown in the bottom half of Table 4. For the specifications of both Column (2) and Column (4), the estimates show a positive effect for this variable from March to May (and also June for Column (4)), with a maximum in April of 23.62 and 18.04 respectively for the two specifications.

A one-standard-deviation increase in the use of public transit corresponds to 236.2 more deaths per million in April from the Column (2) specification and 180.4 for Column (4), which represents two-thirds and one-half of the cross-sectional standard deviation of deaths for April respectively – an enormous effect. Since June, the public transit effect has disappeared, perhaps as masks became widely used (consistent with the analysis of Harris (2020) for New York City).
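The one-standard-deviation figures follow directly from numbers already reported: the April coefficients in Table 4 multiplied by the population-weighted standard deviation of the public-transit share in Table 1. A short check, with variable names chosen here for illustration:

```python
transit_sd = 10.0  # population-weighted standard deviation of the transit share, Table 1
april_coefficients = {"Column (2)": 23.62, "Column (4)": 18.04}  # Table 4, April rows

extra_deaths = {column: coef * transit_sd for column, coef in april_coefficients.items()}
for column, value in extra_deaths.items():
    print(f"{column}: {value:.1f} more deaths per million in April")
```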

We can use the computation suggested by Gehlbach (2016) to find how much of the racial disparity identified by equation (1) or (2) is due to the role of public transit, for each of the months where it has a significant effect. This involves regressing the public-transit variable (interacted with the relevant month dummy) on the regressors for (1) or (2), and then multiplying the resulting coefficient of the minority share for that month by the coefficient for public transit for that month in the main regression. The results are shown in Table 7 for April and May, the only months for which the public transit coefficient is large enough to be important. The first two columns simply reproduce the racial-disparity effects from Tables 2–6 before and after controlling for socio-economic variables. The third column reports the portion of the disparity that can be attributed to the public-transit effect, following Gehlbach (2016). The final column shows how much of the raw disparity can be attributed to this public transit effect. The effect is enormous for April. A quarter of the disparity can be attributed to public transit for the African American share, more than the total disparity for the Asian share, and a commanding majority for the other two groups. The effect is a much smaller portion of the total disparity for May, but still stands at 70.3% for the Asian share.

Table 7:

Public transit’s contribution to racial disparity, April and May.

		(1)	(2)	(3)	(4)
		Racial coefficient without controls	With controls	Contribution of public-transit use	As percentage of total disparity
African-American	April	3.296	2.569	0.843	25.58
	May	3.127	3.184	0.215	6.89
First Nations	April	0.5708	0.2573	0.469	82.14
	May	1.554	2.812	0.120	7.71
Asian	April	9.43	−1.478	12.298	130.42
	May	4.468	1.131	3.141	70.30
Hispanic-Latino	April	1.164	0.2677	0.752	64.63
	May	1.005	0.4007	0.192	19.12

Columns (1) and (2) reproduce the relevant coefficients from the first two columns of Tables 2–6. The value from Column (1) minus the value from Column (2) is the combined effect of socio-economic controls on the racial disparity. Column (3) is the portion of that combined effect contributed by public transit, computed as in Gehlbach (2016), namely, the coefficient from regressing public transit use on the variables in Column (1) of Tables 2–6, times the public-transit coefficient from Column (2) of Tables 2–6. Column (4) expresses the public-transit effect on the racial disparity as a percentage of the raw racial disparity, or (3)/(1) × 100.
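As a quick arithmetic check, the following Python sketch recomputes Column (4) of Table 7 from Columns (1) and (3) using the values printed in the table. The dictionary layout and the function name `pct_of_raw` are illustrative choices, not part of the original analysis. Small discrepancies from the reported percentages are expected because the published coefficients are rounded.

```python
# (group, month): (Column 1 raw coefficient, Column 3 transit contribution, Column 4 as reported)
TABLE_7 = {
    ("African-American", "April"): (3.296, 0.843, 25.58),
    ("African-American", "May"): (3.127, 0.215, 6.89),
    ("First Nations", "April"): (0.5708, 0.469, 82.14),
    ("First Nations", "May"): (1.554, 0.120, 7.71),
    ("Asian", "April"): (9.43, 12.298, 130.42),
    ("Asian", "May"): (4.468, 3.141, 70.30),
    ("Hispanic-Latino", "April"): (1.164, 0.752, 64.63),
    ("Hispanic-Latino", "May"): (1.005, 0.192, 19.12),
}

def pct_of_raw(raw_coef, transit_contribution):
    """Column (4): transit contribution as a percentage of the raw disparity."""
    return 100.0 * transit_contribution / raw_coef

recomputed = {
    key: pct_of_raw(raw, contrib) for key, (raw, contrib, _) in TABLE_7.items()
}

for (group, month), (_, _, reported) in TABLE_7.items():
    value = recomputed[(group, month)]
    print(f"{group:<17} {month:<6} recomputed {value:7.2f}   reported {reported:7.2f}")
```

A value above 100, as for the Asian share in April, means that the transit contribution alone exceeds the raw coefficient, consistent with the coefficient turning negative once controls are added.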

A final observation about public transit is that it accounts for much of the difference between mortality in New York City and Los Angeles during the first pandemic wave. Both of these coastal mega-cities have had a large volume of travel with the most COVID-19-afflicted countries, but their experience of the pandemic has been strikingly different. To take as an example Kings County, which comprises Brooklyn, COVID-19 deaths per million were 1628 and 810 in April and May respectively, while for Los Angeles the figures were 72 and 117, an order of magnitude smaller. The fraction who use public transit to get to work was 61% for Brooklyn and 6% for Los Angeles, which by the regression estimates would account for 64–83% and 48–54% of the differential mortality between the two cities in the two months respectively (depending on which specification is used).^[12]
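To make the scale of this comparison concrete, the short Python sketch below restates the figures quoted above as mortality and transit gaps, and converts the quoted percentage ranges into deaths per million. It uses only numbers from the paragraph. The assumption that the percentages apply to the simple Brooklyn-minus-Los Angeles difference in deaths per million is mine, and the variable names are illustrative.

```python
TRANSIT_SHARE = {"Brooklyn": 0.61, "Los Angeles": 0.06}
DEATHS_PER_MILLION = {
    "April": {"Brooklyn": 1628, "Los Angeles": 72},
    "May": {"Brooklyn": 810, "Los Angeles": 117},
}
# Share of the mortality gap attributed to transit, as quoted in the text
# (the range reflects the two regression specifications).
REPORTED_SHARE_RANGE = {"April": (0.64, 0.83), "May": (0.48, 0.54)}

transit_gap = TRANSIT_SHARE["Brooklyn"] - TRANSIT_SHARE["Los Angeles"]

results = {}
for month, deaths in DEATHS_PER_MILLION.items():
    gap = deaths["Brooklyn"] - deaths["Los Angeles"]
    low, high = REPORTED_SHARE_RANGE[month]
    # Deaths per million implied by the quoted shares of the gap.
    results[month] = (gap, low * gap, high * gap)
    print(
        f"{month}: mortality gap {gap} per million, "
        f"of which roughly {low * gap:.0f} to {high * gap:.0f} attributed to transit"
    )

print(f"Transit-use gap: {transit_gap:.2f}")
```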

Given the evidence of early excess deaths followed by later below-average deaths among three of the minority groups, it is natural to ask what the net effect of these biases is over the year. Table 8 investigates this, reporting in Column (1) a version of regression equation (1) with the dependent variable equal to total COVID deaths per million cumulated over the year for each county. (Of course, the month dummies are replaced by a constant.) The results reveal a strong positive coefficient for the same three minorities, indicating a positive excess death rate overall for all three. Following the computation described earlier for the differential mortality ratio using (6), we find implied differential mortality ratios of 1.96 for African Americans, 2.24 for First-Nations people, and 1.97 for Hispanic-Latino people over calendar year 2020. Column (2) adds the socio-economic controls other than the occupational shares, and Column (3) adds the occupational shares (the coefficients are not shown). Once again, public transit has a large effect but most of the other controls have no effect, including the fraction without health insurance.^[13] The controls reduce the disparity for all three groups but do not eliminate it, except for the First Nations disparity which has a large point estimate with full controls but is statistically insignificant.

Table 8:

Regression results for total 2020 COVID deaths.

	(1)	(2)	(3)
African-American share	10.14^***	6.644^***	5.405^***
	(1.809)	(1.764)	(1.765)
First-nations share	11.65^**	9.458^*	6.902
	(5.289)	(5.225)	(5.510)
Asian share	−27.06^*	0.6950	1.759
	(14.28)	(7.132)	(6.188)
Hispanic-Latino share	11.22^***	5.372^*	5.600^*
	(2.724)	(2.699)	(3.100)
Carpooling fraction		1.370	6.135
		(5.636)	(5.388)
Fraction who use public transit		31.99^***	33.13^***
		(6.571)	(5.794)
Uninsured fraction		−5.031	3.120
		(6.822)	(6.123)
Fraction with less than high school		24.13^***	22.42^***
		(4.576)	(6.581)
Fraction with high-school diploma		13.16^***	16.45^***
		(3.865)	(4.705)
Fraction with some college		5.749	6.960
		(4.391)	(5.658)
Median household income		−8.152 × 10⁻³^***	−2.885 × 10⁻³
		(2.671 × 10⁻³)	(2.612 × 10⁻³)
Poverty rate		−6.963	−4.754
		(6.284)	(5.913)
Constant	910.0^***	462.7	−1,147
	(54.80)	(350.5)	(1,002)
N	3140	3140	3140
R²	0.2980	0.3438	0.3637

In each column, the dependent variable is the cumulative COVID-19 deaths per million for the county over the course of 2020. Column (1) has state fixed effects but no socio-economic controls. Column (2) has socio-economic controls as shown, but no occupational shares. Column (3) has all controls including occupational shares.
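For readers unfamiliar with state fixed effects, the following Python sketch shows on synthetic data how a specification like Column (1) can be estimated. It is an illustration under stated assumptions, not the paper's code: the data are simulated, only one share variable is included, and no weighting or standard-error computation is attempted. The sketch checks that including a dummy for every state gives the same slope as demeaning the variables within each state.

```python
import numpy as np

# Synthetic data: 10 hypothetical states with 30 counties each.
rng = np.random.default_rng(1)
n_states, per_state = 10, 30
state = np.repeat(np.arange(n_states), per_state)
n = state.size

share = rng.uniform(0.0, 40.0, n)                 # minority share, percentage points
state_effect = rng.normal(900.0, 100.0, n_states) # state-level intercepts
deaths = state_effect[state] + 10.0 * share + rng.normal(0.0, 50.0, n)

def demean_by_group(x, groups):
    """Subtract each group's mean from its members."""
    means = np.bincount(groups, weights=x) / np.bincount(groups)
    return x - means[groups]

# (a) Dummy-variable estimator: one indicator column per state, no constant.
dummies = (state[:, None] == np.arange(n_states)).astype(float)
X = np.column_stack([share, dummies])
beta_dummy = np.linalg.lstsq(X, deaths, rcond=None)[0][0]

# (b) Within estimator: regress state-demeaned deaths on state-demeaned share.
share_w = demean_by_group(share, state)
deaths_w = demean_by_group(deaths, state)
beta_within = (share_w @ deaths_w) / (share_w @ share_w)

print("slope with state dummies: ", round(beta_dummy, 4))
print("slope with within-demeaning:", round(beta_within, 4))
```

Either way, the slope is identified only from variation across counties within the same state, so state-wide factors such as state-level policy do not drive the estimate.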

5 Conclusion

The main findings can be summarized as follows. (i) For all four minorities, after controlling for state-level effects, there is a strong positive correlation across counties between the minority’s population share and COVID-19 deaths that peaks sometime between April and August and dissipates after that, suggesting a higher mortality rate for those groups during the first wave of the pandemic. (ii) For Asian-Americans the correlation is fragile, and disappears when we control for education, occupation, and commuting patterns. (iii) By contrast, for the other three minorities, the correlations are very robust. Regardless of what other factors are controlled for, the racial disparity in mortality rates does not seem to be due to differences in income, poverty rates, education, occupational mix, or even access to healthcare insurance, which has been hypothesized by many observers to be a key source of the disparity. (iv) The biggest single predictor of deaths is use of public transit, but only in April.

Since for three of the four minorities, a large portion of the disparities is not explained by the socioeconomic factors measured here, other candidates need to be considered. Possible causes beyond the scope of this note are listed in Centers for Disease Control (2020), and include: (i) concentration of workers in essential services. This cannot be completely ruled out by the results here simply because the ACS occupational controls available at the county level from the Census are rather crude. We have here the county-level employment share of 22 different occupational categories. There is likely much heterogeneity within those categories that is not captured in these regressions, both in how ‘essential’ each occupation is considered and in the risk to workers in each occupation. (ii) Differential incidence of pre-existing conditions that can make a COVID-19 infection more dangerous. Price-Haywood et al. (2020) and Wiemers et al. (2020) both show the importance of these conditions in the data and their disproportionate importance for African-Americans.^[14] (iii) Differential availability of paid sick leave. (iv) Environmental racism. Since the pathbreaking report from the United Church of Christ Commission for Racial Justice (1987), a rich literature has shown that environmental hazards tend to be concentrated in minority neighborhoods. This could be a source of the disparities. Wu et al. (2020) study the effect of county-level particulate matter in the air on COVID-19 death rates, and find alarmingly strong effects; however, they separately control for the Black population share, and find an effect of similar magnitude to the present study. This seems to rule out particulate matter as the underlying source of the disparity in death rates, but there are many other forms of environmental harm that could be explored in a similar manner.

In addition to the failure of socio-economic variables available here to explain the disparities, the decline in the measured disparities in the later part of the year is a surprising result. It is a counter-argument to assumptions that the racial disparities result from immutable biological characteristics, and therefore to any resulting fatalism about those disparities. It is, of course, possible that the observed decline is an artifact of the county method of measurement. As noted at the outset, individual data would be much better but is not available nationwide. However, as noted in the discussion of Table 2, the county method does show a disparity in the early parts of the pandemic that is in line with the results of individual data, so it is not clear why this method would stop performing well by mid-year.^[15]

Another possibility is that as local policymakers in highly-affected states became aware of the disparity, they took measures to correct existing biases, and that these measures were effective. For example, the government of Michigan formed a task force in April on COVID-related racial disparities, which tailored information on mask wearing to minority communities and made extra efforts to provide testing, contact-tracing, and primary physicians for minorities. The African-American population accounted for 40.7% of the state’s COVID deaths in April but, likely due to some extent to the task force, only 8% in November.^[16] For another example, the leadership of the Navajo Nation has pursued more aggressive strategies to curb the disease on Navajo territory than measures used by neighboring states, implementing mask rules long before the surrounding states and also banning outside visitors.^[17]

A third possibility is social learning. A large literature shows that knowledge about and attitudes toward natural disasters and precautionary behavior vary greatly from one racial/ethnic group to the next, based on the differential experience of these groups and within-group communication (Fothergill et al. 1999). It would not be surprising if the most devastated communities exhibited the most rapid adoption of private precautionary measures. By mid-2020, polling indicated that African-Americans were substantially more worried about the virus than other groups, which could lead to higher rates of precautionary behavior.^[18] At the same time, resistance to precautionary measures is correlated with political conservatism, which is itself correlated with race and ethnicity. Welsch (2020) shows strong evidence that the county-level vote for Donald Trump in the 2016 election is negatively correlated with the propensity to wear a mask within six feet of other people during the pandemic. Graham, Cullen, and Pickett (2020) show evidence that the intention to follow social distancing is negatively correlated with ‘faith in Trump.’ These features of individual behavior, combined with large super-spreader events such as the motorcycle rally in Sturgis, SD during August 7–16 (Dave et al. 2021) and numerous conservative political rallies, set up the possibility that the second wave would disproportionately hit whiter counties.

The exact reason for the time-varying effects cannot be established with certainty within this note, but the striking evidence for time-varying effects carries two lessons, both for this pandemic and for future ones. The first is that even over the course of a single year, it is important to allow for effects that vary over time. The way in which policies and behaviors change can rapidly change the character of the economy’s response to a pandemic. This is also in evidence in the way death rates respond to workplace presence, which changed dramatically over the course of the year (McLaren and Wang 2020), and is likely to be important for any future pandemic as well. The second is that, whatever the cause, the sharp decline in these estimated disparities rebuts any theory that the disparities are the result of immutable biological differences across persons of different race, which as many observers have noted is important for the politics of pandemic response (for example, Chowkwanyun and Reed (2020) and Wood (2020)).

Corresponding author: John McLaren, Department of Economics, University of Virginia, P.O. Box 400182, Charlottesville, VA 22904-4182, USA, E-mail: jmclaren@virginia.edu.

Acknowledgments

Su Wang provided excellent research assistance. Helpful comments from Adam Blandin and Merlin Chowkwanyun, two referees and the co-editor are gratefully acknowledged; all responsibility for errors is mine.

References

Benitez, J., C. Courtemanche, and A. Yelowitz. 2020. “Racial and Ethnic Disparities in COVID-19: Evidence from Six Large Cities.” Journal of Economics, Race, and Policy 3 (4): 243–61 (December). https://doi.org/10.3386/w27592.

Cameron, A. C., J. B. Gelbach, and D. L. Miller. 2011. “Robust Inference with Multiway Clustering.” Journal of Business & Economic Statistics 29 (2): 238–49. https://doi.org/10.1198/jbes.2010.07136.

Centers for Disease Control. 2020. COVID-19 in Racial and Ethnic Minority Groups. https://www.cdc.gov/coronavirus/2019-ncov/.

Chowkwanyun, M., and A. L. Reed Jr. 2020. “Racial Health Disparities and Covid-19 – Caution and Context.” The New England Journal of Medicine 383: 201–3. https://doi.org/10.1056/NEJMp2012910.

Dave, D., D. McNichols, and J. J. Sabia. 2021. “The Contagion Externality of a Superspreading Event: The Sturgis Motorcycle Rally and COVID-19.” Southern Economic Journal 87 (3): 769–807. https://doi.org/10.1002/soej.12475.

Desmet, K., and R. Wacziarg. 2020. “Understanding Spatial Variation in COVID-19 across the United States.” In NBER Working Paper No. 27329 (June). https://doi.org/10.3386/w27329.

Dingel, J. I., and B. Neiman. 2020. “How Many Jobs Can Be Done at Home?” In NBER Working Paper No. 26948. https://doi.org/10.3386/w26948.

Fothergill, A., E. G. M. Maestas, and J. A. DeRouen. 1999. “Race, Ethnicity and Disasters in the United States: A Review of the Literature.” Disasters 23 (2): 156–73. https://doi.org/10.1111/1467-7717.00111.

Gehlbach, J. B. 2016. “When Do Covariates Matter? And Which Ones, and How Much?” Journal of Labor Economics 34 (2): 509–43. https://doi.org/10.1086/683668.

Graham, A., F. T. Cullen, and J. T. Pickett. 2020. “Faith in Trump, Moral Foundations, and Social Distancing Defiance during the Coronavirus Pandemic.” Socius: Sociological Research for a Dynamic World 6. https://doi.org/10.1177/2378023120956815.

Gross, C. P., U. R. Essien, S. Pasha, J. R. Gross, S.-y. Wang, and M. Nunez-Smith. 2020. “Racial and Ethnic Disparities in Population Level Covid-19 Mortality.” In Working Paper. Yale School of Medicine, hosted by medRxiv. https://doi.org/10.1007/s11606-020-06081-w.

Harris, J. E. 2020. “The Subways Seeded the Massive Coronavirus Epidemic in New York City.” In NBER Working Paper No. 27021 (April). https://doi.org/10.3386/w27021.

Johnson, A. 2020. “On the Minds of Black Lives Matter Protesters: A Racist Health System.” ProPublica.org.

McLaren, J. 2020. “Racial Disparity in COVID-19 Deaths: Seeking Economic Roots with Census Data.” VoxEu.org Column. https://doi.org/10.3386/w27407.

McLaren, J., and S. Wang. 2020. “Effects of Reduced Workplace Presence on Covid-19 Deaths: An Instrumental-Variables Approach.” In NBER Working Paper No. 28275 (December). https://doi.org/10.3386/w28275.

Oppel, R. A. Jr., R. Gebeloff, K. K. R. Lai, W. Wright, and M. Smith. 2020. “Racial Disparity in Cases Stretches All Across Board.” The New York Times, p. A1 (July 6).

Price-Haywood, E. G., J. Burton, D. Fort, and L. Seoane. 2020. “Hospitalization and Mortality Among Black Patients and White Patients with Covid-19.” The New England Journal of Medicine 382: 2534–43. https://doi.org/10.1056/NEJMsa2011686.

Rubin, E. J., L. R. Baden, M. K. Evans, and S. Morrissey. 2020. “Audio Interview: The Impact of Covid-19 on Minority Communities.” The New England Journal of Medicine 382: e111. https://doi.org/10.1056/NEJMe2021935.

United Church of Christ Commission for Racial Justice. 1987. Toxic Wastes and Race in the United States: A National Report on the Racial and Socio-Economic Characteristics of Communities with Hazardous Waste Sites. New York: United Church of Christ.

Welsch, D. M. 2020. “Do Masks Reduce COVID-19 Deaths? A County-Level Analysis Using IV.” Covid Economics 57: 20–45.

Wiemers, E. E., A. Scott, M. AlFakhri, V. J. Hotz, R. F. Schoeni, and J. A. Seltzer. 2020. “Disparities in Vulnerability to Severe Complications from Covid-19 in the United States.” In NBER Working Paper No. 27294 (June). https://doi.org/10.3386/w27294.

Wood, G. 2020. “What’s Behind the COVID-19 Racial Disparity?” The Atlantic (May 27).

Wu, X., R. C. Nethery, M. B. Sabath, D. Braun, and F. Dominici. 2020. “Exposure to Air Pollution and COVID-19 Mortality in the United States: A Nationwide Cross-Sectional Study.” In Working Paper. Harvard T.H. Chan School of Public Health. Hosted by medRxiv. https://doi.org/10.1101/2020.04.05.20054502.

Received: 2020-10-26

Revised: 2021-03-29

Accepted: 2021-03-29

Published Online: 2021-04-30
