Abstract
Covid-19 excess deaths refer to increases in mortality over what would normally have been expected in the absence of the Covid-19 pandemic. In this study, we take advantage of spatial variation in Covid-19 mortality across US counties to estimate its relationship with all-cause mortality. We then examine how the extent of excess mortality not assigned to Covid-19 varies across subsets of counties defined by demographic and structural characteristics. We estimate that 26.3% [95% CI, 20.1% to 32.5%] of excess deaths between February 1 and September 23, 2020 were ascribed to causes of death other than Covid-19 itself. Excess deaths not assigned to Covid-19 were even higher than predicted by our model in counties with high income inequality, low homeownership, and high percentages of Black residents, showing a pattern related to socioeconomic disadvantage and structural racism. The standard deviation of mortality across counties increased by 9.5% as a result of excess deaths directly assigned to Covid-19 and an additional 5.3% as a result of excess deaths not assigned to Covid-19. Our work suggests that inequities in excess deaths attributable to Covid-19 may be even greater than revealed by data reporting deaths assigned to Covid-19 alone.
Introduction
The novel coronavirus disease 2019 (Covid-19) is an international public health emergency caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which spreads through respiratory droplets.1 The virus infects humans through the lung epithelium, and the disease is associated with a high incidence of acute respiratory distress syndrome, vascular injury, and death.2 The United States has emerged as an epicenter of the pandemic, with 6.98 million confirmed cases and over 202,000 deaths as of September 24, 2020.3
Vital registration data on cause of death are likely to underestimate the mortality burden associated with the pandemic for several reasons.4,5 First, some direct deaths attributable to Covid-19 may be assigned to other causes of death due to an absence of widespread testing and low rates of diagnoses at the time of death.6 Second, direct deaths from unfamiliar complications of Covid-19 such as coagulopathy, myocarditis, inflammatory processes, and arrhythmias may have caused confusion and led to attributions of death to other causes, especially early in the pandemic.7-9 Third, Covid-19 death counts do not take into account the indirect consequences of the Covid-19 pandemic on mortality levels.10-12 Indirect effects may include increases in mortality resulting from reductions in access to and use of health care services and psychosocial consequences of stay-at-home orders.13 Increases in stress, depression, and substance use related to the pandemic could also lead to suicides and overdose deaths.14,15 Economic hardship, housing insecurity, and food insecurity may cause indirect deaths, especially among those who live with chronic illnesses or face acute health emergencies and cannot afford medicines or medical supplies.16-18 On the other hand, the pandemic may reduce mortality as a result of reductions in travel and associated motor vehicle mortality.19 It is also possible that Covid-19 deaths are over-recorded in some instances, e.g., because some deaths that should have been assigned to influenza were instead assigned to Covid-19. Finally, the Covid-19 epidemic may reduce mortality from certain other causes of death because of frailty selection; those who die from Covid-19 may have been unusually frail and vulnerable to death from other diseases. Consequently, the rate of death from those diseases may decline and offset some of the increase in all-cause mortality attributable to Covid-19 deaths alone.
Estimates of excess deaths from all causes associated with the pandemic provide a useful measure of the total mortality burden associated with Covid-19. The term “excess deaths” refers to increases in mortality over what would normally have been expected based on historical norms for the period of analysis. Excess deaths include deaths directly assigned to Covid-19 on death certificates and excess deaths not assigned to Covid-19, which were either misclassified to other causes of deaths or were indirectly related to the Covid-19 pandemic. In summary, using deaths from all causes to measure the excess mortality impact of the Covid-19 epidemic can help circumvent biases in vital statistics, such as low Covid-19 testing rates, reporting lags, and differences in death certification coding practices,5 and capture excess deaths indirectly related to the Covid-19 pandemic.
Previous reports have estimated excess deaths in several different ways. The National Center for Health Statistics (NCHS) estimates excess mortality through comparison of mortality levels in 2020 to historical mortality data by week and geographic location.20 They present a range of values for excess deaths based on different historical thresholds, including the average expected count or upper boundary of the uncertainty interval, and apply weights to the 2020 provisional death data to account for incomplete data. In contrast, Weinberger et al. and Woolf et al. use multivariable Poisson regression models to evaluate increases in the occurrence of deaths due to any cause across the US.21,22 Weinberger et al. adjust for influenza activity.21 Kontis et al. apply Bayesian ensemble modeling to obtain smoothed estimates of excess deaths by age and sex in the UK.23 These studies make estimates for each spatial unit individually and do not allow any possible relationship between all-cause mortality and Covid-19 mortality to be identified through analysis across spatial units.
In this paper, we take advantage of spatial variation in Covid-19 mortality across US counties to estimate its relationship with all-cause mortality across all counties. We anticipate that counties with higher mortality from Covid-19 will also have experienced greater increases in mortality from other causes of death because the impact of the pandemic is not registered in Covid-19 deaths alone. We use the relationship between Covid-19 mortality and changing mortality from all causes of death to estimate excess mortality that was not directly assigned to Covid-19 as a cause of death. We then examine how the extent of excess mortality not assigned to Covid-19 varies across subsets of counties defined by area-level demographic and structural characteristics, allowing us to identify population subgroups with a disproportionate number of excess deaths that were not directly assigned to Covid-19. Our estimates provide an alternative approach to calculating excess deaths that can complement existing approaches.
Methods
Data Sources
We used NCHS provisional county-level data on all-cause mortality and directly assigned Covid-19 mortality from February 1 to September 23, 2020. 94% of deaths assigned to Covid-19 by NCHS have Covid-19 reported as the underlying cause of death; the other 6% have Covid-19 listed somewhere else on the death certificate.24 The data were limited to counties with 10 or more Covid-19 deaths and were considered provisional due to a time lag between the occurrence of deaths and the completion, submission, and processing of death certificates. The number of counties included in the analysis was 1,021, and the exclusion criteria are detailed in Supplemental Figure 1. To construct a historical comparison period, we utilized county-level data from CDC Wonder reporting all-cause mortality each month from February to September for 2013 through 2018. We also used U.S. Census data on county population from 2013 to 2020 and data on demographic and structural factors from a variety of consolidated sources including the 2020 RWJ Foundation County Health Rankings. A list of these data sources is provided in Supplemental Table 1. The present investigation relied on deidentified publicly available data and was therefore exempted from review by the Boston University Medical Center Institutional Review Board.
Death Rates
We produced crude death rates for all-cause and directly coded Covid-19 mortality in 2020 using the reported death counts and the estimated county-level population on June 1, 2020. We estimated the population on June 1 of each year from 2013 through 2020 by interpolating between the population on July 1 of that year and July 1 of the prior year. We multiplied the population by 235/365.25 so that our death rates would be in units of deaths per person-years. To compute an average historical death rate for 2013 to 2018, we added deaths from February to August of each year plus 23 of the 30 days in September. We then divided the sum of deaths from 2013 to 2018 by the total population from 2013 to 2018.
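The rate construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, the interpolation is assumed to be linear, and the historical denominator is assumed to use the same 235/365.25 person-year scaling as the 2020 rate.

```python
DAYS_IN_PERIOD = 235      # exposure window in days, as stated in the text
DAYS_PER_YEAR = 365.25


def june1_population(pop_july1_prior, pop_july1_current, days_between=365):
    """Interpolate two July 1 estimates to June 1 of the current year.

    Assumption: linear growth between the two estimates; June 1 falls
    30 days before July 1.
    """
    fraction = (days_between - 30) / days_between
    return pop_july1_prior + fraction * (pop_july1_current - pop_july1_prior)


def crude_death_rate(deaths, population):
    """Deaths per person-year over the study window."""
    person_years = population * DAYS_IN_PERIOD / DAYS_PER_YEAR
    return deaths / person_years


def historical_death_rate(deaths_feb_aug, deaths_sep, populations):
    """Pooled multi-year rate: summed deaths over summed person-years.

    Each argument holds one value per historical year. September deaths
    are prorated to 23 of 30 days.
    """
    total_deaths = sum(d + s * 23 / 30 for d, s in zip(deaths_feb_aug, deaths_sep))
    person_years = sum(p * DAYS_IN_PERIOD / DAYS_PER_YEAR for p in populations)
    return total_deaths / person_years
```

Pooling deaths and person-years across years, rather than averaging six annual rates, gives larger-population years proportionally more influence on the historical rate.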
Demographic and Structural Factors
Prior literature has established that Covid-19 mortality may differ by demographic and structural factors.27,28 In this analysis, the demographic factors that we examined were population density (people per square mile) and rurality (% rural). Structural factors included population distribution by race/ethnicity (% non-Hispanic Black, % non-Hispanic white), socioeconomic status (median household income), income inequality (ratio of household income at the 80th percentile to income at the 20th percentile), residential segregation (% of non-white or white residents that would have to move to different geographic areas in order to produce a distribution that matches that of the larger area), housing (% homeownership), and housing costs (% of households whose monthly housing costs (including utilities) exceed 50% of monthly income). We stratified counties into deciles by each demographic or structural factor. We then examined excess mortality by deciles of the county-level characteristics that were most closely associated with variation in excess mortality.
Statistical Analysis
We first modeled the relationship between all-cause mortality and Covid-19 mortality using a model that allows both Covid and non-Covid sources of mortality change to be recognized:
M(i) = α + β1 M*(i) + β2 C(i) + ε    (1)
where
M(i) = Death rate from all causes in county i in 2020
M*(i) = Average death rate from all causes, county i in 2013-2018
C(i) = Covid-19 death rate in county i in 2020
ε = Error term
The parameters of equation (1) were estimated using Ordinary Least Squares regression with county units weighted by their population size. The value of α represents changes in mortality that are independent of Covid-19 mortality and that are common across regions. The value of β1 represents the extent to which past levels of all-cause mortality in a county are replicated in 2020. If β1 = 1.0, for example, then all-cause mortality in 2020 is expected to be equal to that in 2013-18, plus or minus the value of α. Thus, combinations of α and β1 indicate how mortality changes that are not associated with Covid-19 vary with the level of all-cause mortality in 2013-18. The value of β2 indicates the extent to which mortality from Covid-19 affects all-cause mortality in 2020. If β2 = 1.0, it would imply that each death coded to Covid-19 would be associated with one additional death from all causes combined. Values of β2 greater than 1.0 would suggest that the effect of the Covid-19 pandemic in a county is not fully reflected in deaths assigned to Covid-19 and that excess deaths are being attributed to some other causes of death. Values of β2 less than 1.0 would suggest that Covid-19 is over-recorded as a cause of death or reductions in mortality are occurring for other causes.
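A population-weighted least-squares fit of the three parameters α, β1, and β2 can be sketched in pure Python as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, and it solves the weighted normal equations directly rather than calling a statistics package, so it returns point estimates only (no confidence intervals).

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def weighted_ols(rows, weights):
    """Fit M = alpha + beta1 * M_star + beta2 * C with county weights.

    rows    : list of (M, M_star, C) tuples, one per county
    weights : county population sizes
    Returns [alpha, beta1, beta2].
    """
    k = 3
    xtwx = [[0.0] * k for _ in range(k)]
    xtwy = [0.0] * k
    for (m, m_star, c), w in zip(rows, weights):
        x = (1.0, m_star, c)
        for i in range(k):
            xtwy[i] += w * x[i] * m
            for j in range(k):
                xtwx[i][j] += w * x[i] * x[j]
    return solve_linear_system(xtwx, xtwy)
```

Weighting by population means that large counties, whose death rates are measured with less sampling noise, dominate the fit.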
In a sensitivity analysis, we excluded New York City, where the death rate was the highest early in the pandemic. In further sensitivity analyses, we performed the regression among all counties in the dataset, including the 26 counties that were eliminated by our study’s exclusion criteria. We also conducted a sensitivity analysis in which we limited the analysis to counties in states that reported at least 80% completeness over the most recent 3 weeks of data. To assess the robustness of our results to alternative modeling approaches, we also estimated the relationship between Covid-19 and all-cause mortality using a Negative Binomial regression model.
We did not control for age in our primary analysis because the effect of age should be partially captured in the historical mortality term. In a sensitivity analysis, we re-estimated the OLS and Negative Binomial models using indirectly age-standardized death rates to adjust for differences in the counties’ age distributions. Deaths by age were not available at the county level so direct standardization of death rates was not possible. Indirect standardization adopts the age-specific death rate schedule for the whole US and applies it to the age distribution of a county to predict the number of deaths in that county.25 It then calculates the ratio of actual deaths in the county to the predicted number of deaths. Finally, it applies that ratio to the US crude death rate to estimate the indirectly age-standardized death rate for the county.26 Death rates and age distributions were employed in 10-year wide age intervals. When the death rate referred to Covid-19 mortality alone, death rates were restricted to that cause of death.
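The three steps of indirect standardization described above translate directly into code. The sketch below is illustrative only; the function name and argument layout are assumptions, with one list entry per 10-year age interval.

```python
def indirectly_standardized_rate(actual_deaths, county_pop_by_age,
                                 us_rate_by_age, us_crude_rate):
    """Indirectly age-standardized death rate for one county.

    actual_deaths     : observed deaths in the county (all ages)
    county_pop_by_age : county population in each age interval
    us_rate_by_age    : US age-specific death rates, same intervals
    us_crude_rate     : US crude death rate
    """
    # Step 1: deaths the county would have under the US rate schedule.
    expected_deaths = sum(p * r for p, r in zip(county_pop_by_age, us_rate_by_age))
    # Step 2: ratio of actual to expected deaths.
    ratio = actual_deaths / expected_deaths
    # Step 3: scale the US crude rate by that ratio.
    return ratio * us_crude_rate
```

Only the county's total death count and its age distribution are required, which is why this approach works when county deaths by age are unavailable.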
To identify counties where excess mortality was higher or lower than predicted by our model, we calculated the residuals from our primary model. Weighted means of the residuals were then calculated in each decile of the demographic and structural factors and compared. In a sensitivity analysis, we repeated this analysis using Pearson residuals from a Negative Binomial model.
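The residual comparison can be sketched as follows. This is a hedged illustration: the function name is hypothetical, deciles are assumed to be formed by ranking counties so that each decile holds roughly the same number of counties, and the weights are assumed to be county populations.

```python
def decile_weighted_residuals(factor, residuals, weights):
    """Weighted mean residual within each decile of a county-level factor.

    factor    : county values of the demographic or structural factor
    residuals : observed minus predicted all-cause death rate
    weights   : county population sizes
    Returns a list of 10 means, lowest decile first (None if a decile is empty).
    """
    order = sorted(range(len(factor)), key=lambda i: factor[i])
    n = len(order)
    totals = [[0.0, 0.0] for _ in range(10)]  # [weighted sum, weight sum]
    for rank, i in enumerate(order):
        decile = rank * 10 // n
        totals[decile][0] += weights[i] * residuals[i]
        totals[decile][1] += weights[i]
    return [s / w if w else None for s, w in totals]
```

A positive mean in a decile indicates that counties there had more excess mortality than the model predicts from their Covid-19 and historical death rates alone.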
Results
Supplemental Table 2 presents characteristics of the 1,021 counties included in the dataset, whose distribution across the U.S. is visualized in Supplemental Figure 2. Among these counties, the population density was an average of 2,500 people per square mile. On average, the counties were 13.4% non-Hispanic Black and 56.9% non-Hispanic white. The median annual household income in the counties was $67,142, and the ratio of household income at the 80th percentile to income at the 20th percentile was 4.8. In the counties, 62.4% of residents were homeowners, and 15.6% were living with high housing costs.
Figure 1 plots the difference between the 2020 all-cause mortality rate and the average 2013-2018 historical all-cause death rate against the Covid-19 mortality rate in the study counties. The area of each point is roughly proportional to the county’s population size. This figure presents evidence that there is a relationship between the change in mortality from all causes of death and the level of Covid-19 mortality in a county.
Table 1 presents coefficients of the model describing the relationship between the directly assigned Covid-19 mortality rate and all-cause mortality in 2020. The estimated value of α is 0.572 deaths per 1,000 people and β1 is 0.97 (95% CI, 0.92 to 1.02). Given the observed range of 2013-18 death rates, this combination implies that, when Covid-19 mortality is set at zero, all-cause mortality is expected to have risen in all counties between 2013-18 and 2020. The coefficient β2 is estimated to be 1.36 (95% CI, 1.24 to 1.47). This value suggests that, for every 100 deaths assigned to Covid-19, the number of all-cause deaths rose by 136. This result implies that 26.3% [95% CI, 20.1% to 32.5%] of all excess deaths were not directly assigned to Covid-19 on death certificates. In absolute terms, 183,686 directly assigned Covid-19 deaths were recorded in the 1,021 counties in the study, meaning there were 65,481 [95% CI, 44,435 to 86,527] excess deaths not directly assigned to Covid-19 for a total of 249,167 [95% CI, 228,121 to 270,213] excess deaths. When we limited the analysis to counties in states with at least 80% completeness over the most recent 3 weeks, the coefficient β2 was 1.33 (95% CI, 1.19 to 1.46). Our results were relatively consistent across alternative modeling specifications with and without indirect age standardization, with the percent of excess deaths not attributed to Covid-19 ranging from 26.3% to 29.3% across Ordinary Least Squares and Negative Binomial models (Supplemental Table 3).
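The link between β2 and the share of excess deaths not assigned to Covid-19 is simple arithmetic: if each assigned death is associated with β2 all-cause deaths, the unassigned share is (β2 − 1)/β2. The short check below uses the death counts reported above; the variable and function names are illustrative. Note that the rounded coefficient 1.36 gives about 26.5%, so the reported 26.3% presumably reflects the unrounded estimate.

```python
def share_not_assigned(beta2):
    """Fraction of excess deaths not assigned to Covid-19: (beta2 - 1) / beta2."""
    return (beta2 - 1.0) / beta2


covid_assigned = 183_686   # directly assigned Covid-19 deaths (from the text)
not_assigned = 65_481      # excess deaths not assigned to Covid-19 (from the text)

total_excess = covid_assigned + not_assigned
implied_beta2 = total_excess / covid_assigned

print(total_excess)                                        # 249167
print(round(implied_beta2, 3))                             # 1.356
print(round(100 * share_not_assigned(implied_beta2), 1))   # 26.3
```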
Figure 2 examines county-level factors associated with higher or lower than predicted excess deaths not directly assigned to Covid-19 by calculating residuals from our primary model and taking the weighted mean across deciles. For demographic factors, excess deaths were higher than predicted by our model among counties with greater population density and counties that were less rural. For structural factors, excess deaths were higher than predicted by our model among counties with a greater proportion of non-Hispanic Black residents, a lower proportion of non-Hispanic white residents, lower and middle household incomes, greater income inequality, less home ownership, more residents with high housing costs, and more residential segregation. These results based on structural factors show a coherent pattern related to socioeconomic disadvantage. Supplemental Figure 3 presents the residuals calculated using Negative Binomial regression.
Figure 3 shows the states with higher than predicted Covid-19 excess mortality not directly assigned to Covid-19. In order, these states are West Virginia, Kentucky, Tennessee, Arkansas, Missouri, Oklahoma, Mississippi, New Mexico, Wisconsin, and Kansas. These states are predominantly in the Midwest or in the South.
Figure 4 decomposes the excess death rate into the predicted death rate in the absence of Covid-19, the excess death rate directly assigned to Covid-19, and the excess death rate not assigned to Covid-19 and compares counties in the upper and lower 20% of household income, income inequality, and percent non-Hispanic Black. For each of these factors, the addition of excess deaths assigned to Covid-19 and excess deaths not assigned to Covid-19 increased disparities between the high and low category. The differences grew larger with indirect age standardization (Supplemental Figure 4). The standard deviation of mortality across counties increased by 9.5% as a result of excess deaths directly assigned to Covid-19 and an additional 5.3% as a result of excess deaths not assigned to Covid-19.
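One way to reproduce this kind of decomposition is sketched below. It rests on several assumptions that the text does not confirm: the baseline is taken to be the model's prediction with Covid-19 mortality set to zero (α + β1 M*), the not-assigned component is taken to be (β2 − 1) times the Covid-19 rate, the standard deviation is unweighted, and both percentage increases are expressed relative to the baseline standard deviation. The function name is hypothetical.

```python
from statistics import pstdev


def sd_increase(m_star, covid, alpha, beta1, beta2):
    """Percent change in the cross-county SD of mortality as components are added.

    m_star : historical all-cause death rates, one per county
    covid  : directly assigned Covid-19 death rates, one per county
    Returns (pct from assigned deaths, additional pct from not-assigned deaths).
    """
    # Predicted rate in the absence of Covid-19.
    base = [alpha + beta1 * m for m in m_star]
    # Add deaths directly assigned to Covid-19.
    with_assigned = [b + c for b, c in zip(base, covid)]
    # Add the not-assigned excess, (beta2 - 1) * C, on top.
    with_all = [b + beta2 * c for b, c in zip(base, covid)]

    sd_base = pstdev(base)
    sd_assigned = pstdev(with_assigned)
    sd_all = pstdev(with_all)
    pct_assigned = 100 * (sd_assigned - sd_base) / sd_base
    pct_not_assigned = 100 * (sd_all - sd_assigned) / sd_base
    return pct_assigned, pct_not_assigned
```

The standard deviation rises when Covid-19 mortality is positively correlated with baseline mortality across counties, which is the sense in which excess deaths widen existing disparities.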
Discussion
We estimated that 26.3% of excess deaths attributable to the Covid-19 pandemic were not assigned to Covid-19 on death certificates. Prior estimates of excess mortality not assigned to Covid-19 have varied. An analysis by Woolf et al., based on data from March 1 through April 25, 2020, found that of 87,001 excess deaths, 30,755 excess deaths or 35% were not assigned to Covid-19.22 Weinberger et al. identified 95,235 directly coded Covid-19 deaths through May 30, and 122,300 total excess deaths attributable to the pandemic.21 This indicated that 27,065 deaths or 22% of excess deaths were excess deaths not assigned to Covid-19. As of September 24, 2020, the NCHS identified 189,574 directly coded Covid-19 deaths and between 208,390 and 274,055 total excess deaths.29 These data suggest that between 9% and 31% of excess deaths were not directly assigned to Covid-19.
Our estimates of excess deaths may be lower than estimates by Woolf et al. for several reasons. First, our analysis extends to September 23, 2020, while their analysis pertains to an earlier period. Weinberger et al. show that their estimates of excess deaths decline as time advances.21 Our estimates are in the middle of the range suggested by NCHS, whose analysis also extends to September. Second, our analytic approach is affected by measurement error in Covid-19 mortality in a way that differs from its effect on other estimates. In general, random measurement error in an independent variable (such as Covid-19 mortality in our analysis) is expected to bias the coefficient of that variable towards the null, or zero.30 If our estimate of β2 is biased downwards, then we will have underestimated the magnitude of excess deaths.
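The direction of this bias can be made explicit with the textbook errors-in-variables result. As a simplification, assume a single regressor: the recorded Covid-19 rate equals the true rate C plus random error u that is uncorrelated with C and with the regression error. The symbols σ_C² and σ_u², the variances of the true rate and of the measurement error, are introduced here for illustration and do not appear in the original analysis.

```latex
\tilde{C}(i) = C(i) + u(i), \qquad
\operatorname{plim}\, \hat{\beta}_2
  = \beta_2 \cdot \frac{\sigma_C^{2}}{\sigma_C^{2} + \sigma_u^{2}}
```

Because the ratio is below 1 whenever σ_u² > 0, the estimated coefficient is pulled toward zero. In the model with the historical mortality term included, the same logic is generally expected to apply, though the exact attenuation factor differs.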
Our estimates predict that mortality would have risen between 2013-18 and 2020 even in the absence of the Covid-19 pandemic. This prediction is consistent with a rising trend in national crude death rates between 2013 and 2019. The crude death rate rose from 821.5 deaths per 100,000 people in the year 2013 to 867.8 deaths per 100,000 people in 2018.31 This increase is likely attributable primarily to population aging.
As noted earlier, excess deaths not assigned to Covid-19 could include deaths involving Covid-19 that were misclassified to other causes of death and deaths indirectly related to the Covid-19 pandemic. The NCHS has made an effort to examine excess deaths not assigned to Covid-19 by cause of death. As of September 24, 2020, NCHS has attributed 28,797 excess deaths to Alzheimer’s disease and related dementias, 16,040 excess deaths to hypertensive diseases, 11,222 excess deaths to ischemic heart disease, 10,285 excess deaths to diabetes, and 3,056 excess deaths to influenza and pneumonia.29 These estimates suggest that the majority of excess deaths identified in this study were likely assigned to Alzheimer’s disease and related dementias or various circulatory diseases and diabetes. It is possible that a substantial fraction of the deaths of individuals with pre-existing chronic conditions who acquire Covid-19 and die as a result are ascribed to the pre-existing condition. These may constitute many of the excess deaths not attributed to Covid-19.
Areas with an exceptionally high number of excess deaths not assigned to Covid-19 may reflect a lack of testing facilities or co-morbidities that acquire priority in the assignment of cause of death. Indirect deaths, such as suicide or drug overdose and deaths related to reduced use of health care services, may also be higher in these areas. The frequency of excess deaths not attributed to Covid-19 was higher in some midwestern and southern states. These geographic distinctions may be associated with testing, diagnostic, and coding differences that have yet to be identified. They could also relate to how governors have approached the reopening of the economy and enforcement of public health provisions such as social distancing and mask wearing.32,33 Analyses of multiple cause-of-death data, when available, will help shed additional light on the contribution of Covid-19 to US mortality.
Previous research has shown that counties with a higher percentage of Black residents have reported more mortality attributable to Covid-19.27,34 According to underlying cause of death data, Black people are 2.4 times more likely to die from Covid-19 than white people.35 Racial inequities in Covid-19 mortality relate to structural racism that has made Black people more likely to be exposed to Covid-19 at work, in transportation, and in housing during the pandemic.27,34,36,37 Another factor is racial health inequities in asthma, COPD, hypertension and diabetes, which are risk factors for Covid-19.38,39 The presence of these comorbidities may also reduce the likelihood of assigning Covid-19 as a cause of death. In fact, our analysis suggests that the impact of the Covid-19 pandemic on the Black population is understated when studying data reporting deaths assigned to Covid-19 alone, since counties with higher proportions of Black residents also have more excess deaths not assigned to Covid-19. Such excess mortality goes beyond the 36 excess deaths not attributed to Covid-19 for every 100 directly assigned Covid-19 deaths. When considering the addition of excess mortality not directly attributed to Covid-19, our work suggests that racial inequities in excess deaths attributable to Covid-19 directly and indirectly may be even greater than currently understood.
A major limitation of this analysis is that the 2020 all-cause mortality and Covid-19 mortality data used were provisional. Counties may have differential delays in reporting death certificate data that vary by county, state, rurality, or other area-level factors. In particular, counties that are currently reporting lower all-cause mortality in 2020 than in the historical period likely have incomplete data. Despite this limitation, our estimate of excess deaths not assigned to Covid-19 remained consistent when we limited the analysis to counties in states with at least 80% completeness over the most recent 3 weeks of data. Another challenge was that the majority of counties in the study had fewer than 50 directly assigned Covid-19 deaths. Although NCHS excluded counties with under 10 directly assigned Covid-19 deaths, this only partially addressed the uncertainty caused by small numbers. Lastly, the demographic and structural factors examined in this analysis were based on data from 2012 through 2018. County-level distribution of these factors may have changed between that time and 2020, when the mortality data were gathered and analyzed.
Although lower than some previously published studies, our findings suggest that the overall mortality burden of Covid-19 considerably exceeds reported Covid-19 deaths. Using provisional vital statistics county-level data on Covid-19 and all-cause mortality in 2020 from the NCHS, we estimate that 26.3% of all excess deaths in the United States from February 1 to September 23, 2020 were excess deaths not assigned to Covid-19. We also found that the number of excess deaths not assigned to Covid-19 differed by county-level factors including socioeconomic disadvantage and structural racism. As the impact of Covid-19 spreads over a broader swath of U.S. counties, many of which re-opened early and have struggled with testing and their overall response, the analysis of excess deaths will continue to be a valuable proxy measure for assessing the overall mortality burden of Covid-19. This study highlights the importance of considering excess deaths beyond those directly assigned to Covid-19 within the overall assessment of the mortality impact of the Covid-19 pandemic and provides a new method for doing so.
Data Availability
All data used in this manuscript are publicly available with the exception of the 2020 county population data, which are available through a special request to the U.S. Census Bureau. Further details about the data used in this analysis are provided at the linked GitHub repository.
https://github.com/pophealthdeterminantslab/covid-19-county-analysis
Conflicts of Interest
Dr. Stokes reported receiving grants from Ethicon Inc outside the submitted work. No other disclosures were reported.
Disclaimer
The interpretations, conclusions, and recommendations in this work are those of the authors and do not necessarily represent the views of the Robert Wood Johnson Foundation.
Acknowledgements
The Robert Wood Johnson Foundation supported the research reported in this publication. Elo was also supported by National Institute on Aging R01 AG060115 “Causes of Geographic Divergence in American Mortality Between 1990 and 2015: Health Behaviors, Health Care Access and Migration.”
The authors would like to thank Magali Barbieri, Jacob Bor, Dana Glei, Josh Goldstein, Michelle Guillot, Patrick Heuveline, Anna McGregor, Jennifer Weuve and Wubin Xie for their valuable feedback on the manuscript.