ABSTRACT
Background The infection fatality rate (IFR) of Coronavirus Disease 2019 (COVID-19) varies widely according to age and residence status.
Purpose Estimate the IFR of COVID-19 in community-dwelling elderly populations and other age groups from seroprevalence studies. Study protocol: https://osf.io/47cgb.
Data Sources Seroprevalence studies done in 2020 and identified by any of four existing systematic reviews.
Study Selection SARS-CoV-2 seroprevalence studies with ≥1000 participants aged ≥70 years that presented seroprevalence in elderly people; aimed to generate samples reflecting the general population; and whose location had available data on cumulative COVID-19 deaths in elderly (primary cutoff ≥70 years; ≥65 or ≥60 also eligible).
Data Extraction We extracted the most fully adjusted (if unavailable, unadjusted) seroprevalence estimates and sampling procedure details. We also extracted age- and residence-stratified cumulative COVID-19 deaths (until 1 week after the seroprevalence sampling midpoint) from official reports, and population statistics, to calculate IFRs corrected for unmeasured antibody types. Sample size-weighted IFRs were estimated for countries with multiple estimates. Secondary analyses examined data on younger age strata from the same studies.
Data Synthesis Twenty-three seroprevalence surveys representing 14 countries were included. Across all countries, the median IFR in community-dwelling elderly and elderly overall was 2.4% (range 0.3%-7.2%) and 5.5% (range 0.3%-12.1%). IFR was higher with larger proportions of people >85 years. Younger age strata had low IFR values (median 0.0027%, 0.014%, 0.031%, 0.082%, 0.27%, and 0.59%, at 0-19, 20-29, 30-39, 40-49, 50-59, and 60-69 years).
Limitations Biases in seroprevalence and mortality data.
Conclusions The IFR of COVID-19 in community-dwelling elderly people is lower than previously reported. Very low IFRs were confirmed in the youngest populations.
INTRODUCTION
Most Coronavirus Disease 2019 (COVID-19) deaths occur among the elderly (1), and persons living in nursing homes are particularly vulnerable (2). Hundreds of seroprevalence studies have been conducted in various populations, locations, and settings. These data have been used and synthesized in several published efforts to obtain estimates of the infection fatality rate (IFR, proportion of deceased among those infected), and its heterogeneity (3-6). All analyses identify a very strong risk gradient based on age, although absolute risk values still have substantial uncertainty. Importantly, the vast majority of seroprevalence studies include very few elderly people (7). Extrapolating from seroprevalence in younger to older age groups is tenuous. Elderly people may genuinely have different seroprevalence. Ideally, elderly should be more protected from exposure/infection than younger people, although probably the ability to protect the elderly has varied substantially across countries (8). Moreover, besides age, comorbidities and lower functional status markedly affect COVID-19 death risk (9). In particular, elderly nursing home residents accounted for 30-70% of COVID-19 deaths in high-income countries in the first wave (2), despite comprising <1% of the population. IFR in nursing home residents has been estimated to be as high as 25% (10). Not separating residents of nursing homes from the community-dwelling may provide an average that is too low for the former and too high for the latter. Moreover, ascertainment and reporting of COVID-19 cases and deaths in nursing home populations show considerable variation across countries (2), with potentially heavy bearing on overall mortality, while community-dwelling elderly data may be less unreliable (especially in high-income countries). Finally, seroprevalence estimates typically reflect community-dwelling populations (enrollment of nursing home residents is scarce/absent in serosurveys).
Here we estimated the COVID-19 IFR in community-dwelling populations at all locations where seroprevalence studies with many elderly individuals have been conducted. Primary emphasis is on the IFR of the elderly. As a secondary analysis, we also explored the IFR of younger age-strata in these same studies.
METHODS
Data Sources and Searches
We identified seroprevalence studies (peer-reviewed publications, official reports, or preprints) in four existing systematic reviews (3, 7, 11, 12) as for a previous project (13), using the most recent updates of these reviews and their respective databases as of March 16, 2021. The protocol of this study was registered at the Open Science Framework (https://osf.io/47cgb) after piloting data availability in December 2020 but before extracting full data, communicating with local authorities and study authors for additional data and performing any calculations. Amendments to the protocol and their justification are described in Appendix Table 1.
Study Selection
We included studies on SARS-CoV-2 seroprevalence that had sampled at least 1000 participants aged ≥70 years in the location and/or setting of interest, provided an estimate of seroprevalence for elderly people, explicitly aimed to generate samples reflecting the general population, and were conducted at a location for which there is official data available on the proportion of cumulative COVID-19 deaths among elderly (with a cutoff placed between 60-70 years; e.g., eligible cutoffs were ≥70, ≥65, or ≥60, but not ≥75 or ≥55). Besides general population samples we also accepted studies focusing on patient cohorts (including residual clinical samples), insurance applicants, blood donors, and workers (excluding health care workers and others deemed to have higher than average exposure risk, since these would tend to overestimate seroprevalence). USA studies were excluded if they did not adjust seroprevalence for race or ethnicity, since these socio-economically related factors associate strongly with both study participation (14, 15) (blood donation, specific jobs, and insurance seeking) and COVID-19 burden (16-18). We focused on studies sampling participants in 2020, since IFRs in 2021 may be further affected by wide implementation of vaccinations (especially among the elderly) and by other changes (new variants and better treatment). Two authors reviewed records for eligibility. Discrepancies were solved by discussion.
Data Extraction
CA extracted each data point and JPAI independently verified the extracted data. Discrepancies were solved through discussion. For each location, we identified the age distribution of cumulative COVID-19 deaths and chose as primary age cutoff the one closest to 70, provided it fell between 60 and 70 years (e.g., ≥70, ≥65, or ≥60).
Similar to a previous project (3), we extracted from eligible studies information on location, recruitment and sampling strategy, dates of sample collection, sample size (overall and elderly group), and types of antibody measured (immunoglobulin G (IgG), IgM and IgA). We also extracted, for the elderly stratum, the estimated unadjusted seroprevalence, the most fully adjusted seroprevalence, and the factors considered for adjustment. Antibody titers may decline over time. E.g., a modelling study estimated an average time to seroreversion of 3-4 months (19). A repeated measurements study (20) suggests as much as 50% seroreversion within a month for asymptomatic/oligosymptomatic patients, although this may be an over-estimate due to initially false-positive antibody results. To address seroreversion, if there were multiple different time points of seroprevalence assessment, we selected the one with the highest seroprevalence estimate. If seroprevalence data were unavailable for the primary cutoff but available for another eligible cutoff (e.g., ≥70, ≥65, or ≥60), we extracted data for that cutoff.
Population size (overall, and elderly) and numbers of nursing home residents for the location were obtained from multiple sources (see Appendix Table 2).
Cumulative COVID-19 deaths overall and in the elderly stratum (using the primary age cutoff) for the relevant location were extracted from official reports. The total number, i.e., confirmed and probable, was preferred whenever available. We extracted the accumulated deaths until 1 week after the midpoint of the seroprevalence study period, or the closest date with available data.
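As an illustration of the date rule above, the following minimal Python sketch computes the date up to which deaths would be accumulated. The function name and the example dates are hypothetical, and taking the midpoint as the calendar midpoint between the first and last sampling day is an assumption of this sketch, not a description of the authors' code.

```python
from datetime import date, timedelta


def death_count_cutoff(sampling_start: date, sampling_end: date,
                       lag_days: int = 7) -> date:
    """Return the date up to which cumulative deaths are counted.

    The midpoint is taken as the calendar midpoint of the sampling period
    (assumption); any half-day remainder is dropped by date arithmetic.
    """
    midpoint = sampling_start + (sampling_end - sampling_start) / 2
    return midpoint + timedelta(days=lag_days)


# Hypothetical survey sampled from 1 May to 31 May 2020.
cutoff = death_count_cutoff(date(2020, 5, 1), date(2020, 5, 31))
print(cutoff)
```

Setting `lag_days=14` gives the later cutoff used in the sensitivity analysis.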
The proportion of cumulative COVID-19 deaths that occurred among nursing home residents for the relevant location and date was extracted from official sources or the International Long Term Care Policy Network (ILTCPN) report closest in time (2, 21). We preferred numbers recorded per residence status, i.e., including COVID-19 deaths among nursing home residents occurring in hospital. If the latter were unavailable, we calculated the total number of deaths in nursing home residents with a correction (by multiplying with the median of available ratios of deaths in nursing homes to deaths of nursing home residents in the ILTCPN 10/14/2020 report (2) for countries in the same continent). We considered 95%, 98%, and 99% of nursing home residents’ deaths to have occurred in people ≥70 years, ≥65 years and ≥60 years, respectively (22). For other imputations, see the online protocol.
Missing Data
We communicated with the authors of the seroprevalence study and with officers responsible for compiling the relevant official reports to obtain missing information or when information was available but not for the preferred age cut-offs. Email requests were sent, with two reminders to non-responders.
Calculated Data Variables
Infected and deceased community-dwelling elderly
The number of infected people among the community dwelling elderly for the preferred date (1 week after the midpoint of the seroprevalence study period) was estimated by multiplying the adjusted estimate of seroprevalence and the population size in community-dwelling elderly. We used unadjusted seroprevalence, when adjusted estimates were unavailable. We applied a non-prespecified correction for studies that excluded persons with diagnosed COVID-19 from sampling, primarily by using study authors’ corrections, secondarily by adding the number of identified COVID-19 cases in community-dwelling elderly for the location up to the seroprevalence study midpoint.
The total number of fatalities in community-dwelling elderly was obtained by total number of fatalities in elderly minus those accounted for by nursing home residents in the elderly stratum. If the elderly proportion or nursing home residents’ share of COVID-19 deaths were only available for another date than the preferred one, we assumed that the proportions were stable between the time points.
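The two calculations above can be sketched in a few lines of Python. This is a hedged illustration only: all names and input numbers are hypothetical, and it assumes that the community-dwelling elderly population equals the elderly population minus nursing home residents, and that the share of nursing home residents' deaths falling in the elderly stratum is one of the 95%, 98%, or 99% values described under Data Extraction.

```python
def community_dwelling_ifr(seroprevalence: float,
                           elderly_population: float,
                           nursing_home_residents: float,
                           elderly_deaths: float,
                           nursing_home_resident_deaths: float,
                           share_of_nh_deaths_in_stratum: float) -> float:
    """Uncorrected IFR (as a proportion) in community-dwelling elderly."""
    # Assumption: community-dwelling = all elderly minus nursing home residents.
    community_population = elderly_population - nursing_home_residents
    infected = seroprevalence * community_population

    # Deaths of nursing home residents attributed to the elderly stratum,
    # e.g. a share of 0.95 for a cutoff of >=70 years.
    nh_deaths_in_stratum = nursing_home_resident_deaths * share_of_nh_deaths_in_stratum
    community_deaths = elderly_deaths - nh_deaths_in_stratum
    return community_deaths / infected


# Hypothetical location, not taken from any included study.
ifr = community_dwelling_ifr(
    seroprevalence=0.05,
    elderly_population=1_000_000,
    nursing_home_residents=50_000,
    elderly_deaths=1_500,
    nursing_home_resident_deaths=900,
    share_of_nh_deaths_in_stratum=0.95,
)
print(f"{100 * ifr:.2f}%")
```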
IFR estimation
We present IFR with corrections for unmeasured antibodies (as previously described (3)) as well as uncorrected. When only one or two types of antibodies (among IgG, IgM, IgA) were used in the seroprevalence study, seroprevalence was corrected upwards (and inferred IFR downwards) by 10% for each non-measured antibody. We added a non-prespecified calculation of 95% confidence intervals (CIs) of IFRs based on extracted or calculated 95% CIs from seroprevalence estimates (Appendix Table 1). CI estimates should be seen with caution since they depend on adequacy of seroprevalence adjustments.
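A minimal Python sketch of these two steps follows. It is illustrative only: the function names and inputs are hypothetical, and it assumes that the 10% correction is applied multiplicatively to the IFR (a factor of 0.9 per unmeasured antibody type), which is one possible reading of the description above. The interval calculation simply maps the seroprevalence bounds onto IFR, so the upper seroprevalence bound yields the lower IFR bound.

```python
def correct_for_unmeasured_antibodies(ifr: float, n_measured: int,
                                      n_types: int = 3,
                                      factor_per_type: float = 0.9) -> float:
    """Scale IFR downwards for each antibody type (IgG, IgM, IgA) not measured.

    Assumption: a multiplicative factor of 0.9 per unmeasured type.
    """
    n_unmeasured = n_types - n_measured
    return ifr * factor_per_type ** n_unmeasured


def ifr_interval(deaths: float, population: float,
                 sero_lower: float, sero_upper: float) -> tuple[float, float]:
    """Translate a seroprevalence 95% CI into an IFR interval (lower, upper)."""
    return deaths / (sero_upper * population), deaths / (sero_lower * population)


# Hypothetical example: a study measuring IgG only.
corrected = correct_for_unmeasured_antibodies(0.02, n_measured=1)
lower, upper = ifr_interval(deaths=100, population=10_000,
                            sero_lower=0.04, sero_upper=0.06)
```

As noted above, such intervals reflect only the uncertainty carried by the seroprevalence estimate, not uncertainty in death counts or population statistics.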
Data Synthesis and Analysis
Statistical analyses were done using R version 4.0.2 (23). Similar to a previous overview of IFR-estimating studies (3), we estimated the sample size-weighted IFR of community-dwelling elderly for each country and then estimated the median and range of IFRs across countries. As expected, there was extreme heterogeneity among IFR estimates, thus weighted meta-analysis averages may not be meaningful.
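The synthesis step can be illustrated with the following Python sketch (the analyses themselves were done in R; this is not the authors' code). The survey tuples are hypothetical, and using the survey's sample size as the weight is the only assumption beyond what is described above.

```python
from collections import defaultdict
from statistics import median


def country_level_ifrs(surveys):
    """Sample size-weighted IFR per country.

    `surveys` is an iterable of (country, sample_size, ifr) tuples.
    """
    weighted_sum = defaultdict(float)
    total_n = defaultdict(float)
    for country, n, ifr in surveys:
        weighted_sum[country] += n * ifr
        total_n[country] += n
    return {c: weighted_sum[c] / total_n[c] for c in weighted_sum}


def summarize(ifr_by_country):
    """Median and range of IFR across countries."""
    values = sorted(ifr_by_country.values())
    return median(values), values[0], values[-1]


# Hypothetical surveys: country A contributes two estimates.
surveys = [("A", 1000, 0.02), ("A", 3000, 0.04), ("B", 2000, 0.01), ("C", 1500, 0.03)]
by_country = country_level_ifrs(surveys)
med, low, high = summarize(by_country)
```

Each country contributes a single value to the median regardless of how many surveys it has, which avoids letting countries with many surveys dominate the summary.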
We explored a seroreversion correction of the IFR by X^m-fold, where m is the number of months from the peak of the first epidemic wave in the specific location and X is 0.99, 0.95, and 0.90 corresponding to 1%, 5%, and 10% relative monthly rate of seroreversion. We also added a non-prespecified sensitivity analysis to explore the percentage increase in the cumulative number of deaths and IFR, if the cutoff was put two weeks (rather than 1 week) after the study midpoint.
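The seroreversion correction amounts to multiplying the IFR by X raised to the power m. A short Python sketch, with a hypothetical function name and hypothetical inputs, shows the three scenarios:

```python
def seroreversion_corrected_ifr(ifr: float, months_since_peak: float,
                                monthly_retention: float) -> float:
    """Multiply IFR by X**m, where X is the monthly retention (0.99, 0.95, 0.90)
    and m is the number of months since the peak of the first wave."""
    return ifr * monthly_retention ** months_since_peak


# Hypothetical IFR of 2% measured 2 months after the first-wave peak.
for x in (0.99, 0.95, 0.90):
    print(x, seroreversion_corrected_ifr(0.02, 2, x))
```

The correction grows with m, so surveys conducted long after the first-wave peak are affected most.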
We expected IFR would be higher in locations with a higher share of people ≥85 years old among the analyzed elderly stratum. Estimates of log10(IFR) were plotted against the proportion of people ≥85 years old among the elderly (for population pyramid sources see Appendix Table 2).
Added Secondary Analyses
IFR in younger age strata has become a very important question since we wrote the original protocol, and the studies considered here offered a prime opportunity to assess IFR in younger age strata as well. Among the included studies, whenever seroprevalence estimates and COVID-19 mortality data were available for younger age groups, we extended data extraction to all available age strata. Studies were excluded if no mortality data were available for any age stratum of maximum width 20 years and maximum age 70 years. We used the same time points as those selected for the elderly data. We included all age strata with a maximum width of 20 years and available COVID-19 mortality information. We matched the respective seroprevalence estimates for each age stratum with eligible mortality data. Consecutive strata of 1-5 years were merged to generate 10-year bins. For seroprevalence estimates, we used the age strata that most fully covered the age bin for which mortality data were available; for the youngest age groups, seroprevalence data from the closest available group with any sampled persons ≤20 years were accepted. E.g., for Ward et al (24), eligible age strata were 0-19 (paired with seroprevalence data for 20-24), 20-24, 25-34, 35-44, 45-54, 55-64. Population statistics for each analyzed age bin were obtained from the same sources as for the elderly. For age strata with multiple estimates from the same country, we calculated the sample size-weighted IFR per country before estimating median IFRs across locations for age groups 0-19, 20-29, 30-39, 40-49, 50-59, and 60-69 years. IFR estimates were placed in these age groups according to their midpoint, regardless of whether they perfectly matched the age group or not, e.g., an IFR estimate for age 18-29 years was placed in the 20-29 years group.
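The midpoint rule for placing an age stratum into one of the six reporting groups can be sketched as follows. This is an illustrative Python sketch with hypothetical names; treating each group as covering ages from its lower bound up to, but not including, the next group's lower bound is an assumption about how boundary midpoints are handled.

```python
AGE_GROUPS = [(0, 19), (20, 29), (30, 39), (40, 49), (50, 59), (60, 69)]


def assign_age_group(lower: int, upper: int):
    """Return the reporting age group containing the midpoint of a stratum,
    or None if the midpoint falls outside 0-69 years."""
    midpoint = (lower + upper) / 2
    for group_lower, group_upper in AGE_GROUPS:
        # Assumption: a group covers [group_lower, group_upper + 1).
        if group_lower <= midpoint < group_upper + 1:
            return (group_lower, group_upper)
    return None


print(assign_age_group(18, 29))  # stratum from the example in the text
```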
RESULTS
Seroprevalence studies
By March 16, 2021, 1206 SARS-CoV-2 seroprevalence reports were available in the four systematic reviews. Screening and exclusions are shown in Appendix Figure 1 and Appendix Table 2. Twenty-two seroprevalence studies were included, one of which contained two separate surveys.
The 23 seroprevalence surveys (Table 1) (24-47) represented 14 countries (Americas n=6, Asia n=3, Europe n=14). Only three studies were conducted in middle-income countries (one in Dominican Republic, two in India) and the other 20 in high-income countries. Nineteen studies targeted general population participants, 2 enrolled blood donors (27, 30), 1 biobank participants (43), and 1 hemodialysis patients (46). Three studies excluded upfront persons with previously diagnosed COVID-19 from participating in their sample (35, 39, 47). Mid-sampling points ranged from April 2020 to November 2020. Sampling had a median length of 5.7 weeks (range 6 days to 5 months). The median number of elderly individuals tested was 1809 (range 1010-21953). Median seroprevalence was 3.2% (range 0.47%-25.2%). Adjusted seroprevalence estimates were available for 20/23 studies.
Mortality and population statistics
COVID-19 deaths and population data among elderly at each location are shown in Table 1 (for sources, see Appendix Table 3). The proportion of a location’s total COVID-19 deaths that happened among elderly had a median of 53% (range 51%-62%) in middle-income countries and 86% (range 51%-96%) in high-income countries. The proportion of a location’s total COVID-19 deaths that occurred in nursing home residents was imputed for middle-income countries, and had a median of 44% (range 20%-85%) in high-income countries with available data (for Qatar, the number was imputed). One study (45) included only COVID-19 deaths that occurred in nursing homes and was corrected to reflect also the deaths among nursing home residents occurring in hospitals. Among the population, the elderly group comprised a median of 9% (range 6%-11%) in middle-income countries and 15% (range 0.6%-24%) in high-income countries. People residing in nursing homes were 0.08-0.20% of elderly in middle-income countries and a median of 4.7% (range 0.5%-9.1%) in high-income countries.
Additional data contributed
Additional information was obtained from authors and agencies on four studies for seroprevalence data (26, 29, 34, 46); three studies for mortality data (25, 26, 29); two studies for population data (25, 26); and five excluded studies (clarifying non-eligibility).
Calculated IFRs
For 5 countries with more than one IFR estimate available, sample size-weighted average IFRs were calculated. In 14 countries, IFRs in community-dwelling elderly (Figure 1, Table 1) had a median of 2.4% (range 0.3%-7.2%). For 2 middle-income countries, IFR was 0.3% versus 2.8% (range 1.3%-7.2%) in 12 high-income countries. Figure 1 also shows 95% CIs for IFRs based on 95% CIs for seroprevalence estimates. Median IFR in all elderly for all 14 individual countries was 5.5% (range 0.3%-12.1%). In the 2 middle-income countries, IFR in all elderly was 0.3-0.4% and in 12 high-income countries it was 6.8% (range 2.3%-12.1%).
Sensitivity analyses exploring different rates of seroreversion appear in Appendix Table 4. For the scenario with 5% relative monthly seroreversion, median IFR in community-dwelling elderly was still 2.4% (range 0.3%-6.1%) across all countries (0.3% in 2 middle-income countries, and 2.7% in 12 high-income countries). For the sensitivity analysis that explored the percentage increase in IFR if a later cutoff was used for cumulative deaths (two weeks after study midpoint), data were available for 20/23 seroprevalence surveys. There was a median relative increase of 4%, and median IFR in community-dwelling elderly became 2.5% (Appendix Table 5).
IFR in the elderly and proportion >85 years
IFR increased steeply with larger proportions of people ≥85 years old (Figure 2). A regression of log10(IFR) against the proportion of people ≥85 years old (in percentage points) had a slope of 0.056 (p=0.002), and suggested IFR=0.62%, 1.18%, and 4.29% when the proportion of people ≥85 in the elderly group was 5%, 10%, and 20%, respectively.
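The three fitted values are mutually consistent with a base-10 logarithmic scale and a slope of 0.056 per percentage point. The following Python sketch checks this by anchoring the line at the reported value for a 5% share; the intercept is not given in the text, so using that anchor point in its place is an assumption made purely for illustration.

```python
SLOPE = 0.056          # change in log10(IFR) per percentage point, as reported
ANCHOR_SHARE = 5.0     # percent of the elderly stratum aged 85 or older
ANCHOR_IFR = 0.62      # fitted IFR (%) reported at the anchor share


def predicted_ifr_percent(share_85_plus_percent: float) -> float:
    """IFR (%) implied by the reported slope, anchored at the 5% fitted value."""
    return ANCHOR_IFR * 10 ** (SLOPE * (share_85_plus_percent - ANCHOR_SHARE))


for share in (5, 10, 20):
    print(share, round(predicted_ifr_percent(share), 2))
```

Each additional 5 percentage points of people aged 85 or older corresponds to a multiplication of the IFR by 10^0.28, i.e. nearly a doubling.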
IFR in younger age-strata
We could extract data and calculate IFR on another 84 age-strata observations from 19/23 seroprevalence surveys (three had no mortality data for any eligible non-elderly age stratum (25, 30, 35) and one sampled no individuals <65 years of age (44)). The 19 surveys came from 11 countries. For the age group 0-19 years, only five studies had sampled participants for seroprevalence in the corresponding age group (24, 29, 31, 38, 41); for the other studies, the closest available age group was used. Across all countries (Figure 3), the median IFR was 0.0027%, 0.014%, 0.031%, 0.082%, 0.27%, and 0.59%, at 0-19, 20-29, 30-39, 40-49, 50-59, and 60-69 years, using data from 9, 9, 10, 9, 11, and 6 countries, respectively. Appendix Figure 1 visualizes these estimates against other, previously published evaluations of age-specific IFR.
DISCUSSION
The IFR of COVID-19 in elderly was found to vary widely at locations where seroprevalence studies have enrolled many elderly individuals. IFR in community-dwelling elderly was consistently lower than in elderly overall, and in countries where nursing homes are widely used, the difference was very substantial. In secondary analyses, the aggregated estimates show very low IFR estimates for younger age groups.
Early estimates of case fatality rate (CFR, ratio of deaths divided by documented infections) in the elderly were very high and they played an instrumental role in disseminating both fear and alacrity in dealing with this serious pandemic. Early estimates of CFR from China (48) described CFR of 8% in the age group 70-79 and 14.8% in those ≥80 years. Extremely high CFR estimates were also reported initially from Italy (49) and New York (50). However, the number of infected individuals was much larger than the documented cases (51). Therefore, IFR is much lower than CFR. We are aware of three previous evaluations of age-stratified IFR estimates that combine seroprevalence data with age-specific COVID-19 mortality statistics (4, 5, 52). Levin et al (4) is also the basis for the US CDC pandemic planning scenarios (53). Levin et al report IFR 4.6% at age 75, and 15% at age 85 (4) without separating nursing home deaths. The assessment was based on relatively sparse data for these age groups. The authors counted deaths four weeks after the midpoint of the seroprevalence sampling period, which is the longest among the evaluations, with the argument that there is large potential reporting lag (although available mortality statistics are commonly updated retrospectively for the date of death). Also, almost all included studies came from hard-hit locations, where IFR may be substantially higher (3). Selection bias for studies with higher seroprevalence and/or higher death counts (6) may explain why their estimates for middle-aged and elderly are substantially higher than ours.
O’Driscoll et al (5) modeled 22 seroprevalence studies, and carefully comment how outbreaks in nursing homes can drive overall population IFRs. For young and middle-aged groups, their estimates largely agree with those presented here. Their estimates for elderly are still higher than ours. For ages ≥65 years, their model uses data derived from one location (England) on deaths that did not occur in nursing homes and is validated against other locations with such statistics. This may overestimate the community-dwelling proportion, since deaths of nursing home residents occurring in hospitals are counted in the England community estimates. Conversely our evaluation adds granularity by using deaths in nursing home residents from many countries, and by using seroprevalence estimates from 23 serosurveys with many elderly individuals.
The Imperial College COVID-19 response team (52) presents much higher IFR estimates for elderly overall. They use a very narrowly selected subset of 10 studies in 9 countries, five of which had sampled >1000 elderly people. Their selection criteria required >100 deaths in the location at the seroprevalence study midpoint, which skews the sample towards heavily-hit areas and higher IFRs (6).
Some published studies also present IFRs in elderly people for single locations based on seroprevalence data, but these are unavoidably location-limited (see Appendix text).
For persons 0-19 years, the median IFR was one death per 37,000 persons with COVID-19 infection, followed by estimates of 1:7100 in ages 20-29, 1:3200 in ages 30-39, and 1:1200 in ages 40-49. The Imperial College study (52) has 5-10 times higher estimates for persons 0-19 years and 20-29 years old; otherwise estimates in age groups <50 years are fairly consistent across previous (4, 5) and current analyses despite methodological differences. Thus, they may be used for assessing risk-benefits, e.g. with specific vaccines (54) in young populations.
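The "one death per N infections" figures follow directly from the median IFRs in the Results, rounded to two significant figures. A small Python sketch of the conversion (function name hypothetical):

```python
from math import floor, log10


def one_in_n(ifr_percent: float) -> int:
    """Convert an IFR in percent to '1 death per N infections',
    rounded to two significant figures."""
    n = 100 / ifr_percent
    scale = 10 ** (floor(log10(n)) - 1)
    return int(round(n / scale) * scale)


median_ifr_percent = {"0-19": 0.0027, "20-29": 0.014, "30-39": 0.031, "40-49": 0.082}
for age_group, ifr in median_ifr_percent.items():
    print(age_group, f"1:{one_in_n(ifr)}")
```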
Both the age distribution and other characteristics of people within the elderly stratum vary between different countries. E.g., obesity is a major risk factor for poor outcome with COVID-19 infection and prevalence of obesity is only 4% in India versus 20-36% in high-income countries analyzed here. Besides differences in risk factor characteristics, documentation criteria for coding COVID-19 deaths may have varied non-trivially across countries. Under- and over-counting of COVID-19 deaths may have occurred even in countries with advanced health systems.
Given that nursing home residents account for many COVID-19 deaths (55), a location’s overall IFR across all ages is largely dependent on how nursing homes were afflicted (5). Spread in nursing homes was disproportionately high in the first wave (8). The share of nursing home deaths decreased markedly in subsequent waves (55) in most high-income countries with some exceptions (e.g. Australia). This change may be reflected in a much lower IFR among the elderly and the entire population after the first wave. Improved treatments (e.g. dexamethasone), and less use of harmful treatments (e.g. hydroxychloroquine, improper mechanical ventilation) may also have decreased IFR substantially in late 2020 and in 2021. With vaccination being promoted preferentially for elderly and vulnerable individuals in 2021, IFR may have decreased even more sharply (6). New variants becoming dominant in 2021 may also be associated with a further decrease in IFR. E.g., in the last week of June 2021, in the UK, where the Delta variant has spread widely, even CFR has been ∼0.1%.
Our analysis has several limitations. First, seroprevalence estimates among elderly reported by the included studies could over- or underestimate the proportion infected. We explored adjusted estimates accounting for 1-10% relative seroreversion per month; however, higher seroreversion is likely (19, 20, 56). Higher seroreversion will affect more prominently studies carried out later in the pandemic. Also, the current estimates do not fully account for the unknown share of people who may have tackled the infection without generating detectable serum/plasma antibodies (e.g., by mucosal, innate, or cellular immune mechanisms) (57-61). Sensitivity estimates for antibody assays typically use positive controls from symptomatic individuals with clinically manifest infection; sensitivity may be lower for asymptomatic infections. All seroprevalence studies may have substantial residual biases despite whatever adjustments (6). Even well-designed general population studies may specifically fail to reach and recruit highly vulnerable populations, e.g. disadvantaged groups, immigrants, homeless, and other people at high exposure risks and poor health.
Second, the number of deaths may be biased for various reasons (3) leading to potential under- or over-counting. Using excess mortality data is an alternative that has caveats, as those data depend heavily on the reference time period; availability in specific age groups can be restricted; and the proportion of deaths that is directly attributable to COVID-19 may be difficult to separate from indirect effects of the pandemic and adverse effects of measures taken. Matching the date of seroprevalence sampling (i.e., seroconversion) with cumulative deaths requires assumptions. Our sensitivity analysis, which extended the cutoff for counting deaths by one week, showed a negligible change in the median IFR calculation. Most studies included in our analysis had been performed during periods at or after the end of the first wave.
Third, we acknowledge the risk of bias in seroprevalence studies, mortality statistics, and even population statistics. However, assessments of risk of bias are far from straightforward, as illustrated by the discrepant assessments of these seroprevalence studies by other teams (6).
Finally, most available studies come from hard-hit locations that tend to have high IFRs (6). Consideration of age strata diminishes this representativeness bias, but cannot eliminate it. E.g., most countries not represented in the available data may have a shift towards lower ages within the stratum of the elderly. This translates to lower IFR.
Moreover, with the exception of India, all countries analyzed here have population prevalence of obesity 1.5-3-fold higher than the global prevalence (13%); other major risk factors for poor COVID-19 outcome such as smoking history, diabetes, cardiovascular disease, and immunosuppression (9) are also far more common in the high-income countries included in our analysis than the global average. Global IFR may thus be substantially lower in both the elderly and the lower age strata than estimates presented herein.
This overview synthesis finds a consistently much lower IFR of COVID-19 in community-dwelling elderly than in elderly overall, a difference which is substantial in countries where nursing homes are an established form of residency. Very low IFR estimates were confirmed in younger groups (<50). For middle-aged groups and elderly, estimates were lower than in some previous influential work with biased methodological choices (4, 52), but in agreement with other work (5). The estimates presented here may serve as one of several key pieces of information underlying public health policy decisions. With better management and better preventive measures, in particular vaccines, hopefully IFR estimates have already decreased further.
Data Availability
The protocol, data, and code used for this analysis will be made available at the Open Science Framework upon publication.
DECLARATION OF INTERESTS
The authors have no conflicts of interest.
Role of the Funding Source
No funding was received specifically for this work. Outside this work, the Meta-Research Innovation Center at Stanford (Stanford University) is supported by a grant from the Laura and John Arnold Foundation. Dr Axfors is supported by postdoctoral grants from the Knut and Alice Wallenberg Foundation, Uppsala University, the Swedish Society of Medicine, the Blanceflor Foundation, and the Sweden-America Foundation. The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
TRANSPARENCY STATEMENT
The protocol, data, and code used for this analysis will be made available at the Open Science Framework upon publication: https://osf.io/47cgb.