ABSTRACT
Objective To estimate the infection fatality rate of coronavirus disease 2019 (COVID-19) from data of seroprevalence studies.
Methods Population studies with sample size of at least 500 and published as peer-reviewed papers or preprints as of June 7, 2020 were retrieved from PubMed, preprint servers, and communications with experts. Studies on blood donors were included, but studies on healthcare workers were excluded. The studies were assessed for design features and seroprevalence estimates. Infection fatality rate was estimated from each study by dividing the number of COVID-19 deaths at a relevant time point by the estimated number of people infected in each relevant region. A correction was also attempted to account for the types of antibodies assessed.
Results 23 studies were identified with usable data to enter into calculations. Seroprevalence estimates ranged from 0.1% to 47%. Infection fatality rates ranged from 0.02% to 0.86% (median 0.26%) and corrected values ranged from 0.02% to 0.78% (median 0.25%). Among people <70 years old, infection fatality rates ranged from 0.00% to 0.26% with median of 0.05% (corrected, 0.00-0.23% with median of 0.04%). Most studies were done in pandemic epicenters and the few studies done in locations with more modest death burden also suggested lower infection fatality rates.
Conclusions The infection fatality rate of COVID-19 can vary substantially across different locations and this may reflect differences in population age structure and case-mix of infected and deceased patients as well as multiple other factors. Estimates of infection fatality rates inferred from seroprevalence studies tend to be much lower than original speculations made in the early days of the pandemic.
The infection fatality rate (IFR), the probability of dying for a person who is infected, is one of the most critical and most contested features of the coronavirus disease 2019 (COVID-19) pandemic. The expected total mortality burden of COVID-19 is directly related to the IFR. Moreover, justification for various non-pharmacological public health interventions depends crucially on the IFR. Some aggressive interventions that potentially induce also more pronounced collateral harms1 may be considered appropriate, if IFR is high. Conversely, the same measures may fall short of acceptable risk-benefit thresholds, if the IFR is low.
Early data from China, adopted also by the World Health Organization (WHO),2 focused on a crude case fatality rate (CFR) of 3.4%; CFR is the number of COVID-19 deaths divided by the number of documented cases, i.e. patients with symptoms who were tested and found to be PCR-positive for the virus. The WHO envoy who visited China also conveyed the message that there are hardly any asymptomatic infections.3 With a dearth of asymptomatic infections, the CFR approximates the IFR. Other mathematical models suggested that 40-70%,4 or even 81%,5 of the global population would be infected. Influential mathematical models5,6 eventually dialed back to an IFR of 1.0% or 0.9%, and these numbers long continued to be widely cited and used in both public and scientific circles. The most influential of these models, constructed by Imperial College, estimated 2.2 million deaths in the USA and over half a million deaths in the UK in the absence of lockdown measures.5 Such grave predictions justifiably led to lockdown measures adopted in many countries. With 0.9% assumed infection fatality rate and 81% assumed proportion of people infected, the prediction would correspond to a global number of deaths comparable with the 1918 influenza, in the range of 50 million fatalities.
Since late March 2020, many studies have tried to estimate the extent of spread of the virus in various locations by evaluating the seroprevalence, i.e. how many people in population samples have developed antibodies for the virus. These studies can be useful because they may inform about the extent of under-ascertainment when infections are documented based on PCR testing. Moreover, they can help obtain estimates of the IFR, since one can divide the number of observed deaths by the estimated number of people who are inferred to have been infected.
At the same time, seroprevalence studies may have several caveats in their design, conduct, and analysis that may affect their results and their interpretation. Here, data from the first presented full papers (either peer-reviewed or preprint) as of June 7, 2020 were collected, scrutinized, and used to infer estimates of IFR in different locations where these studies have been conducted.
METHODS
Seroprevalence studies
The input data for the calculations of IFR presented here are studies of seroprevalence of COVID-19 that have been done in the general population, or in samples that might approximate the general population (e.g. with proper reweighting) and that have been published in peer-reviewed journals or have been submitted as preprints as of June 7, 2020. Only studies with at least 500 assessed samples were considered, since smaller datasets would entail extremely large uncertainty for any calculations to be based on them. Studies where results were only released through press releases were not considered here, since it is very difficult to tell much about their design and analysis, and this is fundamental in making any inferences based on their results. Some key ones that have attracted large attention (e.g. Spain seroprevalence) are nevertheless considered in the Discussion. Preprints should also be seen with caution since they have not yet been fully peer-reviewed (although some of them have already been revised based on very extensive comments from the scientific community). However, in contrast to press releases, preprints typically offer at least a fairly complete paper with information about design and analysis. Studies done on blood donors were eligible, although it is possible they may underestimate seroprevalence and overestimate IFR due to healthy volunteer effect. Studies done on health care workers were not eligible, since they deal with a group at potentially high exposure risk which may lead to seroprevalence estimates much higher than the general population and thus implausibly low IFR. Searches were made in PubMed (LitCOVID), medRxiv, bioRxiv, and Research Square using the terms “seroprevalence” and “antibodies” as of June 7, 2020. Communication with colleagues who are field experts sought to ascertain if any major studies might have been missed.
Information was extracted from each study on location, recruitment and sampling strategy, dates of sample collection, sample size, types of antibody used (IgG, IgM, IgA), estimated crude seroprevalence (positive samples divided by all samples tested), and adjusted seroprevalence and features that were considered in the adjustment (sampling process, test performance, presence of symptoms, other).
Calculation of inferred IFR
Information on the population of the relevant location was collected from the papers. Whenever it was missing, it was derived based on recent census data trying to approximate as much as possible the relevant catchment area (e.g. region(s) or county(ies)), whenever the study did not pertain to an entire country. Some studies targeted specific age groups (e.g. excluding elderly people and/or excluding children) and some of them made inferences on number of people infected in the population based on specific age groups. For consistency, the entire population, as well as, separately, only the population with age <70 years were used for estimating the number of infected people. It was assumed that the seroprevalence would be similar in different age groups, but significant differences in seroprevalence according to age strata that had been noted by the original authors were also recorded to examine the validity of this assumption.
The number of infected people was calculated by multiplying the relevant population by the adjusted estimate of seroprevalence. Whenever an adjusted seroprevalence estimate had not been obtained, the unadjusted seroprevalence was used instead. When seroprevalence estimates with different adjustments were available, the analysis with maximal adjustment was selected. When seroprevalence studies had used sequential waves of testing over time, data from the most recent wave were used, since they would give the most updated picture of the epidemic wave.
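The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analysis: the function name `estimated_infected` and the example figures (a region of 1,000,000 people with 4% crude and 5% adjusted seroprevalence) are hypothetical.

```python
from typing import Optional


def estimated_infected(population: int,
                       crude_seroprevalence: float,
                       adjusted_seroprevalence: Optional[float] = None) -> float:
    """Estimated number of infected people in the study region.

    Seroprevalence values are fractions (0.05 means 5%). The adjusted
    estimate is preferred; the crude one is used only as a fallback.
    """
    seroprevalence = (adjusted_seroprevalence
                      if adjusted_seroprevalence is not None
                      else crude_seroprevalence)
    if not 0.0 <= seroprevalence <= 1.0:
        raise ValueError("seroprevalence must be a fraction between 0 and 1")
    return population * seroprevalence


# Hypothetical region: adjusted estimate available, then crude only.
print(estimated_infected(1_000_000, 0.04, 0.05))
print(estimated_infected(1_000_000, 0.04))
```

Choosing among several adjusted estimates (maximal adjustment) and among sequential waves (most recent) happens before this step and is not modeled here.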
For the number of COVID-19 deaths, the number of deaths recorded at the time chosen by the authors of each study was selected, whenever the authors used such a death count up to a specific date to make inferences themselves. If the authors had not chosen a date, the number of deaths accumulated until 1 week after the mid-point of the study period was chosen. This accounts for the differential delay in developing antibodies versus dying from the infection. It should be acknowledged that this is an averaging approximation, because some patients may die very soon (in <3 weeks) after infection (and thus are overcounted), and others may die very late (and thus are undercounted due to right censoring).
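As an illustration of the rule for the death count date, the sketch below returns the date chosen by the study authors when there is one, and otherwise the study mid-point plus 7 days. The function name, parameter names, and example dates are hypothetical and serve only to make the rule concrete.

```python
from datetime import date, timedelta
from typing import Optional


def death_count_cutoff(study_start: date,
                       study_end: date,
                       author_cutoff: Optional[date] = None,
                       lag_days: int = 7) -> date:
    """Date up to which COVID-19 deaths are accumulated for a study."""
    # A date chosen by the study authors takes precedence.
    if author_cutoff is not None:
        return author_cutoff
    if study_end < study_start:
        raise ValueError("study_end must not precede study_start")
    # Mid-point of the sampling period (rounded down to a whole day).
    midpoint = study_start + (study_end - study_start) // 2
    return midpoint + timedelta(days=lag_days)


# Hypothetical study that sampled from April 1 to April 15, 2020.
print(death_count_cutoff(date(2020, 4, 1), date(2020, 4, 15)))  # 2020-04-15
```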
The inferred IFR was obtained by dividing the number of deaths by the number of infected people for the entire population, and separately for people <70 years old. The proportion of COVID-19 deaths that occurred in people <70 years old was retrieved from situational reports for the respective countries, regions, or counties in searches done on June 3-7. A corrected IFR is also presented, trying to account for the fact that only one or two types of antibodies (among IgG, IgM, IgA) might have been used. Correcting seroprevalence upwards (and inferred IFR downwards) by 1.1-fold for not performing IgM measurements and similarly for not performing IgA measurements may be reasonable, based on some early evidence,7 although there is uncertainty about the exact correction factor.
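The following Python sketch puts the IFR calculations together. It is a hedged illustration under stated assumptions, not the analysis code: all function names and example numbers are hypothetical, and the code assumes that the two 1.1-fold corrections combine multiplicatively (1.21-fold) when neither IgM nor IgA was measured, which the text does not state explicitly.

```python
from typing import Iterable


def inferred_ifr(deaths: float, infected: float) -> float:
    """Inferred IFR as a fraction: deaths divided by estimated infections."""
    return deaths / infected


def antibody_correction_factor(measured: Iterable[str],
                               per_missing_factor: float = 1.1) -> float:
    """Upward seroprevalence correction for unmeasured IgM and/or IgA.

    Assumption: the factors for the two missing antibody types multiply.
    """
    measured = set(measured)
    factor = 1.0
    for antibody in ("IgM", "IgA"):
        if antibody not in measured:
            factor *= per_missing_factor
    return factor


def corrected_ifr(ifr: float, measured: Iterable[str]) -> float:
    """IFR corrected downwards by the antibody correction factor."""
    return ifr / antibody_correction_factor(measured)


def ifr_under_70(deaths: float, share_deaths_under_70: float,
                 population_under_70: float, seroprevalence: float) -> float:
    """IFR in people <70, assuming seroprevalence is similar across ages."""
    return (deaths * share_deaths_under_70) / (population_under_70 * seroprevalence)


# Hypothetical region: 130 deaths, 50,000 estimated infections, IgG only.
ifr = inferred_ifr(130, 50_000)
print(f"inferred IFR:  {ifr:.2%}")
print(f"corrected IFR: {corrected_ifr(ifr, ['IgG']):.2%}")
print(f"IFR <70 years: {ifr_under_70(130, 0.30, 900_000, 0.05):.3%}")
```

Because the correction only rescales the denominator, it can be applied either to the seroprevalence or to the IFR with the same result.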
RESULTS
Seroprevalence studies
23 studies with a sample size of at least 500 have been published either in the peer-reviewed literature or as preprints as of June 7, 2020.8-30 Dates and processes of sampling and recruitment are summarized in Table 1; sample sizes, antibody types assessed and regional population appear in Table 2; estimated prevalence and number of people infected in the study region are summarized in Table 3; and number of COVID-19 deaths and inferred IFR estimates are found in Table 4. Four studies (Geneva,10 Rio Grande do Sul,17 Zurich,28 and Milan29) performed repeated seroprevalence surveys at different time points, and only the most recent one is shown in these tables.
Of the 23 studies, only 6 found some modest differences in seroprevalence rates across some age groups (Oise: decreased seroprevalence in age 0-14, increased in age 15-17; Geneva: decreased seroprevalence in age >50; Netherlands: increased seroprevalence in age 18-30; New York state: decreased seroprevalence in age >55; Brooklyn: decreased seroprevalence in age 0-5, increased in age 16-20; Tokyo: increased seroprevalence in age 18-34). The patterns are not strong enough to suggest major differences in extrapolating across age groups, although higher values in adolescents and young adults and lower values in elderly individuals cannot be excluded.
As shown in Table 1, these studies varied substantially in sampling and recruitment designs. The main issue is whether they can offer a representative picture of the population in the region where they are performed. A generic problem is that vulnerable people who are at high risk of infection and/or death may be more difficult to recruit in survey-type studies. COVID-19 infection seems to be particularly widespread and/or lethal in nursing homes, among homeless people, in prisons, and in disadvantaged minorities. Most of these populations are very difficult, or even impossible, to reach and sample from and they are probably under-represented to various degrees (or even entirely missed) in surveys. This would result in an underestimation of seroprevalence and thus overestimation of IFR. Seven studies (Iran,8 Geneva,10 Gangelt,16 Rio Grande do Sul,17 Luxembourg,20 Los Angeles county,22 Brazil [133 cities]25) explicitly aimed for random sampling from the general population. In principle, this is a stronger design. However, even with such designs, people who cannot be reached (e.g. by e-mail or phone or even visiting them at a house location) will not be recruited, and these vulnerable populations are likely to be missed.
Six of the 23 studies assessed blood donors in Denmark,12 Netherlands,15 Scotland,18 the Bay Area in California,24 Zurich/Lucerne,28 and Milan.29 By definition these studies include people in good health and without symptoms, at least recently, and therefore may markedly underestimate COVID-19 seroprevalence in the general population. A small set of 200 blood donors in Oise, France13 showed 3% seroprevalence, while pupils, siblings, parents, teachers and staff at a high school with a cluster of cases in the same area had 25.9% seroprevalence.
For the other studies, healthy volunteer bias may lead to underestimating seroprevalence and this is likely to have been the case in at least one study (the Santa Clara study)19 where wealthy healthy people were quick to volunteer when the recruiting Facebook ad was released. The design of the study anticipated correction with adjustment of the sampling weights by zip code, gender, and ethnicity, but it is likely that healthy volunteer bias may still have led to some underestimation of seroprevalence. Conversely, attracting individuals who might have been concerned about having been infected (e.g. because they had symptoms) may lead to overestimation of seroprevalence in surveys. Finally, studies of employees, grocery store clients, or patient cohorts (e.g. hospitalized for other reasons, or coming to the emergency room) may have sampling bias with unpredictable direction.
As shown in Table 2, all studies have tested for IgG antibodies, but only 9 have also assessed IgM and 3 have assessed IgA. Only one study assessed all three types of antibodies. All studies considered the results to be “positive” if any tested antibody type was positive, with the exception of one study (Luxembourg) that considered the results to be “positive” only if both IgG and IgA were detected. The ratio of people sampled versus the total population of the region was better than 1:1000 in only 6 studies (Idaho,9 Denmark blood donors,12 Santa Clara,19 Luxembourg20, Brooklyn27, Zurich28), which means that the estimates can have substantial uncertainty.
Seroprevalence estimates
As shown in Table 3, prevalence ranged from as little as 0.1% to as high as 47%. Studies varied a lot on whether they tried or not to adjust their estimates for test performance, sampling (striving to get closer to a more representative sample), and clustering effects (e.g. when including same household members) as well as other factors. The adjusted seroprevalence occasionally differed substantially from the crude, unadjusted value. In principle adjusted values are likely to be closer to the true estimate, but the exercise shows that each study alone may have some unavoidable uncertainty and fluctuation, depending on the analytical choices preferred. In studies that sampled people from multiple locations, large between-location heterogeneity could be seen (e.g. 0-25% across 133 Brazilian cities)25.
Inferred IFR
Inferred IFR estimates ranged from 0.02% to 0.86% (median 0.26%) and corrected values ranged from 0.02% to 0.78% (median 0.25%). Corrected values exceeding 0.4% were inferred for Netherlands, Milan, Luxembourg, and in one study in New York state. The first two were extrapolated from blood donors data, therefore the IFR may be overestimated, while for New York, another study found a much higher seroprevalence in Brooklyn and thus a lower inferred IFR. Conversely, very low or low IFR (corrected, 0.02-0.07%) was seen in two studies in Japan (Kobe and Tokyo), one in Iran, and one in France.
The proportion of COVID-19 deaths that occurred in people <70 years old varied substantially across locations. All deaths in Gangelt were in elderly people while in Wuhan half the deaths occurred in people <70 years old and the proportion might have been higher in Iran, but no data could be retrieved for this country. When limited to people <70 years old, IFR ranged from 0.00% to 0.26% with median of 0.05% (corrected, 0.00-0.23% with median of 0.04%). None of the IFR estimates in people <70 years old exceeded 0.1%, with the exception of New York, Wuhan and Milan.
DISCUSSION
Inferred IFR values based on emerging seroprevalence studies typically show a much lower fatality than initially speculated in the earlier days of the pandemic. It should be appreciated that IFR is not a fixed physical constant and it can vary substantially across locations, depending on the population structure, the case-mix of infected and deceased individuals and other, local factors.
The 23 studies analyzed here are the first to be made available in full papers and they are not fully representative of all countries and locations around the world. Most of them come from locations with overall COVID-19 mortality rates exceeding the global average (60 per million people as of June 7). The median inferred IFR in the locations with a COVID-19 mortality rate below the global average (two studies in Japan, Idaho, Croatia and Wuhan) is 0.15% (corrected, 0.13%), while it is substantially higher for studies done in epicenters with high population mortality rate.
Several studies in hard-hit European countries inferred modestly high IFR estimates for the overall population, but the IFR was still low in people <70 years old. Some of these studies were on blood donors and may have underestimated seroprevalence and overestimated IFR. One study in Germany aimed to test the entire population of a city and thus selection bias is minimal: Gangelt16 represents a situation with a superspreader event (in a local carnival) and 7 deaths were recorded, all of them in very elderly individuals (average age 81, sd 3.5). COVID-19 has a very steep age gradient of death risk.31 It is expected therefore that in locations where the infection kills predominantly elderly citizens, the overall, age-unadjusted IFR would be higher. However, IFR would still be very low in people <70 in these locations, e.g. in Gangelt IFR is 0.00% in non-elderly people. Similarly, in Switzerland, 69% of deaths occurred in people >80 years old31 and this explains the higher age-unadjusted IFR in Geneva and Zurich. Similar to Germany, very few deaths in Switzerland have been recorded in non-elderly people, e.g. only 2.5% have occurred in people <60 years old and IFR in that age-group would be ~0.01%. The majority of deaths in most of the hard-hit European countries have happened in nursing homes32 and a large proportion of deaths in the US33 also follow this pattern. Moreover, many nursing home deaths have no laboratory confirmation and thus should be seen with extra caution in terms of the causal impact of SARS-CoV-2.
Locations with high burdens of nursing home deaths may have high IFR estimates, but the IFR would still be very low among non-elderly, non-debilitated people. The average length of stay in a nursing home is slightly more than 2 years and people who die in nursing homes die in a median of 5 months34 so many COVID-19 nursing home deaths may have happened in people with life expectancy of only a few months. This needs to be verified in careful assessments of COVID-19 outbreaks in nursing homes with detailed risk profiling of fatalities. If COVID-19 happened in patients with very limited life expectancy, this pattern may even create a dent of less than expected mortality in the next 3-6 months after the coronavirus excess mortality wave. As of June 7 (week 22), preliminary Euromonitor data35 indeed already show a substantial dent below baseline mortality in France, and a less prominent dent below baseline mortality in Italy and several other European countries.
The estimated IFR of 0.31% in Wuhan may reflect the wide spread of the infection to hospital personnel and the substantial contribution of nosocomial infections to a higher death toll,36 plus unfamiliarity with how to deal with the infection in the first location where COVID-19 arose. Massive deaths of elderly individuals in nursing homes, nosocomial infections, and overwhelmed hospitals may also explain the very high fatality in specific locations in Northern Italy37 and New York. The highest IFR among these 23 studies was seen in a study in Milan.29 Although the estimate may be inflated in that blood donor study, probably IFR was truly very high in Milan. That same study estimated that 2.7% of the Milan population had already been infected by the time the outbreak was first recognized. Another study of seroprevalence in health care workers and administrative hospital staff in Lombardy38 found 8% seroprevalence in Milan hospitals and 35-43% in Bergamo hospitals, supporting the scenario for widespread nosocomial infections among vulnerable patients. The high IFR values in New York are also not surprising, given the vast death toll witnessed. A very unfortunate decision of the governors in New York and New Jersey was to have COVID-19 patients sent to nursing homes. Moreover, some hospitals in New York City hotspots reached maximum capacity and perhaps could not offer optimal care. Use of unnecessarily aggressive management (e.g. mechanical ventilation) and hydroxychloroquine may also have contributed to worse outcomes. Furthermore, New York City has an extremely busy, congested public transport system that may have exposed large segments of the population to high infectious load in close contact transmission and, thus, perhaps more severe disease. A more aggressive viral clade has also been speculated, but this needs further verification.39 Of note, two seroprevalence studies in New York City23,27 give substantially different results. This discrepancy demonstrates the need to conduct different studies with different sampling strategies offering complementary results.
It may not be surprising that IFR may reach very high levels among disadvantaged populations and settings that have the worst combination of factors predisposing to higher fatalities. One may predict also very high IFRs in other select locations with atypically high death toll, e.g. Bergamo or Brescia in Italy,37 and several locations in Belgium, England, or Spain. Preliminary press released data from Belgium (6.9% seroprevalence),40 the United Kingdom (6.8% seroprevalence),41 and from the ongoing Spanish national study (5.2% seroprevalence)42 agree with this expectation. With 5.2% IgG seroprevalence in Spain, crude IFR is 1.11% (corrected to 0.70% after considering sensitivity of 79% and antibody type), but with wide variability across locations, e.g. corrected IFR is 0.73% in Madrid versus 0.17% in Canary Islands. Importantly, hotspot locations are rather uncommon exceptions in the global landscape. Moreover, even in these locations, the IFR for non-elderly individuals without predisposing conditions may remain very low. E.g. in New York City only 0.6% of all deaths happened in people <65 years without major underlying conditions.43 Thus the IFR even in New York City would probably be lower than 0.01% in these people.
Studies with extremely low inferred IFR, Kobe, Tokyo and Oise, are also worthwhile discussing. For Kobe, the authors of the study11 raise the question whether COVID-19 deaths have been undercounted in Japan. Both undercounting and overcounting of COVID-19 deaths is likely to be a caveat in different locations and this is difficult to settle in the absence of very careful scrutiny of medical records and autopsies. The Tokyo data30, nevertheless, also show similarly very low IFR. Moreover, evaluation of all-cause mortality in Japan has shown no excess deaths during the pandemic, consistent with the possibility that somehow the Japanese population was spared. Former immunity from exposure to other coronaviruses, genetic differences, and other unknown factors may be speculated. IFR seems to be very low also in some other Asian countries where, in contrast to Japan, extensive PCR testing was carried out. For example, as of June 7, 2020, in Singapore there were only 25 deaths among 37,910 cases, suggesting an upper bound of 0.07% for IFR, even if no cases had been missed. A similar picture is seen in Qatar. The Wuhan study IFR of 0.31% seems a high-value outlier for Asia. In fact, a smaller study of asymptomatic Hubei returnees44 (n=452, not included in the calculations given sample size <500) showed 4% seroprevalence and would translate to much lower IFR.
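The Singapore bound quoted above follows from simple arithmetic: the number of infections cannot be smaller than the number of documented cases, so, provided deaths are fully counted, the case fatality rate is an upper bound for the IFR. A minimal Python check, using only the two figures given in the text (variable names are illustrative):

```python
# Figures for Singapore as of June 7, 2020, taken from the text.
deaths = 25
documented_cases = 37_910

# Infections >= documented cases, hence IFR <= deaths / documented cases.
ifr_upper_bound = deaths / documented_cases
print(f"{ifr_upper_bound:.2%}")  # 0.07%
```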
For the Oise sample,13 it is possible that it may not be representative of the general population. As discussed above, there is a large difference in the estimated seroprevalence between the high school-based sampling and a small dataset of blood donors from the same area, and the true seroprevalence value may be somewhere between these two extremes that may be biased in opposite directions.
Some seroprevalence studies have also been designed to assess seroprevalence repeatedly spacing out measurements in the same population over time. Preliminary data from Brazil17 are still early to judge for meaningful increases, but the data from Geneva suggest that seroprevalence increased more than 3-fold over three weeks.10 An increase was seen also in Zurich28 and in Milan.29 Notably, the increase corresponds to continued infections during a period where strict social distancing and other lockdown measures were implemented. Data from Finland,45 with repeated measurements over several weeks (available at the Finnish Institute website, but not submitted as full paper yet) conversely show fairly steady seroprevalence in a country that maintained a much lower overall death burden. Preliminary data from the ongoing national Spain serostudy42 also show no major increase in seroprevalence after relaxation of lockdown measures. Serial seroprevalence measurements may offer some evidence on whether different measures were associated with curbed transmission or not, and how these might translate to different IFR values. Any causal inferences need to be extremely cautious. However, it is expected that measures that manage to avoid transmission of the virus to vulnerable high-risk populations may lead to lower values of IFR. Measure packages that do not protect these high-risk populations may lead to higher values of IFR. Serial seroprevalence measurements would also provide evidence on how quickly antibody titers decrease below detection. If decrease is fast, as suggested by some preliminary data,46 numbers of infected people may be underestimated and IFR overestimated.
The only data from a low-income country among the 23 studies examined here come from Iran8 and the IFR estimate appears to be the same or lower than the IFR of seasonal influenza. Iran has a young population with only slightly over 1% of the age pyramid at age >80. The same applies to almost every less developed country around the world. Given the very sharp age gradient and the sparing of children and young adults from death by COVID-19, one may expect COVID-19 IFR to be fairly low in the less developed countries. However, it remains to be seen whether comorbidities, poverty and frailty (e.g. malnutrition) may have adverse impact on risk and thus increase IFR also in these countries.
One should caution that the extent of validation of the antibody assays against positive and negative controls differs across studies. Specificity has typically exceeded 99.0%, which is reassuring. However, for very low prevalence rates, even 99% specificity may be problematic. The study with the lowest estimated prevalence (Brazil)17 has nevertheless evaluated also family members of the people who tested positive and found several family members were also infected, thus suggesting that most of the positive readings are true rather than false positives. Sensitivity also varies from 60-100% in different validation exercises and for different tests, but typically it is closer to the upper than the lower bound. One caveat about sensitivity is that typically the positive controls are patients who had symptoms and thus were tested and found to be PCR-positive. However, it is possible that symptomatic patients may be more likely to develop antibodies than patients who are asymptomatic or have minimal symptoms and thus had not sought PCR testing.47-49 Since the seroprevalence studies specifically try to unearth these asymptomatic/mildly symptomatic missed infections, a lower sensitivity for these mild infections could translate to substantial underestimates of the number of infected people and substantial overestimate of the inferred IFR.
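To make the specificity concern concrete, the sketch below computes the share of positive readings that are true positives (the positive predictive value) with the standard Bayes formula. This is a generic illustration rather than a calculation from any of the studies: the function name is hypothetical, and the 90% sensitivity and the 20% comparison prevalence are assumed values chosen for the example.

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """Fraction of positive test results that are true positives."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Assumed test: 90% sensitivity, 99% specificity.
low = positive_predictive_value(prevalence=0.001, sensitivity=0.90, specificity=0.99)
high = positive_predictive_value(prevalence=0.20, sensitivity=0.90, specificity=0.99)
print(f"true positives among positives at 0.1% prevalence: {low:.1%}")
print(f"true positives among positives at 20% prevalence:  {high:.1%}")
```

Under these assumed values, most positive readings at 0.1% true prevalence would be false positives, whereas at 20% prevalence nearly all would be true, which is why independent confirmation such as the testing of family members matters in very low prevalence settings.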
The corrected IFR estimates are trying to account for undercounting of infected people when not all 3 antibodies (IgG, IgM, and IgA) are assessed.7 However, the magnitude of the correction is uncertain and may also vary in different circumstances. Moreover, it is possible that an unknown proportion of people may have handled the virus using immune mechanisms (mucosal, innate, cellular) that did not generate any serum antibodies.50,51 This would lead to an unknown magnitude of underestimation of the frequency of the infection and a respective overestimation of the IFR. At least one study has found indeed that mild SARS-CoV-2 infections may lead to nasal release of IgA, without serum antibody response.52
An interesting observation is that even under congested circumstances, like cruise ships, aircraft carriers or homeless shelter, the proportion of people infected does not get to exceed 20-45%.53,54 Similarly, at a wider population level, values ~47% are the maximum values documented to-date and most values are much lower, yet epidemic waves seem to wane. It has been suggested55,56 that differences in host susceptibility and behavior can result in herd immunity at much lower prevalence of infection in the population than originally expected. COVID-19 spreads by infecting certain groups more than others because some people have much higher likelihood of exposure. People most likely to be exposed also tend to be those most likely to spread for the same reasons that put them at high exposure risk. In the absence of random mixing of people, the epidemic wave may be extinguished even with relatively low proportions of people becoming infected. Seasonality may also play a role in the dissipation of the epidemic wave. It has also been observed that about 50% of people have CD4 cellular responses to SARS-CoV-2 even without being exposed to this virus and this may be due to prior exposure to other coronaviruses.57 It is unknown whether this proportion varies in different populations around the world and whether this immunity may contribute to SARS-CoV-2 epidemic waves waning without infecting a large share of the population.
A major limitation of the current analysis is that the calculations presented in this paper include several preprints that have not yet been fully peer-reviewed. Moreover, there is a substantially larger number of studies that have made press releases about their results and probably several more will become available in the near future. Those that include or allow calculation of IFR estimates in their press releases or preliminary agency reports seem to have values that are similar to those of the 23 studies analyzed here, and most estimates are quite low (e.g. 0.16% in Slovenia, 0.23% in Stockholm, 0.14% in Israel, 0.00% in San Miguel county in Colorado). However, some estimates are high (e.g. Spain)42. Obviously these preliminary results require extreme caution. The plan is to try to update this analysis with new emerging data. More clean, vetted data may make the overall picture more crisp and allow having more granularity on the determinants that lead to higher or lower IFR in different locations.
A comparison of COVID-19 to influenza is often attempted, but many are confused by this comparison unless placed in context. Based on the IFR estimates obtained here, COVID-19 may have infected as of June 7 approximately 200 million people (or more), far more than the ~7 million PCR-documented cases. The global COVID-19 death toll is still evolving, but it is still similar to a typical death toll from seasonal influenza (290,000-650,000),58 while “bad” influenza years (e.g. 1957-9 and 1968-70) have been associated with 1-4 million deaths.59 Notably, influenza devastates low-income countries, but is more tolerant of wealthy nations, probably because of the availability and wider use of vaccination in these countries.58 Conversely, in the absence of vaccine and with a clear preference for elderly debilitated individuals, COVID-19 may have an inverse death toll profile, with more deaths in wealthy nations than in low-income countries. However, even in the wealthy nations, COVID-19 seems to affect predominantly the frail, the disadvantaged, and the marginalized – as shown by high rates of infectious burden in nursing homes, homeless shelters, prisons, meat processing plants, and the strong racial/ethnic inequalities against minorities in terms of the cumulative death risk.60,61
While COVID-19 is a formidable threat, the fact that its IFR is typically much lower than originally feared is a welcome piece of evidence. The median of 0.26% found in this analysis is very similar to the estimate recently adopted by CDC for planning purposes.62 The fact that IFR can vary substantially based on case-mix and settings involved also creates additional ground for evidence-based, more precise management strategies. Decision-makers can use measures that will try to avert having the virus infect people and settings that are at high risk of severe outcomes. These measures may be far more precise and tailored to specific high-risk individuals and settings than blind lockdown of the entire society. Of course, uncertainty remains about the future evolution of the pandemic, e.g. the presence and height of a second wave.63 However, it is helpful to know that SARS-CoV-2 has relatively modest IFR overall and that possibly IFR can be made even lower with appropriate, precise non-pharmacological choices.
Data Availability
All data are included in the manuscript
Footnotes
Funding: METRICS has been supported by a grant from the Laura and John Arnold Foundation
Disclosures: I am a co-author (not principal investigator) of one of the 23 seroprevalence studies.