Abstract
Background The total number of COVID-19 infections is critical information for decision makers when assessing the progress of the pandemic, its implications, and policy options. Despite efforts to carefully monitor the COVID-19 pandemic, the reported number of confirmed cases is likely to underestimate the actual number of infections. We aim to estimate the total number of COVID-19 infections in a straightforward manner using a demographic scaling approach based on life tables.
Methods We use data on the total number of COVID-19 attributable deaths, population counts, and life tables, as well as information on infection fatality rates as reported in Verity et al. (2020) for Hubei, China. We develop a scaling approach based on life tables and remaining life expectancy to map infection fatality rates between two countries, accounting for differences in their age structure, health status, and health care systems. The scaled infection fatality rates can be combined with COVID-19 attributable deaths to estimate the total number of infected. We also introduce easy-to-apply formulas to quantify the bias in death counts and infection fatality rates that would be required to reproduce a given estimate of infections.
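The scaling idea described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' R implementation: the function names, the use of linear interpolation, and all numeric inputs are assumptions chosen for illustration. For each age in the target country, the sketch looks up the reference-country age with the same remaining life expectancy, borrows the reference infection fatality rate (IFR) at that age, and then divides age-specific deaths by the scaled IFRs.

```python
def interpolate(x, xs, ys):
    """Linear interpolation of y at x; xs must be increasing. Clamps outside the range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            w = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + w * (ys[i] - ys[i - 1])


def scale_ifr(ages, ifr_ref, ex_ref, ex_target):
    """Map reference IFRs onto a target country by matching remaining life expectancy."""
    # Remaining life expectancy falls with age, so reverse both lists
    # to obtain increasing x-values for the interpolation.
    ex_ref_increasing = ex_ref[::-1]
    ages_by_ex = ages[::-1]
    scaled = []
    for ex in ex_target:
        # Reference-country age with the same remaining life expectancy.
        matched_age = interpolate(ex, ex_ref_increasing, ages_by_ex)
        # Reference IFR at that matched age.
        scaled.append(interpolate(matched_age, ages, ifr_ref))
    return scaled


def estimate_infections(deaths_by_age, ifr_by_age):
    """Total infections implied by age-specific deaths and IFRs."""
    return sum(d / f for d, f in zip(deaths_by_age, ifr_by_age))


# Hypothetical inputs for illustration only (not data from the study).
ages = [5, 15, 25, 35, 45, 55, 65, 75, 85]            # midpoints of age groups
ifr_ref = [0.00002, 0.00007, 0.0003, 0.0008, 0.0016,
           0.006, 0.019, 0.043, 0.078]                # reference-country IFRs
ex_ref = [72, 62, 52, 43, 33, 24, 16, 10, 5]          # reference life expectancy
ex_target = [78, 68, 58, 48, 39, 30, 21, 13, 7]       # target life expectancy
deaths = [0, 0, 5, 20, 80, 300, 1000, 2500, 4000]     # target-country deaths

ifr_target = scale_ifr(ages, ifr_ref, ex_ref, ex_target)
infections = estimate_infections(deaths, ifr_target)
print(f"Estimated infections: {infections:,.0f}")
```

Because the hypothetical target country has higher remaining life expectancy at every age, each target age is matched to a younger reference age, which lowers the scaled IFRs and raises the implied number of infections for the same death count.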
Findings Across the 10 countries with the most COVID-19 deaths as of April 17, 2020, our estimates suggest that the total number of infected is approximately 4 times the number of confirmed cases. The uncertainty, however, is high: the lower bound of the 95% prediction interval suggests on average twice as many infections as confirmed cases, and the upper bound 10 times as many. Country-specific variation is also high. For Italy, our estimates suggest that the total number of infected is approximately 1 million, or almost 6 times the number of confirmed cases. For the U.S., our estimate of 1.4 million is almost twice the number of confirmed cases, and the upper bound of 3 million is more than 4 times the number of confirmed cases. For Germany, where testing has been comparatively extensive, we estimate that the total number of infected is only 1.2 times (upper bound: 3 times) the number of confirmed cases. Comparing our results with findings from local seroprevalence studies and applying our bias formulas shows that some of their infection estimates would only be possible if just a small fraction of COVID-19 related deaths were recorded, indicating that these seroprevalence estimates might not be representative of the total population.
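The logic of the bias check can be sketched with simple aggregate arithmetic. This is a simplified illustration, not the authors' formulas, which may be age-specific; the function names and all numbers below are hypothetical. If infections are estimated as deaths divided by the IFR, then reproducing a larger target number of infections requires either that only a fraction of deaths was recorded or that the true IFR is lower than assumed.

```python
def recorded_death_fraction(recorded_deaths, target_infections, ifr):
    """Share of true deaths that must have been recorded if target_infections
    were correct and the assumed IFR were accurate."""
    implied_true_deaths = target_infections * ifr
    return recorded_deaths / implied_true_deaths


def implied_ifr(recorded_deaths, target_infections):
    """IFR required to reproduce target_infections if death counts are complete."""
    return recorded_deaths / target_infections


# Hypothetical example: 1,000 recorded deaths, an assumed IFR of 1%,
# and a seroprevalence-based target of 1,000,000 infections.
fraction = recorded_death_fraction(1_000, 1_000_000, 0.01)
ifr_needed = implied_ifr(1_000, 1_000_000)
print(f"Deaths recorded: {fraction:.0%} of true deaths; required IFR: {ifr_needed:.2%}")
```

A small recorded fraction, or a required IFR far below plausible values, signals that the target estimate is hard to reconcile with the observed deaths.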
Interpretation As many countries lack population-based seroprevalence studies, straightforward demographic adjustment can be used to deliver useful estimates of the total number of infected. Our results imply that the total number of COVID-19 cases may be approximately 4 times (95% prediction interval: 2 to 10 times) the number of confirmed cases. Although these estimates are uncertain and vary across countries, they indicate that the COVID-19 pandemic is much more widespread than confirmed cases suggest, and that the number of asymptomatic cases or cases with mild symptoms may be high. Where estimates from local seroprevalence studies or from simulation models exist, our approach can provide a simple benchmark to assess the quality of those estimates.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
CBE, CD, and MM received no external funding to conduct this research.
Author Declarations
All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript.
Yes
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
This research is reproducible. The R source code and information on the data are available at https://github.com/christina-bohk-ewald/demographic-scaling-model.