Abstract
Since the identification of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), in China in December 2019, there have been more than 17 million cases of the disease in 216 countries worldwide. Comparisons of prevalence estimates between different communities can inform policy decisions regarding safe travel between countries, help to assess when to implement (or remove) disease control measures, and identify the risk of over-burdening healthcare providers. Estimating the true prevalence can, however, be challenging because officially reported figures are likely to be significant underestimates of the true burden of COVID-19 within a community. Previous methods for estimating prevalence fail to incorporate differences between populations (such as younger populations having higher rates of asymptomatic cases), so comparisons between, for example, countries can be misleading. Here, we present an improved methodology for estimating COVID-19 prevalence. We take the reported numbers of cases and deaths (together with population size) to give a raw prevalence for the population. We then apply an age adjustment, which allows the age distribution of that population to influence the case-fatality rate and the proportion of asymptomatic cases. Finally, we calculate the likely underreporting factor for the population and use this to adjust our prevalence estimate further. We use our method to estimate the prevalence for 166 countries (or states of the United States of America, hereafter referred to as US states) where sufficient data were available. Our estimates show that, as of 30th July 2020, the three countries with the highest estimated prevalence are Brazil (1.26%, 95% CI: 0.96 – 1.37), Kyrgyzstan (1.10%, 95% CI: 0.82 – 1.19) and Suriname (0.58%, 95% CI: 0.44 – 0.63).
Brazil is predicted to have the largest proportion of all current global cases (30.41%, 95% CI: 27.52 – 30.84), followed by the USA (14.52%, 95% CI: 14.26 – 16.34) and India (11.23%, 95% CI: 11.11 – 11.24). Amongst the US states, the highest prevalence is predicted to be in Louisiana (1.07%, 95% CI: 1.02 – 1.12), Florida (0.90%, 95% CI: 0.86 – 0.94) and Mississippi (0.77%, 95% CI: 0.74 – 0.81), whereas amongst European countries, the highest prevalence is predicted to be in Montenegro (0.47%, 95% CI: 0.42 – 0.50), Kosovo (0.35%, 95% CI: 0.29 – 0.37) and Moldova (0.28%, 95% CI: 0.23 – 0.30). Our results suggest that, among the 25 countries with the highest prevalence, Kyrgyzstan (0.04 tests per predicted case), Brazil (0.04 tests per predicted case) and Suriname (0.29 tests per predicted case) have the highest underreporting. In comparison, Israel (34.19 tests per predicted case), Bahrain (19.82 tests per predicted case) and Palestine (9.81 tests per predicted case) have the least underreporting. The results of this study may be used to understand how risk differs between geographical areas and to highlight regions where the prevalence of COVID-19 is increasing most rapidly. The method described is quick and easy to implement. Prevalence estimates should be updated on a regular basis to allow for rapid fluctuations in disease patterns.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This project was funded by Public Health England.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Not applicable
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
Data on the numbers of cases, deaths and tests for countries and territories were collated and provided by Public Health England (PHE) on the 30th of July 2020, based on either country-specific public updates or data collated by the World Health Organization. Population estimates were extracted from the Our World in Data website. Data for US states were also provided by PHE, collected from the Covidtracking.com website.
https://www.who.int/emergencies/diseases/novel-coronavirus-2019