Abstract
Background Globally, the routinely used case-based reporting and IgG serosurveys underestimate the actual prevalence of COVID-19. Simultaneous estimation of IgG antibodies and active SARS-CoV-2 markers can provide a more accurate estimation.
Methods A cross-sectional survey of 16416 people covering all risk groups was done between 3-16 September 2020 using the state of Karnataka’s infrastructure of 290 hospitals across all 30 districts. All participants were subjected to simultaneous detection of SARS-CoV-2 IgG using a commercial ELISA kit, SARS-CoV-2 antigen using a rapid antigen detection test (RAT), and reverse transcription-polymerase chain reaction (RT-PCR) for RNA detection. Maximum-likelihood estimation was used for joint estimation of the adjusted IgG, active, and total prevalence, while multinomial regression identified predictors.
Findings The overall adjusted prevalence of COVID-19 in Karnataka was 27 ·3% (95% CI: 25 ·7-28 ·9), including IgG 16 ·4% (95% CI: 15 ·1 - 17 ·7) and active infection 12 ·7% (95% CI: 11 ·5-13 ·9). The case-to-infection ratio was 1:40, and the infection fatality rate was 0 ·05%. Influenza-like symptoms or contact with a COVID-19 positive patient are good predictors of active infection. The RAT kits had higher sensitivity (68%) in symptomatic participants compared to 47% in asymptomatic.
Interpretation This is the first comprehensive survey providing accurate estimates of the COVID-19 burden anywhere in the world. Further, our findings provide a reasonable approximation of population immunity threshold levels. Using the RAT kits and following the syndromic approach can be useful in screening and monitoring COVID-19. Leveraging existing surveillance platforms, coupled with appropriate methods and sampling framework, renders our model replicable in other settings.
INTRODUCTION
The global pandemic of SARS-CoV-2 causing coronavirus disease (COVID-19) has raged across the world within a few months. India has the second-highest burden of COVID-19 with 8 ·8 million infected and 130070 deaths, as of 16 November 2020.1 Currently, India has only case-based reporting as the prime strategy through all the epidemic phases. Case-based reporting has the advantages of rationalizing testing, isolating cases, and tracing and quarantining contacts.2 However, it does not provide an estimate of the true burden of the disease, as it picks up mostly sicker people seeking care or those who have better access to health care. Hence, the reported case counts of COVID-19 grossly underestimate the true prevalence of the pandemic. The two rounds of national seroprevalence surveys conducted by the Indian Council of Medical Research (ICMR) indicated that for every reported case, 81-130 infections were missed in the initial survey conducted in May 2020,3 which improved to missing nearly 26–32 infections per reported case by August 2020. This may be underestimated as it captured only IgG prevalence, and the sampling was not representative. Serological surveys, such as those conducted by the ICMR, can help understand the burden of past infections. However, detecting new cases is challenging since 45% of the infected people have mild or no symptoms.4 Furthermore, the estimation of active infections is affected by poor in-person testing due to inaccessibility, stigma, and supply-side inadequacies.
Effective public health measures require understanding the existing burden of disease reliably through epidemiological investigations. Joint estimation of IgG prevalence and active SARS-CoV-2 infections can help detect, manage, and control the disease outbreak. Seroprevalence estimates from the world show varying numbers ranging from 0 ·07% in-hospital patients to 54 ·1% in slum inhabitants.5-13 Although not representative, the ICMR survey results reported 0 ·73% prevalence across the country (May-June 2020), which increased to 7% by the end of September.3 The surveys in slums and non-slums of Mumbai5 showed considerable variation, 54 ·1% (95% CI: 52 ·7 to 55 ·6) and 16 ·1% (95% CI: 14 ·9 to 17 ·4) prevalence, respectively. Serosurveys in a healthcare setting of North India showed prevalence increasing from 2 ·3% in April to 50 ·6% in July.14 However, there are concerns about using only IgG prevalence as a marker of population immunity threshold. These include the inability to detect the IgG antibodies over time, varying sampling methods, the unreliable nature of the predictive value of positive antibody tests with varying sensitivity and specificity of different tests affecting the tests’ reliability, and the presence of other types of the immune response. 15-17 This can be resolved to a great extent by understanding the burden of active infections concurrently along with IgG estimation.
Karnataka has an estimated population of 70 ·7 million spread over 191791 square kilometers. The first confirmed COVID-19 case was reported on 09 March 2020. As of 16 November 2020, there were 861,647 cumulative cases, 27,146 active cases, and 11,529 deaths.1 Our goal was to estimate the IgG prevalence and active prevalence of SARS-CoV-2 infections in Karnataka jointly and assess the variation across geographical regions and risk groups. Here we report the results of what is perhaps the first comprehensive statewide survey in India. The design and analysis methodologies of this survey can serve as the blueprint for other similar surveys.
METHODS
Study setting, design, and sample size
Setting
This was the first round of the proposed serial cross-sectional surveys across the districts of Karnataka. This state has 30 administrative districts. The capital district Bengaluru has approximately 13 ·6 million residents. The study was conducted during 03-16 September 2020.
Design
Each district was a unit of the survey except Bengaluru, which was subdivided into 9 units. From the resulting geographically representative 38 units, health facilities with the expertise to conduct the survey were selected (Figure 1 and Appendix D). The participants included only adults aged 18 years and above. The survey excluded those already diagnosed with SARS-CoV-2 Infection, those unwilling to provide a sample for the test or those who did not agree to provide informed consent. The population was stratified into three risk groups based on community exposure and vulnerability to COVID-19. The Low-risk group comprised pregnant women presenting for antenatal check-up (ANC) clinic, and persons attending the outpatient department for common ailments in the hospitals and their attendees. The Moderate-risk group comprised persons with high contact in the community, bus conductors and autorickshaw drivers, vendors at vegetable markets, healthcare workers, individuals in containment zones, persons in congregate settings (markets, malls, retail stores, bus stops, railway stations), and waste collectors.18 The High-risk group comprised the elderly (60 years of age and above) and persons with co-morbid conditions (chronic liver disease, chronic lung disease, chronic renal disease, diabetes, heart disease, hypertension, immunocompromised condition, malignancy).
Sample size
Assuming 10% prevalence, the minimum sample size of 432 per cluster, for a target 95% confidence level, a margin of error 0 ·05, and design effect 3, led to a total sample size of 16416 across the 38 units.
Sample collection and laboratory testing
From participants in the low-risk group (Figure 2), we collected both nasopharyngeal and oropharyngeal swab samples for the RT-PCR test following the ICMR protocol and 4 ml of venous blood for the IgG antibody test. In the moderate and high-risk groups, we collected two swab samples in different media for the antigen and RT-PCR tests and 4 ml of venous blood.
The rapid antigen detection test (RAT) was done using the Antigen Standard Q COVID-19 Ag detection kit, a rapid chromatographic immunoassay for the qualitative detection of antigens specific to SARS-CoV-2. The RT-PCR test was done on all low-risk participants and on those who tested negative on the RAT (Figure 3) through the current ICMR-approved testing network. For antibody testing, the collected venous blood sample was left undisturbed at room temperature for 30 minutes for clotting, then centrifuged at 3000 rpm, and the serum was transported to the laboratory by maintaining a cold chain. SARS-CoV-2-specific IgG antibodies were detected using a commercially available, validated, and ICMR-approved kit (Covid Kavach Anti SARS-CoV-2 IgG antibody detection ELISA, Zydus Cadila, India)19. The test was performed as per the manufacturer’s instructions. The results were interpreted as positive or negative for SARS-CoV-2 IgG antibodies based on the cut-off value of optical densities obtained with positive and negative samples provided in the kit.
Data collection
After obtaining written informed consent, information on basic demographic details, exposure history to laboratory-confirmed COVID-19 cases, symptoms suggestive of COVID-19 in the preceding one month, and clinical history were recorded on a web-based application designed specifically for the study and were linked to the samples using the ICMR Specimen Referral Forms for COVID-19. The category, symptoms, contact, and comorbidity information for participants were gathered using the web-based application. RAT/RT-PCR results were entered into the ICMR test-data portal. The IgG antibody test results were retrieved directly from the labs. A consolidated line-list of all the participants was then created. From this, subsets of participants in risk-categories, subcategories, age groups, sex, and geographical units were used to jointly estimate IgG prevalence, active infection, and total burden in the respective categories. Symptoms and comorbidity data, which were part of the consolidated line-list, were used in the regression. Only anonymized data with no personal identifiers were used for the analysis.
Ethical considerations
The Institutional Ethics Committee (IEC) of the Indian Institute of Public Health-Bengaluru campus reviewed and approved the study (vide. IIPHHB/TRCIEC/174/2020). Participants’ test results were available and shared with them by the concerned health facility.
Statistical analysis
To enable joint estimation of IgG prevalence, active infection, and total prevalence, we first modelled an individual to be in one of four disease states: having active infection but no IgG antibodies, having IgG antibodies but no evidence of active infection, having both IgG antibodies and active infection, and finally having neither active infection nor IgG antibodies. The disease state of the individual is, however, hidden and can only be inferred from the RAT, the RT-PCR, and the IgG antibody test outcomes. This leads to a parametric model for the probabilities of test outcomes (observations) given the disease-state probabilities (parameters of the model), after taking the sensitivities and the specificities of the tests into account.
To get the joint estimates of the parameters in a stratum, we use maximum likelihood estimation, which ipso facto provides estimates already adjusted for the sensitivities and the specificities of the tests. The joint estimation is an extension of the Rogan-Gladen formula.20 The procedure also accounts for the protocol-induced variation of test-types across participants. Confidence intervals are obtained by invoking asymptotic normality of the maximum likelihood estimates, with their covariance matrix being approximated by the inverse of the Fisher information matrix of the parametric model. Further, weighted adjusted estimates for Karnataka were obtained after weighing each district’s prevalence estimates by the population fraction in that district. We computed odds ratios by restricting attention to the relevant subcategories.
To identify the weights on various independent/explanatory variables (symptoms, comorbidities, etc.) for predicting past infection and active infection, we use multinomial regression to regress the test outcomes on the independent variables. The procedure can be embedded within the framework of the generalized linear model with multinomial logit functions along with a custom link function that accounts for not only the test-type variability across participants but also the tests’ sensitivities and specificities. Important explanatory variables are captured using the Wald test. The details are given in the supplementary material provided.
Role of the funding source
The funders of the study had no role in the study design, data collection, data analysis, data interpretation, or the writing of the report. They did not participate in the decision to submit the manuscript for publication. The principal investigator (GRB) and key investigators had full access to all of the data. The corresponding author had final responsibility for the decision to submit for publication.
RESULTS
Of the 16585 people surveyed in the different risk categories, we present the results for 15624 individuals whose RAT plus RT-PCR and COVID Kavach ELISA antibody test results have been line-matched (Appendix C). A total of 16585 IgG results were provided. The results of 513 were not considered due to missing information and the inability to match the participant in the database; 448 entries were further not mapped to the line-list because of manual data-entry errors or because data was not retrievable from the ICMR portal. Also, 18 IgG samples were inconclusive (Figure 1 in Supplementary Material/Appendix C).
IgG prevalence
The overall weighted adjusted seroprevalence of IgG is 16 ·4% (95% CI: 15 ·1 – 17 ·7). This was as of 03 September 2020 and at the state level, obtained after adjusting for the serial sensitivities and specificities of all tests (Table 1).
Active infection
We estimate that 12 ·7% (95% CI: 11 ·5—13 ·9) of the seemingly unsuspected participants in the general population, or an estimated 89,88,313 (95% CI: 81,39,023—98,37,602) people, were having active infection (as on 16 September 2020). This is based on the numbers that tested positive on RT-PCR/RAT and after taking into account the IgG outcomes and the serial sensitivities and specificities of all tests (Table 1).
Overall COVID-19 prevalence
The overall adjusted prevalence of COVID-19 at the state level was 27 3% (95% CI: 25 ·7 – 28 ·9) as of 16 September 2020 (combined IgG and active infection (Table 1)).
Stratifications
The seroprevalence of IgG among males and females were similar, but the active infection was higher in males than females (15 ·5% vs · 8 ·4%) (Table 1, Figure 4). Thus, the overall prevalence was higher in males than in females (29 ·8% vs. 21 ·9%). Estimates of both seroprevalence and total prevalence were higher in the elderly population and low among the less-than-30-years-old population (Figure 5). The high-risk population had a higher prevalence (31 ·7% (CI: 29 ·1 – 34 ·2)), followed by the moderate-risk population (25 ·4% (CI: 23 ·0 – 27 ·8)) and then the low-risk population (20 ·7 (CI: 18 ·4 – 23 ·0)) (Table 1 and Figure 4).
Case-to-infection ratio (CIR)
At the state level, it is estimated that for every RT-PCR confirmed case detected, there were 40 undetected infected individuals as of 16 September 2020 (Table 3 and Figure 6). This is estimated by using 484,954 reported number of cases in Karnataka1 and the adjusted prevalence of COVID-19 (27 ·3%) against SARS-CoV-2. The cases-to-infections ratio ranges from 10 to 111 across units.
Infection fatality rate (IFR)
As of 03 September 2020, the IFR due to COVID-19 in Karnataka is estimated as 0 ·05%, with more than half the units (21 out of 38) above state IFR. The highest was estimated in the Dharwad district (0 ·23%) (Table 3 and Figure 6).
District/unit variations across the state
The IgG prevalence was highest in Vijayapura district (23 ·9%) and lowest in Bagalkot district (4 ·1%). The state capital Bengaluru had an IgG prevalence of 22% (95% CI: 19 ·1 – 24 ·9). The active infection was highest in Ballari (34 ·5%) and lowest in Bidar (0 ·7%). Bengaluru’s active infection was an estimated 9 ·2% (95% CI: 7 ·1 - 11 ·3). The overall COVID-19 prevalence was lowest in Dharwad district (8 ·7%) and highest in Ballari district (43 ·1%) (Table 2, Figures 7 and 8). The overall COVID-19 prevalence in Bengaluru was estimated to be 29 ·8% (95% CI: 26 ·5 −33). Within Bengaluru itself (with N = 3617 samples), we estimated that BBMP West had the highest IgG against SARS-CoV-2 and prevalence of COVID-19. In contrast, BBMP Mahadevapura had the least (Figures 7 and 9, and Supplementary Table 1). Again, BBMP RR Nagar had the highest active infection within Bengaluru, and BBMP East had the lowest (Figure 7, Supplementary material Table 1). Districts with high cases infections ratio (more than 40) are Vijayapura, Belgaum, Chitradurga, Tumakuru, Raichur, Ramanagar, Haveri, Chamarajanagar, Bidar, Davanagere, Yadgir, Kalaburagi, Kolar, Kodagu, Mandya, Chikmagalur, Ballari, Bengaluru Rural, Hassan (Table 3 and Figure 6). To summarize in a sentence, there is differential exposure to the disease across the state (Figure 8).
Explanatory variables
A generalized linear model-based multinomial regression indicated that nausea, headache, chest-pain, rhinorrhoea, cough, sore throat, muscle ache, fatigue, chills, and fever are significant variables that predict active infection (p-value < 0 ·05 for a Wald test). Fever is the most significant variable for predicting active infection among the symptoms. Additional variables that predict active infection are attendance at the Out-patient Department (OPD) of the hospitals and contact with COVID-19 positive patients (again with p-value < 0 ·05 for the Wald test). Diarrhea, chest-pain, rhinorrhea, and fever predict the presence of IgG antibodies to some extent, with diarrhea having the highest weight among the three. Additional variables that predict the presence of IgG are professions that involve greater contact with the public (bus conductors or auto drivers, vegetable vendors), residence in containment zones, time since the first 50 cases in the district, and the level of the district’s and the taluk’s urbanization (Figure 10 and Table 4).
We found that the RAT is more sensitive on symptomatic individuals since 543 were positive out of 798 RT-PCR-confirmed infected participants with symptoms yielding a sensitivity of 68 ·0% versus 348 being positive out of 742 RT-PCR-confirmed infected participants without symptoms yielding a sensitivity of 46 ·9%.
DISCUSSION
This is the first study in India, and probably elsewhere, that jointly estimates the proportions of people who already had the SARS-CoV-2 Infection (IgG antibody positive) and who currently have an active infection (RT-PCR / RAT positive).
The study has several additional strengths. First, we conducted the study throughout the state of Karnataka using the sentinel sites, thereby leveraging the state’s comprehensive surveillance platform. The sampling frame serves as a reference standard and can be used for population-representative surveillance in the future. Second, we used a serological test for IgG with high sensitivity (0 ·921) and specificity (0 ·977), thereby yielding a better predictive value for a positive test. Lastly, we assessed the prevalence in the key subgroups of populations with differential risk of contracting the SARS-CoV-2 virus.
An estimation of the IgG prevalence alone would have assessed the state’s burden at 16 ·4% prevalence. In contrast, the dual assessment of viral markers and antibodies gave us not only the IgG prevalence but also evidence of active infection of 12 ·7% and a total COVID-19 burden of 27 3%. This significantly larger estimate calls for an entirely different response from the state and highlights our survey’s benefits. We reported that 1 ·8% (95% CI: 1 ·2 – 2 ·3) of the population shows both viral RNA and IgG antibodies. The IgG antibodies form 14-21 days after exposure to the virus, while the RT-PCR test will likely return positive between 7-21 days after exposure. The correlates and implications of the simultaneous presence of viral RNA and IgG antibodies might require further examination in future studies.
Across risk-groups, the elderly and those with comorbidities had a higher prevalence of COVID-19, 21-23 suggesting that they are at higher risk of contracting the infection. Despite similar exposure, higher prevalence in them offers the possibility of infection with lower viral dose or that the younger age groups mounted a protective immune response.
The reported IFR due to COVID-19 of 0 ·05% is likely an underestimate, and a function of how well the districts report death data in the state. Studies worldwide found that the IFR of COVID-19 ranged from 0 ·17% to 4 ·16%.24-26 The low IFR reported in our study concurs with similar results in India and Asia, including China and Iran.27-29 A systematic review of published literature until July 2020 reported the IFR across populations as 0 ·68% (0 ·53%–0 ·82%).7 The IFR reported in the survey matches the reported estimates in Mumbai (0 ·05-0 ·10%), Pune (0 ·08%), Delhi (0 ·09%), and Chennai (0 ·13%).5,12,13,30 Districts with low CIR suggest that this might be the actual proportion by which we might be missing cases in Karnataka.
Our regression analysis determined which symptoms accurately predict active and past infections. Among the symptoms, we found that diarrhea, chest pain, rhinorrhea, and fever predict the presence of past Infection (IgG antibodies). This suggests that COVID-19 may have consequences that last beyond the active infection period. Diarrhea might suggest that the gastrointestinal tract manifestations might stay longer and might have implications to explore oral vaccines. We also found that ILI symptoms and history of contact with a COVID-19 positive patient can predict active SARS-CoV-2 infection.
The low-risk participants being recruited from hospitals may suggest the existence of a bias in the estimate. The protocol mitigated this by sampling systematically only among pregnant women and attendees of OPD. These participants are likely to have come from afar, thus providing information on prevalence outside the immediate hospital vicinity. The sampling from the congregate settings in the neighborhood of the hospital was made systematic to reduce sampling bias. The elderly and those with comorbidities were to be taken from an elderly list (from the census) and a list (compiled in April 2020) of vulnerable individuals with non-communicable diseases. Any deviation from the protocol would have introduced a bias. Hence, we used a design effect of 3 to account for these factors.
The low sensitivity and cost of RAT consideration led to a survey design in which the low-risk participants were not administered the RAT. Further, due to logistical issues, serum samples from one of the taluk hospitals were not available, and the corresponding participants had no antibody test outcomes. Our statistical methodology was designed to handle these issues and make the best use of all the available data.
The progress of the pandemic has been non-uniform, given the wide variation of COVID-19 burden from 8 ·7% to 43 ·1% across various regions of the state. Regions with low IgG prevalence require a targeted public health response. A revision in the testing strategy may be required in districts with high CIR. Given the predictive power of specific symptom complexes for active infection, the state should employ syndromic surveillance to detect active transmission areas. The high sensitivity of the RAT in symptomatic participants indicates that it is better-suited as a point-of-care test when people present themselves with symptoms.
In conclusion, our comprehensive survey and analysis provide insights on the state of the pandemic in the different districts of Karnataka and the varying levels of prevalence across the different stratifications based on age, gender, and risk. We also provide important epidemiological metrics such as IFR, CIR and their variation across geographical regions and population strata (Figure 11). Indeed, establishing district-level facility-based surveillance to systematically monitor the trend of Infection in the long term to inform local decision-making at the district level would facilitate and augment the necessary public health response towards the COVID-19 epidemic in Karnataka. It also helps identify regions with high severity of the disease, identify at-risk populations, and enable evidence-based intervention and resource allocation to manage the pandemic effectively. Repetition of the survey can better inform changes in the extent and speed of transmission and help evaluate the potential impact of containment strategies over time in different parts of the state. Above all, this study’s findings hold significant potential to improve clinical management and guide public health interventions to reduce the burden of COVID-19 in India and other lower and middle-income countries.
CONTRIBUTORS
The survey was a collaborative effort of the Department of Health and Family Welfare, National Institute of Mental Health and Neuro-Sciences, Indian Institute of Public Health - Bangalore, Indian Institute of Science, Indian Statistical Institute (Bangalore Centre), UNICEF, MS Ramaiah Medical College, Bangalore Medical College, and others. The protocol was designed by Prof Giridhara R. Babu and his team at the IIPH, Bangalore, along with the following members of the Technical Advisory Committee – Dr. Lalitha R, Dr. Lalitha K, and Dr. Pradeep B. The Technical Advisory Committee chaired by Prof Dr. M. K. Sudarshan reviewed and provided feedback on the design and implementation of the survey. Dr. M. R. Padma, Dr. Mohammed Shariff, under the supervision of Dr. Parimala Maroor, Project Director IDSP, coordinated the implementation at the state level. The technical review group chaired by the Director, DHFWS, approved the conducting of the study. Mr. Jawaid Akhtar, Mr.Pankaj Kumar Pandey reviewed the protocol, led the implementation and were involved in writing and reviewing the manuscript. Professors Giridhara R. Babu, Siva Athreya, and Rajesh Sundaresan planned and executed the data analysis, arrived at the findings, and wrote the first draft and revisions of the manuscript. All authors reviewed and approved the final manuscript.
DECLARATION OF INTEREST
We declare no competing interests.
DATA SHARING
The data are accessible to researchers upon formal request for data addressed to the Commissioner, Health and Family Welfare Services, Government of Karnataka.
Data Availability
The data are accessible to researchers upon formal request for data addressed to the Commissioner, Health and Family Welfare Services, Government of Karnataka.
Funding
National Health Mission, Government of Karnataka.
Acknowledgments
We would like to express our thanks to: Dr Arundathi, IAS, MD – NHM, Dr. Patil Om Prakash R Director – DHFWS, Dr. Prakash, State nodal officer for COVID19 and State Surveillance Unit for their support; DSOs, DAPCU officers, AMOs & Medical officers, District Microbiologists and District Epidemiologists for coordinating and implementing survey and providing guidance for sample collection as per sample size to health facility lab staff and coordinating for sample transportation to mapped RT-PCR & antibody testing ICMR labs; Lab Nodal Officer and staff of ICMR labs for RT-PCR testing and IgG antibody testing; Mr. Ramesh and team for providing a robust web platform for data collection; Lab technicians, Counsellor –ICTC & NCDC, Staff Nurse, Health workers for filling data in the survey App, collection of samples and sending samples to higher labs; Ms. Manjushree, Entomologist, DHFWS for helping in fetching RAT/RT-PCR results from ICMR Portal, Ms. Maithili Karthik and Ms. Sindhu ND, PHFI, for help with the line list matching; Mr. Nihesh Rathod, Indian Institute of Science, for the generation of Karnataka and Bengaluru Urban Conglomerate maps; Nitya Gadhiwala and Abhiti Mishra of the Indian Statistical Institute for help in collation of COVID-19 data from Karnataka state bulletins and R graphics; All the study participant for providing their consent to be part of this survey.