Abstract
In April 2021, after successfully enduring three waves of the SARS-CoV2 pandemic in 2020, and having reached population seropositivity of about 50%, Delhi, the national capital of India was overwhelmed by the fourth wave. Here, we trace viral, host, and social factors contributing to the scale and exponent of the fourth wave, when compared to preceding waves, in an epidemiological context. Genomic surveillance data from Delhi and surrounding states shows an early phase of the upsurge driven by the entry of the more transmissible B.1.1.7 variant of concern (VOC) into the region in January, with at least one B.1.1.7 super spreader event in February 2021, relatable to known mass gatherings over this period. This was followed by seeding of the B.1.617 VOC, which too is highly transmissible, with rapid expansion of B.1.617.2 sub-lineage outpacing all other lineages. This unprecedented growth of cases occurred in the background of high seropositivity, but with low median neutralizing antibody levels, in a serially sampled cohort. Vaccination breakthrough cases over this period were noted, disproportionately related to VOC in sequenced cases, but usually mild. We find that this surge of SARS-CoV2 infections in Delhi is best explained by the introduction of a new highly transmissible VOC, B.1.617.2, with likely immune-evasion properties; insufficient neutralizing immunity, despite high seropositivity; and social behavior that promoted transmission.
Introduction
After escaping relatively unscathed during the first wave of the COVID-19 pandemic, and after a brief respite from November 2020 until Feb 2021, India witnessed a ferocious second Covid-19 wave, accounting for about half the cases worldwide in the first week of May 2021. SARS-CoV-2 had spread widely throughout India in the first wave, with initial results from the Indian Council of Medical Research third national serosurvey finding 21.4% of adults and 25.3% of 10 to 17 year children to be seropositive. Serosurveys carried out by State / Municipal authorities in Indian megacities had showed even higher seropositivity with the national capital, Delhi, reporting 56% seropositivity in February, 2021. Yet this seems to have offered little protection, with Delhi amongst the most affected cities in April 2021. Relatively well collected multi-modal data for Delhi, including continued genomic surveillance since April 2020 makes this an intriguing case study. Here, we examine the possible reasons why; dissecting the viral, host, and social factors contributing to the scale and exponent of the pandemic upsurge that totally overwhelmed the healthcare infrastructure of the national capital.
Results
Epidemiological characteristics of sequential SARS-CoV-2 outbreaks in Delhi
Records of all tests performed, positive cases, positivity rate and deaths, were accessed through the Integrated Disease Surveillance Programme (IDSP, NCDC) database. Since the first case of Covid-19 was detected in Delhi in March 2020, the city has experienced ups-and-downs in the number of cases and the positivity rates, with four distinct waves, in June, September, and November 2020, and now in April 2021. The first and the third waves were associated with spikes in mortality and stressed healthcare systems but were successfully managed with social restrictions and use of micro-containment strategies focusing on test, trace, treat. After touching the high of almost 9000 cases daily and a positivity rate of about 15% during the third wave in November 2020, new cases steadily declined, with only 1% positivity between December 2020 and March 2021. This reversed and started increasing from third week of March, 2021, shooting to 30% by end of April, with almost 30,000 cases being reported per day. Fig 1 shows weekly cases, positives and positivity rate for 2021 leading up to and including the April 2021 surge. Further analysis focuses on the contributing factors.
Seropositivity in Delhi
We have previously shown in a cohort-based serosurvey that high seropositivity was associated with declining test positivity rate across India in September, but there is a 20-30% decline in neutralizing activity over a 6 month period (Naushin et al., 2021). A follow-up study of the same cohort was conducted in January and February 2021, with semi-quantitative anti-nucleocapsid, quantitative anti-spike, and a neutralization surrogate assay being measured. 473 of 1115 employees were seropositive across ten Delhi sites where we have serial data. When compared to the prior serosurvey in August/September 2020, that seropositivity had increased across these sites in Delhi (Figure 2). The seropositivity was unevenly distributed, with samples collected from laboratories and public-office employees having much higher seropositivity than samples collected from family members living at residential complexes. The median level of anti-spike antibodies in this sample from Delhi was well below 132 U/ml, the cutoff for convalescent plasma titres, suggesting that the protection against infection may be limited to a subset of the sero-positive fraction (Figure 2B) (USFDA, 2021). Further, in 35 sero-positive cases from 2020 under follow-up, retested in April 2021 and with three or more time-points since 2020, three showed a greater than 100 U and more than 50% increase in antibody levels after a decline, suggestive of possible reinfections and limited immunity.
Genomic character of the SARS-CoV-2 outbreak in Delhi
To determine whether SARS-CoV2 variants may be responsible for the April 2021 outbreak in Delhi, we sequenced and analyzed the community samples from Delhi from the previous outbreak in November 2020 until May 2021 and related it to effective reproductive number (Rt). It can be seen in Fig 3A that detection of B.1.1.7 VOC was minimal until January 2021, increasing to about 20% in February and 40% in March 2021. This was associated with slight increase in Rt from about 1.2 to 1.6. B.1.617 lineages increased from below 5% in Feb 2021 to about 10% in March before overtaking B.1.1.7 in April and rising to about 60% of all samples. The sub-lineage B.1.617.2, in which the escape mutation E484Q is lost and a new mutation T478K is gained, showed the maximum rise going from less than 10% of B.1.617 to about 80%. The rise of B.1.617, more specifically B.1.617.2, was paralleled by a large increase in positivity rate and increase in Rt to about 2 (Fig. 3B). We analyzed the Ct values of random positive samples tested on a single high-throughput automated machine (Roche 6800) at a single center (NCDC) over this period to understand whether there was any evidence of the VOC being associated with higher viral load and thereby infectivity. As seen in Fig. 3B, when compared to the stable baseline from Dec 2020 to Feb 2021, Ct values dropped (99% CI) in March when B.1.1.7 was dominant, falling further in April (99% CI), with the rise of B.1.617 (Fig. 3C). The total change of about -6 Ct corresponds to about 50-fold increase in viral load. The B.1.1.7 variant has been previously reported to be 43 to 90% more transmissible than prior variants as the carriers are more infectious due to the increased viral load (Davies et al., 2021; Volz et al., 2021). Interestingly, the case fatality rate (CFR) was significantly reduced during the surge period in Delhi (95% CI). The Rt was hovering near 1 from December to February but increased significantly from late March to mid-April (95% CI). The CFR values, which were constant from December till February, 2021, witnessed a significant decrease with a simultaneous increase in Rt from March (Fig. 2A). This may be due to the sudden decrease of B.1.1.7 in the Delhi region which reportedly causes high CFR (Davies et al., 2021). The CFR was rising again towards the end of the period. Since CFR may be due to an amalgamation of multiple factors, including the short-term collapse of the healthcare system, there is currently no clear evidence linking B.1.617.2 to change in CFR. This will be reexamined once final information is available on all deaths and COVID-related deaths over this period.
The origin of SARS-CoV-2 outbreaks in North India in 2021
The April 2021 outbreak in Delhi was preceded by outbreaks in the states of Kerala, Maharashtra and Punjab. While no VOC was identified in Kerala in Jan 2021, the outbreak in Maharashtra has been related to B.1.617.1 and in Punjab to the introduction of B.1.1.7. These were found to be phylogenetically related, with a strong phylogenetic connection between Delhi and Punjab for B.1.1.7, and between Delhi and Maharashtra for B.1.617 lineages, as shown in Figure 4A and B. The B.1.1.7 outbreak in Punjab was notable for multiple polytomies in the phylogenetic tree and clusters spanning multiple districts, suggesting a mass super-spreader event (Fig 4C). This is relatable to mass public gatherings and rallies held in different parts of North India since January 2021 and highlights the important role of social factors in SARS-CoV-2 outbreaks.
Expansion of B.1.1.7 and B.1.617 lineages in North India
To determine whether the outbreak in Punjab and Delhi may have led to seeding of B.1.1.7 across North India, for which these are the main travel hubs, we analyzed the temporal trend of the SARS-CoV-2 lineages (Figure 4 D-H). Despite limitations of lower sequencing than for Delhi, we see the same recurring pattern with initial seeding by B.1.1.7 in February to March and replacement by B.1.617.2 in April 2021.
Prevalent evidence of SARS-CoV-2 variant B.1.617.2 in post-vaccinated individuals
We sequenced all samples of vaccination-breakthrough cases at NCDC, Delhi, over the period of the study. Two VOC lineages were seen amongst 27 cases: B.1.617.1 (n=2; 8%), B.1.617.2 (n=19; 76%) Other samples had the background B.1 lineages (16%) (Figure 5a). It is noted that when compared to population prevalence (Figure 3A), B.1.617.2 was over-represented and B.1.1.7 was not even detected in vaccination breakthroughs, suggesting higher breakthrough risk of B.1.617.2 compared to B.1.1.7. There were some sequence variations within the lineages, as expected in an evolving outbreak. The lineage-defining mutations, frequent mutations and phylogenetic analysis are shown (Fig 5b).
Structural insights into spike protein mutations showed unique features of B.1.617.2 that may contribute to increased transmissibility
Our data shows that B.1.617.2 has higher transmissibility than B.1.1.7 in North India, with initial seeding by B.1.1.7 being consistently replaced by B.1.617.2, even in Punjab, where B.1.1.7 had reached near 100% prevalence. This is surprising since there is no known mutation in B.1.617 lineage or in B.1.617.2 that is associated with such high transmissibility. Stronger immune escape was initially considered a possible explanation for dominance of B.1.617 over B.1.1.7 (Hoffmann et al., 2021), given high seropositivity in Delhi. The lower representation of B.1.1.7 than B.1.617 lineages in vaccination breakthroughs, suggests that this is a plausible advantage to B.1.617 variants especially B.1.617.2. However, we see the same trend across various states including those with low seropositivity in the CSIR-cohort, such as Uttarakhand or Himachal Pradesh, suggesting that transmissibility is higher, independent of immune escape. Therefore, to understand the recent evolution of spike protein that may contribute to higher transmissibility, positional mapping of mutations on the structural model representing four lineages is shown (Figure 6). The B.1.1.7 lineage, first detected in September 2020, has been reported to carry several spike mutations, including two deletions in the N-terminal domain (NTD) and N501Y mutation at the critical receptor binding domain (RBD) that interacts with ACE2 human receptor (Rambaut et al., 2020). In addition, A570D, P681H, T716I, S982A and D1118H were also reported. In comparison, B.1.617 lineage carrying two unique RBD mutation combo (E484Q and L452R), however, lacked the NTD deletions (Ferreira et. al., 2021). The B.1.617.1 harbored similar set of two RBD mutations with slight changes. On the other hand, the rapidly spreading B.1.617.2 in India showed unique combination of seven spike mutations.
To understand the molecular interactions, we generated a structural model representing B.1.617.2 spike protein, as shown in Figure 7. The resulting map provides insights into the plausible mechanisms of regulation of virus entry and binding. It contains seven mutations in the spike protein, excluding the predominant D614G substitution. Three of these mutations, two substitutions (T19R and R158G) and one deletion (ΔE156-F157), were found in NTD region. The six nucleotides spanning the entire stretch of deletion (ΔE156-F157) was juxtaposed with R158G mutation, with one nucleotide of glycine contributed from E156 and the rest two nucleotides contributed by R158, resulting in GGA codon for glycine. The mutations within NTD occur on N1 and N3 loops that composes the prominent mABs recognition sites Chi et al, 2020; Naveenchandra, et al.,2021). In our mutated model, R19 and G158 residues are surface accessible while in wild-type the T19, E156-F157 were relatively buried.
We also found a unique RBD mutation T478K, in addition to previously reported L452R in B.1.617.2. Previous reports have associated L452R amino-acid change with antibody binding and has been classified as an escape variant (McCallum et al.,202; Liu et al., 2021). T478, on the other hand, is previously unidentified and is present directly on the receptor binding motif (RBM). The inherent long side-chain of mutated lysine reduces the gap with the ACE2 receptor as compared to the wild type. The distance between spike: K478 and ACE2:L85 is 8.3 Å, while the threonine maintains a distance of 10 Å. However, these distances may vary with other conformational states of spike RBD. We also observed significant increase in number of ACE2 residues around these mutated sites (L452R, T478K) in B.1.617.2. In contrast with 10 residues, 24 residues of ACE2 were in close proximity to the mutated side chains of B.1.617.2 RBD mutations. In addition, we observed two mutations (P681R and D950N) in proximity to the S1/S2 cleavage site. Previous reports have also shown that the dominant D614G, although distal to the furin cleavage site has an allosteric effect on conformational changes leading to RBD opening (Gobeil et al. 2021). We speculate that a combination of these mutations at three critical sites leads to enhanced transmissibility of B.1.617.2 lineage.
Conclusions
Northern India underwent seeding with B.1.1.7 in February 2021 followed by regional outbreaks that stretched the healthcare system but remained within its capacity. From March 2021 onwards, after introduction of B.1.617 and its sub-lineage B.1.617.2, there was a further rapid rise in new cases with displacement of B.1.1.7 and other B.1.617 lineages by B.1.617.2. Our data indicates B.1.617.2 shows high transmissibility and surges without any increase in CFR. We estimate the transmissibility to be as much as 50% greater than B.1.1.7. Viral load of B.1.617.2 appears to be higher than B.1.1.7 and based on data from India and UK, so does vaccination break-through rate. While immune escape seems less for B.1.617.2 compared to B.1.351 or P.1 (Li et al., 2021), overall, we note that B.1.617.2 is capable of creating very fast-rising outbreaks with vaccination breakthroughs. We would re-emphasize that prior infections, high seropositivity and partial vaccination are insufficient impediments to its spread, as seen in Delhi, and strong public health response will be needed globally for its containment.
Methodology
Nasopharyngeal and throat swab samples from Covid-19 confirmed cases with CT value < 25 was collected and transported to Biotechnology Division, NCDC from the various testing sites across the States as per the sampling strategy of Central Surveillance Unit (CSU) of Integrated Disease Surveillance Programme (IDSP), NCDC. Twenty seven post-vaccinated Covid-19 positive samples were also included in this study. All the patient details were filled on patient identification form and was accompanied with the samples. The study was approved by the NCDC Ethics Review Committee vide approval 2020/NERC/14. More than 9000 samples were received at NCDC from more than 9 states and union territories from November, 2020 till May, 2021. The RNA was isolated using MagNA Pure RNA extraction system (Roche). The samples were processed as per the CovidSeq (Illumina) protocol for sequencing on the NextSeq 550 (Illumina) instrument according to the manufacturer’s instructions. A total of 376 samples per lot were processed in batches of 94 with indexes A-D and 1.4pmol of the library was loaded on the 75cycle High Output flow cell. The run was over in 10 hours with approximately 20Gb data generated that was processed on the Dragen COVID Lineage Application v3.5.1 (Illumina). The raw data generated in binary base call (BCL) format from NextSeq 550 was demultiplexed to FASTQ file using bcl2fastq v2.20. Further the samples files in FASTQ format are provided as input for the COVID Lineage application. The reads are aligned to a SARS-CoV-2 reference genome (NC_045512). The minimum acceptable alignment score was set to 12, so any alignment results scoring lower are discarded. The coverage threshold and virus detection threshold were set to 20 and 5 respectively. The variant calling target coverage is set to 50 which specifies the maximum number of reads with a start position overlapping any given position. A total of 8,477 sequences out of initial 11,001 were used for further analysis. The pipeline performs Kmer-based detection and then performs Map/Align, Variant calling and Consensus Sequence generation. The lineage/clade analysis performed using Pangolin and NextClade.
The lineage data was segregated according to date of collection (DOC), State, Age, Sex. State wise distribution of B.1.617.2 compared with B.1.617.1, B.1.1.7 and ‘other variants’ over a period of five months (January-May, 2021) was performed to ascertain the prevalence of each variant within these geographical areas. The number of tests, confirmed cases and positivity rate were accessed from the national databases. The case fatality ratio (CFR) was calculated as per the WHO recommended method, as follows:
Serosurvey
The serosurvey was conducted through a voluntary participation wherein personnel working at CSIR labs/centers and their family members gave their blood samples in May -Sep 2020 (Phase 1) and Feb-March 2021 (Phase 2). The study was approved by the Institutional Human Ethics Committee of CSIR-IGIB vide approval CSIR-IGIB/IHEC/2019–20 and carried out in over 40 CSIR laboratories and centers spread across the country. In this study, samples from 10427 adult subjects were obtained of which; 1399 samples were from Delhi based laboratories and offices in Phase 1 and 9918 samples were obtained of which 1115 samples were from Delhi in Phase 2. Blood samples (6 ml) was collected in EDTA vials from each participant and analyzed on site or transported to CSIR-IGIB, New Delhi for Analysis. Elecsys Anti-SARS-CoV-2 kit from Roche Diagnostics was used to detect antibodies to SARS-CoV-2 NucleoCapsid antigen. It is a qualitative kit which was used for screening and a Cut-off index COI >1 was considered seropositive. Positive samples were further tested for quantitative antibody titers using the same manufacture’s kit directed against the spike protein (S-antigen). An antibody levels >0.8 U/ml was considered sero-positive as per manufacturer’s protocol. The detection range of this kit is from 0.4 U/ml to 250 U/ml. For samples, where values of >250 U/ml were obtained; appropriate dilutions were made. Neutralizing antibody (NAB) response directed against the spike protein (RBD site) was assessed using GENScript cPass kit which is a surrogate virus neutralization test (sVNT). A value of 30% or above was considered to have neutralizing ability.
Protein annotation and modelling
In total of 8,477 SARS-CoV-2 genomes were annotated for amino-acid substitutions by SnpEff version 4.5 (Cingolani et al., 2012). The annotation was done according to the known reference genome of SARS-CoV-2 i.e. NC_045512 in the NCBI database (Wang et al.,2020). The structural model of spike in 1 RBD-up state was generated using cryoEM structure of the spike 1 RBD-up state (PDB ID: 6VSB) as a template (Wrapp et al., 2020). To generate ACE2 bound structure, we took the X-ray structure of human ACE2 bound to the RBD domain with PDB ID: 6M0J (Lan Nature. 2020). Detailed modelling methodology is mentioned in our previous work (Fatihi et al, 2021). The structural mutant model of B.1.617.2 variant was generated using the structural model of ACE2-bound 1 RBD-up spike conformation as a reference. Each chain was mutated for missense mutations using ChimeraX (Goddard et al., 2018) whereas deletions in each chain were introduced by employing Coot (Emsley et al, 2010).
Data Availability
Sequence data is being continuously uploaded on the GISAID.
Acknowledgments
Support from Ministry of Health and Family Welfare, Council of Scientific and Industrial Research, and Department of Biotechnology, India, is acknowledged.