Abstract
Antibodies raised against highly prevalent human seasonal coronaviruses (sCoVs), which are responsible for the common cold, are known to cross-react with SARS-CoV-2 antigens. This cross-reactivity prompts questions about their protective role against SARS-CoV-2 infections and COVID-19 disease severity. However, the relationship between sCoV exposure and SARS-CoV-2 correlates of protection has not been clearly identified. Here we performed a cross-sectional analysis of cross-reactivity and cross-neutralization to three SARS-CoV-2 antigens using pre-pandemic serum from four different groups: pediatrics and adolescents (<21 yrs of age), persons 21 to 70 yrs of age, persons older than 70 yrs of age, and persons living with HCV or HIV. We find that antibody cross-reactivity to SARS-CoV-2 antigens varied between 1.6% and 15.3% depending on the cohort and the isotype-antigen pair analyzed. We also demonstrate a broad range of neutralizing activity (0-45%) in pre-pandemic serum that interferes with SARS-CoV-2 spike attachment to ACE2. While the abundance of sCoV antibodies did not directly correlate with neutralization efficiency, by using machine learning methodologies, we show that neutralizing activity instead depends on latent variables related to the pattern of sCoV antibody ratios in each person. These were independent of age or sex, and could be accurately predicted by comparing the relative ratios of IgGs in sera directed to NL63, 229E, HKU-1, and OC43 spike proteins. More specifically, we identified antibodies to NL63 and OC43 as being the two most important predictors of latent variables responsible for protection, and 229E as being the least weighted. Our data support that exposure to sCoVs triggers various cellular and immune responses that influence the efficiency of SARS-CoV-2 spike binding to ACE2, and may impact COVID-19 disease severity through various other latent variables.
Introduction
Four endemic human sCoVs (229E, OC43, NL63, and HKU1) are highly prevalent worldwide and cause common and recurrent respiratory infections (1–3). A retrospective prevalence study of acute respiratory infections in France found that OC43 is the most prevalent sCoV followed by NL63, HKU1, and finally 229E (4). While nearly every adult has been exposed and demonstrates humoral responses to one or several of these sCoVs (5, 6), immunity to each specific sCoV wanes over time. A 35-year longitudinal study revealed that reinfections by the same sCoV were regularly observed at 6 and 9 months post-infection, but were most frequent after 12 months (3). These findings highlight that sterilizing immunity to these viruses is short-lived and that the relative ratios of antibodies against any one of the four sCoVs are highly variable and dependent on the most recent exposure. This study also indicates that exposure to one sCoV does not offer complete protection against infection by the other viruses, but it is unclear how prior immunity affects disease severity.
Seasonal CoVs generally share an overall low degree of sequence similarities with SARS-CoV-2 proteins and only the alphacoronavirus NL63 also utilizes ACE2 as an entry receptor (7, 8). Nevertheless, conserved epitopes on the spike S2 domain are believed to be responsible for antibody and T-cell cross reactivity between sCoVs and other human betacoronaviruses such as SARS-CoV-1, SARS-CoV-2 and MERS (9–12). Furthermore, the nucleoprotein (N) is one of the most conserved antigens across all human CoVs (13). In fact, antibodies against sCoV antigens impede the specificity of serological tests against SARS-CoV-2 when full length S and N antigens are used in the assay (reviewed here: (14)). While neutralizing activity directed against the SARS-CoV-2 S protein, more specifically the receptor binding domain (RBD), have been extensively described, non-RBD neutralizing epitopes have also been identified on both the S1 and S2 subunits (15–18). Furthermore, antibodies can exhibit non-neutralizing effector functions that restrict SARS-CoV-2 infection by binding to viral protein epitopes expressed on the surface of infected cells, including S and N antigens (19–22).
While antibody and T-cell responses of SARS-CoV-2 convalescent individuals have been well characterized (22–24), the influence of pre-existing immunity, both humoral and cellular, from exposure to sCoVs is not yet clearly established. While several recent studies have investigated the impact of prior immunity to sCoVs on SARS-CoV-2 infection, neutralization, and disease severity (1, 6, 9, 11, 25-29), there is some discordance in the findings. Some studies have shown weak or no cross-protective antibodies in pre-pandemic blood of donors in vitro (1, 6, 29), while several other studies have demonstrated various lines of evidence for antibody and T cell cross-neutralization, and even protection against COVID-19 disease severity (9, 23, 27, 30-32).
Here, we assessed the seroprevalence of sCoVs, along with the neutralization activity and cross-reactivity to three SARS-CoV-2 antigens, in pre-COVID-19 pandemic sera across four study groups. Our results provide insight into population subgroup variations in the seroprevalence of the four sCoVs and in antibody isotype-specific cross-reactivity to SARS-CoV-2 spike (S), nucleocapsid (N), and the receptor binding domain (RBD). We also assessed the neutralizing potential of antibodies in pre-pandemic serum that interfere with S binding to the ACE2 receptor. While neutralization of S binding to ACE2 was detected to various degrees, this activity remained low compared to the neutralizing potential resulting from SARS-CoV-2 infection. However, using several machine learning approaches, we were able to identify a functional dependence on prior sCoV exposure that includes both directly measured and latent variables. These enabled us to model neutralization via Gaussian process regression (GPR) of sCoV seroprevalence. In agreement with previous studies, we do not find a direct correlation between sCoV antibody levels and neutralization. However, we do identify a strong predictive correlation between neutralization of S binding and relative ratios of the different sCoV antibodies. These data support the idea that latent variables associated with sCoV exposure have a predictive and protective role against SARS-CoV-2 infection and possibly disease severity.
Results
We acquired human serum and plasma samples, drawn prior to 2019, thus ensuring no previous exposure to SARS-CoV-2. We performed a cross-sectional profiling where a total of 580 patients were grouped into four cohorts based on age and viral infection. Cohorts were: pediatric samples of children and young adults less than 21 years of age (n=193), adults 21-70 yrs of age (n=273), adults greater than 70 years of age (n=64), and persons living with HCV or HIV (n= 50) of whom nine were followed longitudinally (Table 1). Nine patients received two longitudinal blood draws and sCoV seroprevalence was assessed at both time points (separated by 6-12 months) for a total of 589 samples. Each cohort was sex-balanced with mean age equivalence across sexes (Table 1).
We first investigated the prevalence of prior exposure to the four endemic sCoVs in the pre-COVID-19-pandemic cohorts (Fig. 1). The four sCoVs (HCoV-OC43, HCoV-229E, HCoV-HKU1, and HCoV-NL63) are prevalent worldwide (33). However, there seems to be a pattern of reinfections by sCoVs that cycles every 12 months (3). We screened pre-COVID-19 serum samples for sCoV anti-S IgG (229E, OC43, HKU-1, and NL63) using our high-throughput ELISA platform (34, 35). As a reference to frame the dynamic range of our assays, SARS-CoV-2 anti-S data from confirmed negative (post-pandemic) samples (n=115) and SARS-CoV-2 PCR-confirmed positive samples (n=43) are presented (Table S1). We then compared the levels of IgG antibodies against sCoV S proteins in all four cohorts to the IgG antibody levels observed in confirmed cases of SARS-CoV-2 (Fig. 1). Our results show that a majority of pre-pandemic samples were seropositive for IgG antibodies against OC43, NL63, and 229E S proteins (Fig. 1 and Table 2).
One major hurdle in estimating the seroprevalence of IgGs against the four sCoVs is the lack of a true negative reference population. To overcome this challenge, we used a small subset of pediatric samples from children aged 1 to 2 years and established the cut-off at two standard deviations above their mean (Fig. S1). The advantage of using these samples is that their young age limits the probability that these individuals were exposed to the viruses, although there is still the possibility of vertically-transferred IgG from the mother through breastmilk (36–38). From these cut-offs, we were able to estimate a relative seroprevalence of sCoVs in the four cohorts. We found that children, adolescents, and young adults less than 21 years of age had significantly lower 229E, OC43, and HKU1 sCoV seroprevalence than all other cohorts (Fig. 1 and Table 2). For example, seroprevalence of HKU-1 in persons < 21 yrs of age was about 34.7%, which agrees with Dijkman et al., who found HKU-1 seroconversion in pediatric individuals to be around 36% (39). Additionally, the total cohort weighted HKU-1 average seroprevalence of 60.7% is in agreement with Severance et al., who calculated the seroprevalence of HKU-1 at 59.2% (40). Furthermore, this lower seroprevalence in pediatric samples corroborates previous observations where seroconversion from 229E and NL63 exposure occurs on average 3.5 years following birth (41) (Fig. S1). Across all cohorts, we found an overall size-weighted seroprevalence of 82.9% for OC43, 82.1% for 229E, and 74.6% for NL63 (Table 2). Our calculated seroprevalence for OC43 and 229E is in line with previous studies (5, 40, 42).
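The size-weighted seroprevalence reported above can be illustrated with a minimal Python sketch. The cohort sizes are those given in the text; the per-cohort seroprevalence values in `example` are hypothetical placeholders chosen only to show the arithmetic, and the function name is ours rather than part of the original analysis.

```python
# Cohort sizes as stated in the text (n=193, 273, 64, 50; 580 patients in total).
COHORT_SIZES = {"<21 yrs": 193, "21-70 yrs": 273, ">70 yrs": 64, "HCV/HIV": 50}

def size_weighted_prevalence(prevalence_by_cohort, sizes=COHORT_SIZES):
    """Average per-cohort seroprevalence (%), weighting each cohort by its size."""
    total = sum(sizes[c] for c in prevalence_by_cohort)
    weighted = sum(prevalence_by_cohort[c] * sizes[c] for c in prevalence_by_cohort)
    return weighted / total

# Hypothetical per-cohort values (%), NOT taken from Table 2.
example = {"<21 yrs": 40.0, "21-70 yrs": 70.0, ">70 yrs": 75.0, "HCV/HIV": 80.0}
overall = size_weighted_prevalence(example)
```

Because the pediatric and adult cohorts are much larger than the other two, they dominate the weighted estimate.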
To ensure the specificity of sCoV detection, we tested co-reactivity between the S proteins of each of the sCoV along with an additional antigen (RBD or N) on a subset of samples (Fig. 2A). It is expected that upon virus exposure and seroconversion, antibodies are detected against both S and N, and S and RBD antigens of the same virus. Antigen discordance in single-positive samples is likely the result of S cross-reactivity to other sCoV. We then compared the proportion of samples that were dual-positive for S and N of OC43 and HKU-1, and S and RBD of 229E (Fig. 2B). Positive concordance between the spikes and alternative antigens (i.e., N or RBD) for the four cohorts ranged between 67 – 76% (Fig. 2B). These results highlight that there is a relatively high level of cross-reactivity between the S proteins of sCoVs which compromises assay specificity. Therefore, these data indicate that an accurate determination of sCoVs seroprevalence should include more than one antigen to minimize the potential of S protein cross-reactivity.
We next assessed the prevalence of IgA, IgM, IgG and IgE binding to SARS-CoV-2 antigens S, RBD, and N in pre-pandemic serum samples (Fig. 3 and Tables 2 and S2-S7) using our high throughput serological assay (31). Cut offs were set at 2 standard deviations (2 SD) above the mean values of the presumed negative sample population. Among pre-pandemic cohorts, 4.6% and 4.9% of samples displayed IgG reactivity against the SARS-CoV-2 RBD and S antigens, respectively (Fig. 3G and 3H, and Table 2). In contrast, the N antigen appeared to be more cross-reactive, showing a higher number of samples that were positive for IgG cross-reactive antibodies (6.7% - 15.3%) (Fig. 3I and Table 2). Given the importance of N as a common target for serological assays (43–45), efforts to accurately measure cross-reactivity to this antigen are critical to account for false positive results. This is especially important when considering that the N antigen of the four sCoVs has on average the highest level of sequence homology to the N of SARS-CoV-2, and that a recent infection by a sCoV could exacerbate such cross-reactivity through immunological imprinting (1, 14). Regardless of their levels, anti-N antibodies are unlikely to offer protection against SARS-CoV-2 entry given that they do not neutralize the virus. Finally, the role of IgE antibodies in mediating immune responses against SARS-CoV-2 infection is currently unknown although IgE antibodies can play a major role in allergic reactions and inflammation (46). Our results show that cross-reactive sCoV IgE antibodies are rare and are unlikely to have a detectable role in influencing SARS-CoV-2 infections (Fig. 3J-3L, and Table S7).
We further investigated cross-reactivity of the IgG1, IgG2, IgG3, and IgG4 isotype subclasses against the three SARS-CoV-2 antigens (Fig. S2, S3, and Tables S2-S7). Our analysis revealed that most of the IgG subclass antibodies in pre-pandemic sera only exhibited low levels of cross-reactivity with SARS-CoV-2 antigens. Detection of cross-reactive IgG3 antibodies is interesting, as it has been shown that SARS-CoV-2 S-specific IgG3 subclass levels increase with COVID-19 disease severity (47). We further wanted to assess the false discovery rate (FDR) when two or three antigens are used for the analysis. For this, we performed a comparison of all the serology results across three antibody isotypes and three antigens to determine if a positive sample is positive for only one or multiple antigens and/or isotypes (Figs. 5, S3 and Tables 2 and S7). We found that in most cases, a positive sample is restricted to only one isotype and/or one viral antigen. This highlights the importance and value of simultaneously considering more than one antigen for SARS-CoV-2 serological testing. We calculated the FDR for each specific cohort disaggregated by individual antigen-isotype/subtype (Tables S2-S6), and we also calculated the FDR for each combination of antigen and isotype/subtype (Tables 1 and S7). While the IgG FDR for our complete pre-pandemic cohort ranged from 5% to 11% when probing individual antigens, these values dropped to between 0.85% and 1.58% when probing two antigens, and to 0.51% when all three antigens were used to determine seropositivity (Table 2).
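The effect of requiring positivity on more than one antigen can be sketched as follows. Since all samples are pre-pandemic, any positive call is a false positive, so the percentage of samples positive for every antigen in a combination corresponds to the rate reported above. This is an illustrative sketch only: the function name, the data layout (one dictionary of boolean calls per sample), and the toy data are assumptions and do not reproduce the values in Table 2.

```python
from itertools import combinations

def positive_rate_by_combination(calls):
    """calls: list of dicts mapping antigen name -> bool (ratio to cut-off >= 1).

    Returns the percentage of samples positive for ALL antigens in each
    combination of one, two, ... antigens.
    """
    antigens = sorted(calls[0])
    n = len(calls)
    rates = {}
    for k in range(1, len(antigens) + 1):
        for combo in combinations(antigens, k):
            hits = sum(all(sample[a] for a in combo) for sample in calls)
            rates[combo] = 100.0 * hits / n
    return rates

# Toy data (hypothetical): ten pre-pandemic samples, three of them cross-reactive.
toy_calls = (
    [{"S": True, "RBD": False, "N": False},
     {"S": False, "RBD": False, "N": True},
     {"S": True, "RBD": True, "N": False}]
    + [{"S": False, "RBD": False, "N": False}] * 7
)
rates = positive_rate_by_combination(toy_calls)
```

In the toy data, single-antigen rates are 10-20%, whereas no sample is positive for all three antigens, mirroring the qualitative trend described in the text.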
Previous studies have reported the presence of cross-reactive antibodies to SARS-CoV-2 S protein in pre-pandemic blood samples (1, 11, 26, 48, 49). However, while generally limited information is available pertaining to levels of cross-neutralization of SARS-CoV-2, some information has been published with regards to the neutralization of S-pseudotyped viruses (1, 29). Given the importance of neutralizing antibodies that target the SARS-CoV-2 S trimer (26, 50–52), we evaluated the relative efficacy of sCoV antibodies to inhibit SARS-CoV-2 S-ACE2 interactions using a protein-based surrogate neutralization ELISA (snELISA) assay (48). Inhibition of S-ACE2 interaction varied from 0-45%, with a single patient showing inhibition level of 89% (<22 yr old, male; epilepsy), with median levels ranging between 18-23% (Fig. 5, Table 3). Very low neutralizing activity was previously observed by others in a different experimental system that measured infection by S-pseudotyped viruses (1, 29). By contrast, we identified a moderate yet distinct pattern of neutralization between cohorts of pre-pandemic samples.
Next we asked if there was an association between sCoV antibody levels and the levels of neutralization observed. Using fuzzy c-means unsupervised clustering, we clustered samples into 10 distinct groups of human sera from 507 of our 580 patients for whom we had sufficient material to monitor the degree of inhibition of SARS-CoV-2 S / ACE2 interaction and cross-reactivity with SARS-CoV-2 RBD, S, and N proteins (Fig. 6A). Subjects stratified independently of age, sex, or other prior viral infection (HIV and HCV) (Fig. 6A, Table S8). We found that these clusters defined a continuum of SARS-CoV-2 neutralization in patients ranging from 0% to 45% inhibition (Fig. 6B). Interestingly, the serum sample from the single patient showing 89% inhibition also displayed high IgG cross-reactivity against RBD (O.D.=2.172) and S (O.D.=2.707) along with high IgA cross-reactivity against RBD (O.D.=0.275) and S (O.D.=0.233). Additionally, the presence of autoantibodies against ACE2 that block spike binding could also theoretically contribute to the neutralization given that autoantibodies are detected in epilepsy (42–44). Additional neutralization assays using S-pseudotyped viruses are on-going to confirm this observation.
We then further sub-divided these patient clusters into three groups: Non-Responders, Partial Responders, and Responders based on their similarity to newly infected SARS CoV-2 patients (Day 1-9 from testing positive) with respect to SARS-CoV-2 neutralization (Table S1). Sera from 3% of all subjects (N=16, Cluster 3) did not neutralize binding of SARS CoV-2 S to ACE2 (Non-Responders, Fig. 6B). Sixty eight percent of all subjects were Partial Responders, exhibiting significantly higher neutralizing activity than Non-Responders, yet significantly lower neutralizing activity than newly infected COVID-19 patients (Fig. 6B). Twenty-nine percent of subjects were Responders, exhibiting significantly higher neutralizing activity than Non-Responders and comparable activity to that of newly infected COVID-19 patients (Fig. 6B). We asked what pattern of sCoV reactivity against their respective S proteins discriminated these groups. We found no statistical difference in the levels of antibodies against OC43-S, 229E-S, HKU-1-S, or NL63-S (Fig. 6C). Moreover, the vast majority of cross-reactive SARS-CoV-2 antibodies were directed against the N antigen (Tables 2 and S7). These data suggested that absolute abundance of sCoV anti-S IgG antibodies did not dictate neutralization.
To identify critical features underlying neutralization response, we used Pearson’s correlation analysis to determine major linear correlations as well as signed distance correlation to obtain information about linear, non-linear and indirect, and more distant associations. Through these analyses, we were able to discern a pattern of seroprevalence to sCoVs that closely associated with the capacity to neutralize S / ACE2 interactions. First, the percent inhibition of S / ACE2 binding in the Responder group positively associated with higher levels of SARS-CoV-2 cross-reactive IgG Spike and IgG RBD antibodies (at a Pearson’s correlation above 0.2 and p<0.05), as expected (Fig. 6D), and additionally with abundances of IgM RBD, IgM S and IgA S antibodies (at a distance correlation level above 0.2 and p<0.05) (Fig. 6E). Of these five neutralization correlates, abundances of OC43 S IgG associated with higher levels of SARS-CoV-2 cross-reactive IgG RBD and IgA NP. Additionally, levels of NL63 S and HKU-1 S IgG positively associated with levels of cross-reactive IgG NP (Fig. 6E). At lower distance correlation level of 0.17 there was also direct correlation between NL63 and the percent inhibition (not shown). Conversely, abundances of anti-229E S IgG negatively correlated with both SARS-CoV-2 IgM RBD (Pearson’s correlation, distance correlation, Fig. 6D) and IgG RBD (distance correlation, Fig. 6E). Finally, HKU-1 S IgG levels negatively correlated (via distance correlation) with IgM S and IgG RBD (Fig. 6E).
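To illustrate why the two measures above can disagree, the following minimal Python sketch computes Pearson's correlation alongside the standard (unsigned) sample distance correlation. It is not the signed distance correlation used in our analysis, and the function names and toy data are assumptions for illustration only.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient (linear association)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def _centered_distances(v):
    """Pairwise absolute distances, double-centered by row, column, and grand means."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row = [sum(r) / n for r in d]  # the matrix is symmetric, so row means equal column means
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]

def distance_correlation(x, y):
    """Unsigned sample distance correlation (0 to 1); detects non-linear dependence."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)

    def mean_prod(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = max(mean_prod(a, b), 0.0)  # guard against tiny negative rounding errors
    dvar = math.sqrt(mean_prod(a, a) * mean_prod(b, b))
    return math.sqrt(dcov2 / dvar) if dvar > 0 else 0.0

# Toy example: a purely quadratic relationship has no linear correlation,
# yet the distance correlation is clearly non-zero.
x = [-2, -1, 0, 1, 2]
y = [v * v for v in x]
r_linear = pearson(x, y)
r_distance = distance_correlation(x, y)
```

In this toy case Pearson's coefficient is zero while the distance correlation is not, which is the kind of non-linear or indirect association that the distance-based analysis is meant to capture.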
Taken together, these data suggest that NL63 S and, to a lesser extent, OC43 S antibody levels positively associate with correlates of percent inhibition, whereas 229E and HKU-1 negatively associate with correlates of percent inhibition. These data are consistent with Selva et al., who have shown that elderly individuals with higher sCoV exposure show increased levels of cross-reactive SARS-CoV-2 IgA and IgG responses, while children show elevated SARS-CoV-2 IgM, which is associated with less frequent exposures (53). Here, we show that these responses link to OC43 S abundance. In our analysis, cross-reactive SARS-CoV-2 IgM levels negatively correlated with 229E and HKU-1 anti-S antibodies. Thus, a higher abundance of these antibodies is associated with a decreased neutralizing probability following exposure to these two sCoVs according to our model. Percent inhibition is in turn correlated with anti-SARS-CoV-2 IgM (IgM RBD and IgM Spike), IgA Spike and IgG (IgG Spike and IgG RBD), possibly indicating a different source of protection in our responder group members belonging to different age groups. These results are in excellent agreement with the further statistical analysis of feature significance for classification and regression to percent neutralization, determined using ReliefF. Although there was no statistically significant difference in the levels of antibodies against sCoVs between the three groups (Fig. 6C), ReliefF selection of the most significantly different features related to the continuous value of percent neutralization demonstrated that levels of NL63 and OC43 have a positive, i.e., significant, classification weight to percent neutralization (Fig. 6G). This result is supported by their positive correlation with IgG NP, IgG RBD and IgA NP.
Finally, IgG RBD, IgG Spike, IgM Spike and IgM RBD have the highest classification weight according to ReliefF analysis, in agreement with the correlation analysis in Figures 6D and 6E, providing additional confirmation that the pattern of sCoV S antibodies in Responders underlies neutralization capacity.
To test this hypothesis computationally, we asked whether we could use only sCoV S reactivity to predict neutralization of the SARS-CoV-2 S / ACE2 interaction (Fig. 6F). We used a Gaussian process regression (GPR) method to explore the functional relationship between sCoV S antibody levels and the percent inhibition of ACE2. The GPR model, using sCoV antibody levels as predictors, provided excellent agreement with percent inhibition of ACE2 for all three patient groups (Non-Responders, Partial Responders and Responders, Fig. 6G). Because GPR optimizes variable projection, weights, as well as covariance as a function of the latent variables, these results indicate that sCoV antibody measurements, when combined with latent variables (i.e., variables that are not measured but can be determined as functions of measured values), dramatically improve prediction of percent inhibition of ACE2 interaction. GPR analysis also provides a confidence interval of this prediction, i.e., the error level of the prediction (indicated in dashed lines in Fig. 6F). The significance of a variable in the developed GPR model, measured as the weight of the predictor, indicates major significance of NL63 in modeling, followed by OC43 and HKU-1. In this analysis, the predictor weight of 229E is significantly lower. This is once again in agreement with the correlation analyses demonstrating that 229E is indirectly negatively correlated with percent inhibition (Fig. 6E). Taken together, these data link the capacity of pre-pandemic sera to neutralize SARS-CoV-2 S / ACE2 binding to both specific patterns of elevated antibody levels against NL63 S and OC43 S as well as additional unknown factors, yet to be identified, conferred by this specific pattern of cross-reactive antibodies to SARS-CoV-2 S (Fig. 6E).
Discussion
Overall, our data highlight that there is a very large range in antibody levels against sCoVs. Importantly, measuring seroprevalence of sCoVs using a single antigen is prone to specificity challenges due to cross-reactivity between sCoV antigens. We show that only 67% to 76% of spike-positive samples were also positive when probed with a second antigen from the same virus (Fig. 2B). This indicates that single-antigen measurements over-report the seroprevalence of individual sCoVs. Additionally, while there is no statistical difference in prevalence between adults and elderly individuals, there appears to be a modest decrease in prevalence in younger individuals for 229E, OC43 and HKU-1. A recent paper by Selva et al. also observed that children have fewer exposures to sCoVs than older adults (53). Exactly how high levels of antibodies against sCoVs or recency of infection by sCoVs may influence COVID-19 disease severity remains unclear. However, our data highlight relatively high levels of cross-reactivity against certain SARS-CoV-2 antigens, with N being the most frequently detected at 11% overall prevalence, while S displayed about 5% prevalence (Table 2). These values are comparable to another study that showed 16.2% cross-reactivity to N and 4.2% to S (1). Small differences between the two studies can be accounted for by cohort composition, which affects certain antigen/isotype pairs disproportionately (Fig. 3, S2 and Tables S2-S6).
Although some studies have indicated that no SARS-CoV-2 neutralization activity was detected in pre-pandemic blood samples using various types of assays (1, 29, 53), the snELISA that we used was able to discern small variations from 0% to 45% in S-to-ACE2 binding inhibition. Although snELISAs are not capable of providing the same complete assessment of whole virus entry neutralization as measured when using live replicative virus, this methodology does enable an isolated and focused assessment of one parameter of neutralization, namely the binding of the viral spike to its host receptor. Small variations at this level can have profound implications during an actual infection in a human host that could perhaps be missed using cell lines to assess neutralization with live virus, the current gold standard. Spike-pseudotyped lentiviruses are also useful surrogates to study neutralization and viral entry, but lentiviruses bud from the cell surface and likely harbor different lipid and host molecule compositions in their envelope than SARS-CoV-2, which mostly egresses through exocytosis and possibly also through the lysosomal pathways (54, 55). Given these differences in the biochemical and immunological composition of the envelopes of lentiviruses and betacoronaviruses, neutralization data obtained with pseudotyped lentiviruses should not be considered as definitive. Nevertheless, in agreement with previous studies, we also did not identify a direct link between the abundance of sCoV S antibodies and neutralization of S/ACE2 interactions (Fig. 6C). However, our data do indicate that cross-reactive IgG against S and RBD of SARS-CoV-2 have a positive correlation with neutralization efficiency, as do, to a lesser degree, IgM against S and RBD, and IgA against S (Fig. 6D and 6E). An even more interesting observation is how NL63 correlates with cross-reactive N IgGs (Fig. 6D); a similar observation was also made by another group (30).
While N antibodies are unlikely to be neutralizing, their Fc effector function, their role in CD8+ T cell responses and /or the cytokines and immune factors involved in their production may all have an indirect influence on other antiviral pathways in a live infected host.
Despite a small number of studies reporting undetectable cross-neutralization or unlikely protection by sCoV antibodies (1, 29, 53), there is a large body of evidence that links prior exposure to sCoVs with milder COVID-19 disease severity and/or detectable immunity to SARS-CoV-2 (6, 9, 12, 27, 31, 32, 56, 57). Using machine learning approaches, we present here data that reconcile these differences by providing evidence that it is the relative pattern of prior sCoV antibody levels in sera, and not total levels, that predicts neutralization intensity levels. Additionally, GPR analysis shows that latent variables, which have not been identified in this work but can be inferred from the measures of sCoVs, provide a model that accurately predicts the percent inhibition of S / ACE2 binding.
While evidence continues to grow in support of the protective outcome of prior exposure to sCoVs, NL63 in particular, it is well understood that humoral responses are only one arm of the immune system. With latent factors being here identified to exert an influence over S/ACE2 binding, an understanding of how sCoV may diminish the severity of COVID-19 will require a broader investigation. Evidence that cross-reactive T cells also play a role in protection is also gaining momentum (31). In future work, it will be critical to evaluate if our GPR analysis holds true in predicting COVID-19 severity by way of sCoV antibody ratios.
Materials and Methods
Additional experimental procedures are described in detail in the Supplementary Material section.
Study approval
Use of human samples for this study was approved by the University of Ottawa Ethics Review Board: Certificates H-04-20-5727, H-04-21-6643 and H-07-20-6009.
Pre-pandemic cohort
The current study collected pre-pandemic serum or plasma samples separated into four cohorts for a total of 580 enrolled patients. The cohorts selected for inclusion in this study are described in Table 1. Serum and plasma samples were obtained from diverse sources including the Eastern Ontario Regional Laboratory Association (EORLA), the Ottawa Hospital (TOH), and the Icahn School of Medicine at Mount Sinai. Specimens were collected between April 1 2015 and March 4 2020. All the pre-pandemic patient samples were obtained prior to the COVID-19 pandemic in Ottawa. Pediatric samples were acquired from the BC Children’s Hospital Biobank (BCCHB) in Vancouver, BC, Canada (REB#: H-07-20-6009). HIV and HCV sera were obtained from The Ottawa Hospital (TOH) and Ottawa Hospital Research Institute (OHRI). Patient demographic information and clinical outcomes were obtained from hospital records.
For all ELISAs, the pre-COVID-19 serum samples were thawed to 4°C and diluted in dilution buffer [1X PBS + 0.1% Tween (Fisher #BP337-500) (PBS-T) + 1% non-fat milk] in 96-deep conical bottom 2 mL storage plates (Costar #3961) one hour prior to use.
Post-Pandemic samples
As a reference group for the neutralization and seasonal coronavirus experiments, serum samples collected from individuals after 2019 were used. Samples were acquired within several different research studies and separated by their provenance. The control group for the neutralization experiments (Fig. 5) consisted of SARS-CoV-2 positive, hospitalized individuals (mild and severe), who were further separated by day post positive PCR test. The individuals in the reference group in Figure 1 were SARS-CoV-2 positive convalescent individuals. The antibody levels of this group were monitored using the same protocol and platform used to assess the seroprevalence of sCoVs. The negative reference group comprised individuals with no history of SARS-CoV-2 infection who are currently enrolled in a surveillance study. Basic demographic information was collected and combined in Table S1.
Detection of anti-SARS-CoV-2 Responses in Serum Samples by ELISA
The ELISA procedure is a modified version of a recently published method (16). Pre-pandemic serum samples of all four cohorts were tested for the presence of antibody isotypes (IgA, IgM, IgG, and IgE) and IgG subtypes (IgG1, IgG2, IgG3, and IgG4) using this ELISA binding assay. SARS-CoV-2 antigens S, RBD, and N were diluted in sterile 1X PBS (Multicell #311-010-CL) to 2 µg/mL, and 50 µL per well was used to coat 96-well plates (VWR #62402-959), which were incubated on a shaker at 4°C overnight. The next day, the plates were washed three times with PBS-T and were subsequently blocked with buffer (PBS-T + 3% non-fat milk powder) for one hour. The patient serum samples were diluted 1/50 in dilution buffer with a final volume of 100 µL.
Accordingly, the control calibration antibodies were prepared with appropriate dilutions to generate the calibration curve (anti-COVID Ig). The dilutions for each curve were as follows: total IgG and IgG1 1/5000 followed by 1:2 serial dilutions; IgM, IgA, and IgG2 1/4000 followed by 1:2 serial dilutions; and IgG3 and IgG4 1/10,000 followed by 1:2 serial dilutions. Blocking solution on ELISA plates was removed after one hour followed by the addition of 100 µL of the diluted serum samples and control antibodies to the appropriate wells. The plates were incubated with samples for two hours at room temperature (RT) on a shaker (700 rpm). After two hours, plates were washed three times with 200 µL PBS-T followed by the addition of secondary-HRP antibodies (1:3000) diluted in dilution buffer. 50 µL of the appropriate diluted secondary antibody-HRP was added to each well and the plates were incubated at RT for one hour with shaking (700 rpm). After one hour of incubation, the plates were washed four times and developed using OPD tablets (Sigma, P9187) dissolved in 20 mL of water for injection (WFI) (Thermo Fisher A1287301). After ten minutes of incubation in the dark, 50 µL of 3M HCl (Fisher #7647-01) was added to each well to stop the reaction and the optical density (OD) at 490 nm was measured using a BIO-TEK Power Wave XS2 Plate Reader. Wells filled with dilution buffer in place of serum represented background controls, and their values were subtracted from the patient serum values.
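The calibration dilution series described above can be written out explicitly. The following Python sketch lists the starting dilutions quoted in the text and generates successive two-fold dilutions; the number of points per curve (`n_points`) is not stated in this section and is an assumption for illustration, as are the function and variable names.

```python
from fractions import Fraction

# Starting dilutions (as denominators) quoted in the text for each calibration curve.
START_DENOMINATORS = {
    "IgG": 5000, "IgG1": 5000,
    "IgM": 4000, "IgA": 4000, "IgG2": 4000,
    "IgG3": 10000, "IgG4": 10000,
}

def serial_dilutions(start_denominator, n_points, factor=2):
    """Return n_points dilutions, each `factor`-fold more dilute than the previous one."""
    return [Fraction(1, start_denominator * factor ** i) for i in range(n_points)]

# n_points=4 is an assumption; the curve length is not given in the text.
igg_curve = serial_dilutions(START_DENOMINATORS["IgG"], n_points=4)
```

With these assumptions the total IgG curve runs 1/5000, 1/10,000, 1/20,000, 1/40,000.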
Calculation of Cut-Off Values
To establish consistent and unbiased cut-offs, a systematic approach was used for each combination of antigen and isotype. Using the blank-subtracted values from the serological assays, a preliminary cut-off was established at 2 standard deviations (2SD) above the mean of all values. Values over that threshold were then excluded from the calculation of the final cut-off. To establish the final cut-off, the mean and standard deviation were recalculated using exclusively the values under the preliminary cut-off. The final cut-off was then set at 2SD above the mean of this distribution. Values at or over the final cut-off were considered positive (ratio to cut-off equal to or higher than 1) and values under it were considered negative (ratio to cut-off lower than 1).
To establish the seasonal coronavirus cut-offs, a set of samples from children 1 to 2 years of age was used as the reference population, because of the high prevalence of seropositivity in adults. The same two-pass exclusion strategy at 2SD of the mean described above was applied to that reference group.
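The two-pass cut-off calculation can be illustrated with a minimal Python sketch. It assumes that "2SD of the mean" denotes the mean plus two sample standard deviations; the function names and the example readings are hypothetical and are not taken from the study data.

```python
from statistics import mean, stdev

def two_pass_cutoff(values, n_sd=2.0):
    """Return (preliminary, final) cut-offs for blank-subtracted readings."""
    # Pass 1: preliminary cut-off from all values (assumed: mean + 2 SD).
    preliminary = mean(values) + n_sd * stdev(values)
    # Exclude readings above the preliminary cut-off.
    retained = [v for v in values if v <= preliminary]
    # Pass 2: recompute mean and SD on the retained readings only.
    final = mean(retained) + n_sd * stdev(retained)
    return preliminary, final

def ratio_to_cutoff(value, cutoff):
    """Ratio >= 1 is called positive, ratio < 1 negative."""
    return value / cutoff

# Hypothetical OD readings: nine low values and one clearly reactive sample.
readings = [0.05, 0.06, 0.04, 0.07, 0.05, 0.06, 0.05, 0.04, 0.06, 0.90]
preliminary, final = two_pass_cutoff(readings)
calls = ["positive" if ratio_to_cutoff(v, final) >= 1 else "negative"
         for v in readings]
```

In this example the single high reading inflates the preliminary cut-off, is excluded in the second pass, and is then the only sample called positive against the final cut-off.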
Surrogate Neutralization ELISA (snELISA) assay
To evaluate potential neutralization activity against the SARS-CoV-2 S protein, we carried out a surrogate neutralization ELISA based on a published method (25). The assay was performed using 384-well Immuno plates (Thermo Fisher, 460372) coated with full SARS-CoV-2 S protein diluted in PBS-T (30 µL/well at 2 µg/mL), followed by overnight incubation with rotation at 4°C. The next day, plates were washed two times with 50 µL of PBS-T and incubated with 40 µL of blocking buffer for one hour on a plate shaker at 400 rpm. After incubation, the plates were washed with PBS-T, and pre-COVID-19 patient serum samples diluted 1:4 in dilution buffer were added (20 µL/well). The plates were incubated for a further hour and washed three times with PBS-T. Biotinylated ACE2 was diluted in dilution buffer to 0.05 ng/µL, and 20 µL was added to each well (1 ng/well), followed by a one-hour incubation (400 rpm). The plates were then washed three more times with PBS-T and incubated with 20 µL of 5 ng/µL Streptavidin-Peroxidase polymer (Sigma #S2438) diluted in dilution buffer for one hour at 400 rpm. After incubation, the plates were washed three times with 60 µL of PBS-T. To measure the luminescence signal (RLU), 20 µL of SuperSignal ELISA Pico Chemiluminescent substrate diluted 1:2 in PBS was added to each well. Neutralizing activity was determined by calculating the % inhibition against SARS-CoV-2 S proteins using the following equation:
Statistical and Machine learning analyses
Data analysis was conducted using GraphPad Prism v9, R and Matlab 2021a (MathWorks Inc.). Univariate analysis was performed using one-way or two-way ANOVA where appropriate. Welch corrections were applied to all analyses. Post hoc tests were Games-Howell's multiple comparisons when group sizes were greater than 50 or Dunnett's T3 when less than 50. Sample clustering was performed using the fuzzy c-means (FCM) clustering method (fcm function running under Matlab). FCM allows each feature to belong to more than one group by providing a degree of membership, or "belonging", to each cluster, maximizing proximity between similar features and distance between dissimilar features. FCM is based on the minimization of the objective function $J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \lVert x_i - c_j \rVert^{2}$, where N is the number of features, C is the number of clusters, m ∈ (1, ∞) is the "fuzzyfication" factor (with FCM approaching crisp K-means clustering as m approaches 1), and $u_{ij}$ is the membership degree of feature $x_i$ in cluster j, with $c_j$ defining the cluster center. A higher membership value indicates stronger belonging to the cluster, with a membership value of 1 indicating that the feature is associated only with that single cluster. Sample clustering was based on serum antibody cross-reactivity to the three SARS-CoV-2 antigens (N, RBD, Spike) for IgA, IgM and IgG and on the capacity of these antibodies to inhibit Spike-ACE2 binding, measured for 507 samples. The FCM fuzzyfication factor m was 1.3, and FCM optimization ran until the minimum improvement in the objective function was 1e-5.
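The FCM iteration described above can be sketched in plain Python as follows. This is an illustrative re-implementation and not the Matlab fcm routine used in the study: the function name, the deterministic choice of initial centers (Matlab's routine starts from random memberships) and the toy data are assumptions made only for this example. The defaults m = 1.3 and a minimum improvement of 1e-5 follow the text.

```python
def fuzzy_c_means(points, n_clusters, m=1.3, min_improvement=1e-5, max_iter=100):
    """Minimal fuzzy c-means; returns (centers, memberships, objective)."""
    if m <= 1.0:
        raise ValueError("fuzzification factor m must be greater than 1")
    n, dim = len(points), len(points[0])
    # Assumption: start from evenly spaced samples instead of random memberships.
    centers = [list(points[(j * (n - 1)) // max(n_clusters - 1, 1)])
               for j in range(n_clusters)]
    exponent = 1.0 / (m - 1.0)
    previous = float("inf")
    for _ in range(max_iter):
        # Squared Euclidean distances, floored to avoid division by zero.
        d2 = [[max(sum((p[k] - c[k]) ** 2 for k in range(dim)), 1e-12)
               for c in centers] for p in points]
        # Membership update: u_ij proportional to (1 / d_ij^2)^(1 / (m - 1)).
        memberships = []
        for row in d2:
            inverse = [(1.0 / d) ** exponent for d in row]
            total = sum(inverse)
            memberships.append([v / total for v in inverse])
        # Objective J_m = sum_i sum_j u_ij^m * ||x_i - c_j||^2.
        objective = sum(memberships[i][j] ** m * d2[i][j]
                        for i in range(n) for j in range(n_clusters))
        if abs(previous - objective) < min_improvement:
            break
        previous = objective
        # Center update: mean of the samples weighted by u_ij^m.
        centers = []
        for j in range(n_clusters):
            w = [memberships[i][j] ** m for i in range(n)]
            centers.append([sum(w[i] * points[i][k] for i in range(n)) / sum(w)
                            for k in range(dim)])
    return centers, memberships, objective

# Toy data: two well-separated groups of three samples each.
demo = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers, memberships, objective = fuzzy_c_means(demo, n_clusters=2)
```

Each row of memberships sums to 1, so a sample lying between clusters receives intermediate values, whereas the well-separated toy samples receive memberships close to 1 for their own cluster.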
Correlation analysis was performed using Pearson and signed distance correlation metrics calculated on z-score normalized features within each group, with samples divided into Responders and a combined set of Non-responders and Partial responders. Signed distance correlation was performed using an in-house Matlab routine based on the distance correlation method presented by Székely and Rizzo (Székely, et al (2007). Ann. Statistics. 35 (6): 2769–2794.), with correlation signs derived from Pearson correlation analysis. Distance correlation p-values were calculated using the Student's t cumulative distribution function (tcdf function in Matlab). Correlation was determined for 507 samples. Of these, 12 samples were missing measurements for sCoV. Correlation calculations were therefore performed both for the 495 samples with complete data and for the KNN-imputed dataset (with N=10 using Euclidean distance). The two approaches gave the same result. Pearson and distance correlations with p<0.05 and an absolute value above 0.2 are shown in Figure 5. Correlation was performed separately for the Responder group (neutralization value ≥ 26%) and the remaining subjects (neutralization value < 26%). The correlation values shown are for the Responder group. The combined group of Non-responders and Partial responders showed no correlations for the represented groups at the thresholds shown.
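A signed distance correlation of this kind can be sketched as follows. The in-house Matlab routine is not reproduced in the text, so this Python version is only an assumed illustration: it computes the sample distance correlation of Székely et al. for two univariate features and attaches the sign of the Pearson coefficient. All function names and example values are hypothetical, and the p-value calculation is omitted.

```python
from math import sqrt

def _double_centered(values):
    """Pairwise absolute distances with row, column and grand means removed."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in dist]  # equal to column means (symmetry)
    grand_mean = sum(row_means) / n
    return [[dist[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]

def _mean_product(a, b):
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)

def distance_correlation(x, y):
    """Sample distance correlation; lies between 0 and 1."""
    a, b = _double_centered(x), _double_centered(y)
    denom = sqrt(_mean_product(a, a) * _mean_product(b, b))
    if denom == 0.0:
        return 0.0
    return sqrt(max(_mean_product(a, b), 0.0) / denom)

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def signed_distance_correlation(x, y):
    """Distance correlation carrying the sign of the Pearson coefficient."""
    sign = -1.0 if pearson(x, y) < 0 else 1.0
    return sign * distance_correlation(x, y)

# A decreasing linear relation gives a signed distance correlation of -1.
x_lin = [1.0, 2.0, 3.0, 4.0, 5.0]
y_lin = [-2.0 * v + 1.0 for v in x_lin]
linear_result = signed_distance_correlation(x_lin, y_lin)

# A symmetric quadratic relation has zero Pearson correlation but a clearly
# non-zero distance correlation.
x_quad = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_quad = [v * v for v in x_quad]
quad_pearson = pearson(x_quad, y_quad)
quad_dcor = distance_correlation(x_quad, y_quad)
```

The quadratic example shows why the two metrics are reported together: distance correlation detects dependence that a linear coefficient misses, while the Pearson sign indicates the direction of the association when one exists.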
Feature selection was performed using the machine learning method ReliefF (Robnik-Sikonja, M., and I. Kononenko. (2003). "Theoretical and empirical analysis of ReliefF and RReliefF." Machine Learning, 53, 23–69.). ReliefF was used to predict the rank, i.e. the within-group significance, of each feature in the training set: serum antibody cross-reactivity to the three SARS-CoV-2 antigens (N, RBD, Spike) for IgA, IgM and IgG and, separately, antibodies to the seasonal viruses (OC43, HKU-1, 229E, NL63). ReliefF provides feature selection for regression to a continuous variable, in this case the % inhibition of ACE2-Spike protein interaction, and ranks predictor features by weight for regression to the response vector. A negative predictor weight indicates that a feature is not a good predictor, and large positive weights are assigned to important predictors. A feature's weight decreases if its value differs more between nearby instances of the same class than between nearby instances of the other class. Feature selection was performed on data following imputation of missing seasonal coronavirus antibody measurements using the KNN method with N=10 and Euclidean distance.
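The KNN imputation step can be illustrated with a short sketch. It assumes that each missing value is replaced by the unweighted mean of that feature across the k nearest complete samples, with Euclidean distance computed over the features observed in the incomplete sample; the Matlab routine used in the study may weight neighbours differently. The function name and the example values are hypothetical, and k=2 is used only to keep the example small (the study used N=10).

```python
from math import isnan, sqrt

def knn_impute(rows, k=10):
    """Fill NaN entries using the mean of the k nearest complete rows."""
    complete = [r for r in rows if not any(isnan(v) for v in r)]
    if not complete:
        raise ValueError("at least one complete row is required")
    imputed = []
    for r in rows:
        missing = [j for j, v in enumerate(r) if isnan(v)]
        if not missing:
            imputed.append(list(r))
            continue
        observed = [j for j in range(len(r)) if j not in missing]
        # Euclidean distance over the features observed in this row.
        neighbours = sorted(
            complete,
            key=lambda c: sqrt(sum((c[j] - r[j]) ** 2 for j in observed)),
        )[:k]
        filled = list(r)
        for j in missing:
            filled[j] = sum(nb[j] for nb in neighbours) / len(neighbours)
        imputed.append(filled)
    return imputed

# Hypothetical normalized IgG values for four antigens; the last sample
# lacks the fourth measurement.
samples = [
    [1.0, 1.0, 1.0, 1.0],
    [1.1, 1.0, 1.0, 1.2],
    [5.0, 5.0, 5.0, 5.0],
    [5.1, 5.0, 5.0, 5.2],
    [1.0, 1.0, 1.0, float("nan")],
]
filled_samples = knn_impute(samples, k=2)
```

Here the incomplete sample is closest to the first two rows, so its missing value is filled with the mean of their fourth measurement.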
Gaussian process regression (GPR) was used to develop a model of the percent inhibition of ACE2-Spike protein interaction from the antibody levels for seasonal viruses. GPR is a nonparametric, probabilistic model that introduces latent variables as a function of the measured features (here the seasonal virus antibody measurements), where the new feature function, f(x), has a joint Gaussian distribution (for all features) with an optimized kernel that ensures feature-distance-dependent correlation. The optimized model is $y = h(x)^{T}\beta + f(x)$, where y is the fitted function (target), x are the variables, in this case the measurements of seasonal virus antibodies, h(x) is a set of basis functions that transform the original feature vector x, and β is the vector of basis function coefficients (weights) for the contribution of h(x) to the model. Following optimization, the kernel function used in the model was the rational quadratic. Both feature selection and GPR were performed for 507 samples, where the missing sCoV values for 12 samples were imputed using KNN with N=10 and the Euclidean distance measure. GPR provides the regression model with confidence intervals as well as predictor weights for the model.
Sample classification was performed using a two-layer neural network with 10 hidden neurons, trained with the Levenberg-Marquardt backpropagation algorithm with cross-validation, with samples divided into 70% training, 15% test and 15% validation sets. Classification was performed with sample labels of responsive and non-responsive, using seasonal virus measurements only.
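For reference, the GPR model with a rational quadratic kernel can be written out as below. The kernel is given in its standard textbook form; the symbols $\sigma_f$ (signal standard deviation), $\sigma_l$ (characteristic length scale) and $\alpha$ (scale-mixture parameter) are notation assumed here, and the fitted hyperparameter values are not reported in this section.

```latex
y = h(x)^{T}\beta + f(x), \qquad f(x) \sim \mathcal{GP}\bigl(0,\, k(x, x')\bigr)

k_{\mathrm{RQ}}(x, x') = \sigma_f^{2}
  \left(1 + \frac{\lVert x - x' \rVert^{2}}{2\,\alpha\,\sigma_l^{2}}\right)^{-\alpha}
```

Under this kernel the correlation between the latent values f(x) and f(x') decays with the Euclidean distance between the two antibody profiles, which is the feature-distance-dependent correlation referred to in the text.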
Additional materials and methods are presented in Supplementary Materials.
Data Availability
All raw data from this study will be made publicly available upon request.
Funding Sources
This study was supported in part by a COVID-19 Rapid Response grant from the Canadian Institutes of Health Research (CIHR) and a grant supplement from the Canadian Immunity Task Force (CITF) to M-A Langlois, and by a National Research Council of Canada Collaborative R&D Initiative Pandemic Response Challenge Program grant to SAL Bennett and M Cuperlovic-Culf (PR031-1).
Author contributions
YG, VS, EM and GL carried out the experiments. M-AL and YG designed the study. YG, VS, SALB, MCC and M-AL wrote the manuscript. YD contributed to the design of the spike, N and ACE2 constructs and to the expression and purification procedures. RAB sourced specimens and co-designed the high-throughput analysis. AMC facilitated serum sample processing. SALB and MCC contributed to sample collection and data processing, and performed all of the sCoV analyses with YG and M-AL. All of the authors have edited and contributed to the review of the manuscript.
SUPPLEMENTARY MATERIALS
Supplemental Methods
SARS-CoV-2 ELISA Antigens and Antibodies
To evaluate the functional antibody responses of sCoV antibodies against different SARS-CoV-2 proteins, the SARS-CoV-2 S, RBD, and N proteins were used as coating antigens (see Antigen Production). The following secondary and control antibodies were used in the ELISA experiments.
Secondary antibodies
Anti-human IgG-HRP (NRC anti-hIgG#5-HRP fusion), anti-human IgA-HRP (Jackson ImmunoResearch, 109-035-011), anti-human IgM-HRP (Jackson ImmunoResearch, 109-035-129), anti-human IgE-HRP (Sigma A9667-2ML), anti-human IgG1-HRP (Southern Biotek 9054-05), anti-human IgG2-HRP (Southern Biotek 9060-05), anti-human IgG3-HRP (Southern Biotek, 9210-05), and anti-human IgG4-HRP (Southern Biotek, 9200-05).
Control antibodies
Anti-SARS-CoV-2 S CR3022 Human IgG1 (Absolute Antibody, Ab01680-10.0), Anti-SARS-CoV-2 S CR3022 Human IgA (Absolute Antibody, Ab01680-16.0), Anti-SARS-CoV-2 S CR3022 Human IgM (Absolute Antibody, Ab01680-15.0), Anti-SARS-CoV-2 S CR3022 Human IgE (Absolute Antibody, Ab01680-14.0), Anti-SARS-CoV-2 S CR3022 Human IgG1 (Absolute Antibody, Ab01680-10.0), Anti-SARS-CoV-2 S CR3022 Human IgG2 (Absolute Antibody, Ab01680-11.0), Anti-SARS-CoV-2 S CR3022 Human IgG3 (Absolute Antibody, Ab01680-12.1), and Anti-SARS-CoV-2 S CR3022 Human IgG4 (Absolute Antibody, Ab01680-13.12).
Presence of Reactive IgGs to sCoVs
To determine whether the pre-COVID-19-pandemic sera contained IgGs against the four sCoVs, the S proteins of 229E and NL63 were purchased from Creative Diagnostics (DAGC134-HCoV-229E; DAGC133-NL63), while the OC43 and HKU-1 S proteins were produced for the study (see Antigen Production below). Using our High Throughput Serological Platform (Hamilton Microlab Star, Biotek 405Ls), we automated the ELISA protocol described in this paper with slight modifications. The serological assay was performed in 384-well high-binding plates (Thermo Fisher, 460372) using a volume of 12.5 µL to coat each well. Plates were blocked using 80 µL of blocking buffer per well. The concentrations of antigen, antibodies, and sample diluents were kept as above, although the volume was reduced to 10 µL per well. To develop the plates, a luminescent substrate (Thermo Fisher, 37069) was used and RLU was measured with a Biotek Neo2 plate reader.
Antigen Production
To produce the RBD of SARS-CoV-2, a plasmid encoding the RBD (MN908947; amino acids 319-541) with an N-terminal secretory signal sequence was generously provided by Dr. Florian Krammer (Mount Sinai, NYC). The 229E RBD (P15423) was cloned into the pCAGGS plasmid with a C-terminal hexa-His tag and an N-terminal secretory signal and transfected into 293F cells maintained in Freestyle 293 expression media (Thermo Fisher, 12338018) at 37°C, 7% CO2, with shaking (125 rpm). A total of 600 million cells in 200 mL of media were transfected with 200 µg of the respective plasmid using ExpiFectamine (Thermo Fisher, 14525). Three days post-transfection, the cells were spun down at 4000g for 20 min at 4°C and the supernatants were filtered through a 0.22 µm Stericup vacuum filter (Millipore Sigma, S2GPU10RE). The filtered supernatant was purified using a Ni-NTA resin (Qiagen, 30210) and washed four times with a washing buffer containing 20 mM imidazole. The protein of interest was then eluted with three column volumes of elution buffer containing 234 mM imidazole. The eluted volume was concentrated and buffer-exchanged into PBS using a 10 kDa Amicon filter (Millipore Sigma, UFC901008). Concentration was measured and integrity was verified by SDS-PAGE.
The spike ectodomain (SARS2: MN908947; OC43: AAT84362.1; HKU1: Q0ZME7.1) cDNA constructs, with the furin site mutated, two stabilizing prefusion proline mutations, human resistin as the trimerization partner and a C-terminal FLAG-(His)6 (SARS2) or FLAG-Twin Strep Tag-(His)6 (OC43 and HKU1) tag, were cloned into the pTT241 vector and transfected into CHO2353 cells to generate stable pools. SARS2 spike was purified by IMAC as described previously, while OC43 and HKU1 spikes were purified by IMAC followed by StrepTrap HP (Cytiva) affinity chromatography according to the manufacturer's instructions. All purification steps were performed at room temperature. Integrity and purity of the purified spikes were analyzed by SDS-PAGE and analytical size-exclusion ultra-high performance liquid chromatography (SEC-UPLC).
The nucleocapsid (SARS2: YP_009724397; OC43: AY391777; HKU1: HM034837) cDNAs with a C-terminal FLAG-Twin Strep Tag-(HisG)6 tag were synthesized by Genscript (Cricetulus griseus codon bias) and cloned into the pTT5™ expression plasmid. Expression was achieved by transient gene expression in CHO55E1 cells using polyethylenimine as the transfection reagent. The protein was purified from the clarified cell culture supernatant harvested at day 7 post-transfection. Following centrifugation and filtration, the clarified supernatant was loaded on a Nickel Sepharose Excel column (Cytiva Life Sciences). The column was washed with 50 mM sodium phosphate buffer pH 7.0 containing 25 mM imidazole and 300 mM NaCl, and the protein was eluted with 50 mM sodium phosphate buffer pH 7.5 containing 300 mM imidazole and 300 mM NaCl. Nucleocapsid protein was further purified by affinity chromatography on a StrepTrap™ XT column (Cytiva Life Sciences) equilibrated in 100 mM Tris pH 8.0, 150 mM NaCl (Buffer W). Following washing with 5 column volumes of Buffer W, bound proteins were eluted with Buffer W containing 50 mM biotin and 1 mM EDTA. Purified nucleocapsid protein was buffer-exchanged into DPBS using a Centripure P100 column, sterile-filtered through a 0.2 μm membrane, aliquoted and stored at -80°C. All purification steps were performed at room temperature. Integrity and purity of the purified nucleocapsid were analyzed by SDS-PAGE and analytical size-exclusion high performance liquid chromatography (SEC-HPLC).
The human ACE2 (Q9BYF1: aa 20-613: TIEE…WSPY) cDNA with an N-terminal human interleukin-10 signal peptide (MHSSALLCCLVLLTGVRA) followed by a Twin-StreptagII – (His)6 – FLAG tag was synthesized by Genscript with codon optimization for expression in CHO cells. A biotin acceptor peptide sequence (BAP: GLNDIFEAQKIEWHE) was added in-frame at the C-terminus of ACE2. The cDNA was cloned into pTT5, and the ACE2-BAP cDNA was expressed by transient gene expression in CHO55E1 cells as described above with the addition of 5% (w:w) of pTT5-BirA (E. coli biotin ligase) expression plasmid. Clarified culture supernatant harvested at 8 days post-transfection was purified by IMAC on Nickel Sepharose Excel as described above. The IMAC eluate was loaded on a Strep-Tactin XT Superflow column (IBA, Göttingen, Germany), following the manufacturer's instructions. Pooled Strep-Tactin eluate (buffer-exchanged into DPBS) was stored at -80°C.
Acknowledgments
The authors would like to thank Anne-Claude Gingras for helpful comments on the manuscript. M.-A.L. holds a Canada Research Chair in Molecular Virology and Intrinsic Immunity. S.A.L.B. holds a University Research Chair in Neurolipidomics. Y.G. holds a Canadian Institutes of Health Research (CIHR) Frederick Banting and Charles Best graduate scholarship (CGS-M).
Footnotes
Author middle initial added Patrick M. Giguere