Persistence and detection of anti-SARS-CoV-2 antibodies: immunoassay heterogeneity and implications for serosurveillance

Serologic studies have been critical in tracking the evolution of the COVID-19 pandemic. The reliability of serologic studies for quantifying the proportion of the population that has been infected depends on the extent of antibody decay as well as on assay performance in detecting both recent and older infections. Data on anti-SARS-CoV-2 antibody persistence remain sparse, especially from infected individuals with few to no symptoms. In a cohort of mostly mild/asymptomatic SARS-CoV-2-infected individuals tested with three widely used immunoassays, antibodies persisted for at least 8 months after infection, although detection depended on immunoassay choice, with one of them missing up to 40% of past infections. Simulations reveal that without appropriate adjustment for time-varying assay sensitivity, seroprevalence surveys may underestimate infection rates. As the immune landscape becomes more complex with naturally infected and vaccinated individuals, assay choice and appropriate adjustment for assay performance will become even more important for the interpretation of serologic studies.


Introduction
Serosurveys have played an important role during the COVID-19 pandemic by helping track the true extent of transmission in different populations (1)(2)(3)(4), and estimating key epidemiologic indicators such as the infection fatality ratio (IFR) (5)(6)(7). More than 400 serosurveys were published by the end of 2020, using dozens of different immunoassays, designed to detect antibodies targeting primarily all or part of the spike (S) or nucleocapsid (N) proteins of the SARS-CoV-2 virus (8). The accuracy of serology-based estimates depends on the immunoassay antibody targets and their performance in detecting both recent and historic infections. At this stage of the pandemic, successive epidemic waves in different parts of the world create a diverse mix of people infected at different times in the past, challenging SARS-CoV-2 serosurveillance because of this increasingly heterogeneous immunological landscape.
Anti-SARS-CoV-2 antibody levels tend to decay after the convalescent period, which increases the chance that immunoassays return a negative result (9,10). When immunoassays are used as a proxy for historic infections, as is typically done in serosurveys, these are considered false negatives. Furthermore, post-infection antibody kinetics appear to differ by infection severity, with severe infections leading to larger increases in antibodies than mild or asymptomatic infections (11). False negatives can lead to severe underestimates of the true infection attack rate unless appropriately understood and accounted for in analyses (12). However, few studies have characterized antibody kinetics past six months after infection (13,14), and few have described these kinetics in mild and asymptomatic infections (15)(16)(17)(18), which comprise the vast majority of infections in the community (19). Quantifying the changes in antibody detection for available immunoassays across the spectrum of infection severity and age is therefore essential in providing robust estimates of key epidemiological features of the COVID-19 pandemic (12,20).
Here we quantify changes in anti-SARS-CoV-2 antibody levels up to 9 months from plausible dates of infection in a cohort of seropositive individuals recruited through serosurveys conducted in Geneva, Switzerland. We use three different immunoassays to compare test performance, including the reduction in sensitivity as time since infection increases. Our results provide critical insights into the interpretation of population-based serologic studies as the immune landscape becomes more complex due to a mix of recent and distal infections and vaccination.

Cohort recruitment and characteristics
For this study, participants recruited between April and July 2020 through serosurveys conducted in the canton of Geneva, Switzerland (2,29), returned for a follow-up blood draw in November 2020 (Figure 1a). A total of 354 participants from those surveys who had a positive Euroimmun anti-S1 IgG (hereafter EI) test result at baseline constituted the EI-positive cohort. Participants in this cohort were aged between 18 and 84 years, and 52% (183/354) were women (Table 1). Fewer than half reported having had an RT-PCR test prior to the baseline visit (148/354, 42%), 58 of whom reported a positive result (90 negative, positivity rate of 39%). Ten percent (37/354) of participants reported having had no COVID-19-compatible symptoms before the baseline visit, while 69% reported 4 symptoms or more. The four most frequently reported COVID-19-compatible symptoms were fever (66%), fatigue (63%), headache (58%) and taste and smell loss (55%) (Figure S1). The majority of these participants did not require hospitalization (334/354, 94%), and only 2 required a stay in an ICU (0.6%, with missing information for 5 hospitalized participants). The median period between baseline and follow-up visits was 165 days (range: 115-224 days) (Figure 1b). Twenty percent (71/354) of participants reported having had a SARS-CoV-2 virologic test (RT-PCR or rapid antigen) between visits, including 4 (6%) reporting a positive result. No participant reported having been hospitalized between their baseline and follow-up visits.
We also followed a cohort of 187 participants who had a negative EI test result at baseline (EI-negative cohort), selected from previous participants to have a similar sex ratio and age range as the EI-positive cohort (Table 1). Three individuals (3/18) in this EI-negative cohort reported having a positive RT-PCR test prior to the baseline visit. Thirty-eight percent of participants reported having had no COVID-19-compatible symptoms and none reported having been hospitalized prior to the baseline visit. A total of 61 participants in the EI-negative cohort (33%) reported having had a SARS-CoV-2 virologic test (RT-PCR or rapid antigen) between visits, with 15 (25%) reporting a positive result.

Antibody detection and decay with three immunoassays
In addition to the Euroimmun anti-S1 IgG test, samples from both study visits were tested with two Roche total Ig assays, one targeting anti-RBD and the other anti-N antibodies (Materials and Methods). Within the EI-positive cohort, 93.2% (330/354) and 95.2% (337/354) were positive for the Roche-N and Roche-RBD assays, respectively (2-by-2 confusion matrices given in Fig. S2). At the individual level, 26% (91/354) of those in the EI-positive cohort became seronegative with the EI test at follow-up (i.e., sero-reverted, Table 2). Sero-reversions were much less frequent with the Roche-N assay (1.2%, 6/330) and none were detected with the Roche-RBD assay. We identified no significant differences in the proportion of participants sero-reverting across age groups and sex for all three tests (Table S1).
This version posted March 20, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Beyond seroreversions, we quantified the change in test readout between visits for the Roche-RBD immunoassay, as the other two tests are considered qualitative or semiquantitative by the manufacturers. We found that 17% (61/354) of participants had a significant decrease in their test readout and 66% (235/354) had a significant increase. When subdivided by sex and age class, women had a significantly lower proportion of decaying as well as a higher proportion of increasing Roche-RBD responses, and no differences were found between age classes (Table S1).

Figure 2: Test readout trajectories between baseline and follow-up visits.
The cohort was composed of 354 participants with a positive Euroimmun anti-S1 (EI) test at baseline. Test readout units and thresholds for positivity are assay-specific (Materials and Methods). Roche-RBD values below the limit of quantitation (0.4 U/mL) were set to the limit of quantitation for plotting and analysis. The dynamic range of both the EI and Roche-N tests is limited compared to that of the Roche-RBD test, leading to censoring of extremely high and low values. Baseline and follow-up samples were tested with different reagent lots of the EI immunoassay, whereas the same Roche-N and Roche-RBD reagent lots were used for all samples (Appendix, section S2). Trajectories for the EI-negative cohort are given in Figure S3.

Table 2: Serostatus and test readout changes between visits.
Serostatus changes are given with respect to the baseline number of positives (negatives) for reversion (conversion) for each test. Statistics by sex and age group are given in Table S1.

Estimation of time-varying test sensitivity and impact on serosurveillance
The trajectories of antibody detection in both cohorts, combined with data from assay validation studies ( We estimated large differences in sensitivity between tests, which depended on the delay between infection and the date of serologic assessment (Fig. 3a). After an initial rise in the first few weeks post infection, sensitivity peaked after 52 days ( . Sensitivity estimates of Roche-RBD remained close to the peak value up to the maximal modeled time, with an 88% posterior probability that sensitivity was still increasing after 284 days. Results for both Roche tests did not differ significantly when restricting the analysis to the 127 samples mentioned above (Fig. S8). We note that these estimates account for the 17.4% (95% CrI: 11.9-23.6) probability of SARS-CoV-2 infection between visits that is jointly estimated in the modelling framework (Materials and Methods, Appendix section S3).
To explore the potential implications of not accounting for time-varying sensitivity when estimating the proportion of the population infected through serosurveys, we simulated an array of serosurveys under different epidemic scenarios and using different immunoassays. We simulated serosurveys in epidemics shaped like that of Geneva, conducted either i) shortly after a single wave like the one in spring 2020, ii) long after this single wave, or iii) after two successive waves (Materials and Methods). When serosurveys were simulated just after a single epidemic wave (i), with times post-infection ranging from 0 to 115 days (Fig. 3b), seroprevalence estimates for all three tests using conventional adjustment for sensitivity had 95% credible intervals covering the true value for simulated seroprevalence between 10 and 30%, above which estimates had a slight tendency towards overestimation. In contrast, when simulating serosurveys long after a single wave (ii; 180-250 days post-infection), estimates based on the EI assay grew increasingly biased as the true underlying seroprevalence increased, with an underestimation of up to 15% when the simulated seroprevalence was 90% (Fig. 3c). Estimates based on the two Roche assays instead tended to overestimate seroprevalence by 5% in this scenario. Finally, the results obtained for the two-wave scenario (iii) fell in between, with a less severe seroprevalence underestimation for the EI test (10% underestimation at 90% seroprevalence). The two Roche tests remained in closer agreement with the true seroprevalence throughout the simulated range, with the 95% CrI covering the true value at seroprevalence below 30% and a slight overestimation of around 3% at higher seroprevalence (>60%).

Figure 3 (panels b-d): Simulation scenarios of seroprevalence estimation if the decay in sensitivity is not accounted for. Scenario b) is assumed to occur one month after the first epidemic wave peak in Geneva, with the corresponding distribution of days between infection and the serosurvey; in scenario c) the serosurvey occurs after a single wave and 180 days after the epidemic peak; and scenario d) assumes the serosurvey occurred one month after the peak of the second epidemic wave, yielding a bimodal distribution of days post infection (insets; the vertical dashed line at x=0 indicates infections that occurred on the serosurvey date).
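
Why the bias in these scenarios grows with the true seroprevalence can be sketched with a short derivation. This is a simplified illustration and not the authors' Bayesian model: it assumes the conventional adjustment behaves like a Rogan-Gladen-type correction with a single assumed sensitivity. All symbols are notation introduced here: $\pi$ is the true proportion infected, $q$ the expected proportion testing positive, $Sp$ the specificity, $Se_0$ the time-invariant sensitivity assumed in the adjustment, $Se(t)$ the sensitivity at $t$ days post-infection, and $f(t)$ the distribution of times since infection among infected individuals.

```latex
\begin{align}
\overline{Se} &= \int Se(t)\, f(t)\, dt \\
q &= \pi\, \overline{Se} + (1 - \pi)(1 - Sp) \\
\hat{\pi} &= \frac{q + Sp - 1}{Se_0 + Sp - 1}
           = \pi\, \frac{\overline{Se} + Sp - 1}{Se_0 + Sp - 1} \\
\hat{\pi} - \pi &= \pi\, \frac{\overline{Se} - Se_0}{Se_0 + Sp - 1}
\end{align}
```

Under these assumptions the absolute bias is proportional to $\pi$. It is negative when the effective sensitivity $\overline{Se}$ has decayed below $Se_0$ and positive when it exceeds $Se_0$, which is consistent in direction with the patterns described for the EI and Roche assays.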

Discussion
In a cohort of mostly mild/asymptomatic SARS-CoV-2-infected individuals, we found that antibodies targeting either the nucleocapsid (N) or the spike (S) proteins of the virus generally persist for at least 8 months after infection, although their detection depends on immunoassay choice. We found that the initial measurements taken within 4.5 months of participants' infections were consistent across the three assays used. However, results diverged between assays upon re-evaluation 4-8 months later, with about one-in-four participants sero-reverting according to the EI IgG assay, as opposed to the Roche anti-N and anti-RBD total Ig tests, which respectively detected only a few and no sero-reversions. Through simulation analyses, we show that without appropriate quantification and adjustment for time-varying assay sensitivity, seroprevalence surveys may underestimate the true number of cumulative infections in a population.
The persistence of seropositivity in this cohort of mostly mild infections provides encouraging prospects for the continued use and interpretation of population-based serosurveys to track the progress of the pandemic. However, we show that the differences in seropositivity rates between commercially available tests that had been observed in the early phase of the COVID-19 pandemic can be amplified at longer follow-up times (21)(22)(23). Decaying assay sensitivity like that observed here for the EI assay may lead to the underestimation of cumulative attack rates, as shown by our simulations, and seroprevalence corrections may be warranted (4). We note that the simulations in this work are based on the epidemic curve in Geneva, Switzerland, and that seroprevalence estimation bias caused by decaying sensitivity will depend on the timing of serosurveillance with respect to the number and amplitude of preceding epidemic waves, with larger and more distal waves being more severely underestimated. Even for mild infections, which are thought to elicit less robust immune responses (24), the sensitivity of the anti-RBD and anti-N total Ig Roche tests remained close to 100% more than 8 months post-infection. These results suggest that both Roche immunoassays are suitable for seroprevalence estimation at longer times post-exposure.
From an immunological perspective, these results raise questions related to the dynamics and duration of immunity against SARS-CoV-2 infection. Previous studies suggest that anti-RBD antibody measurement correlates with neutralization titers at least up to 4 to 6 months post exposure (11,25). Given that we did not perform neutralization assays, it remains unclear whether the persistence of antibodies we report here is a proxy of continued immune protection. We also highlight the discrepancy between the decrease of anti-S1 IgG as measured by the Euroimmun assay and the increase of total anti-RBD Ig measured by the Roche-RBD assay, the latter having been observed in other studies using the same test, which however correlated poorly with neutralizing antibody measurements (16). The increase in readout values from the Roche-RBD assay, and to a lesser extent for Roche-N, over time for many participants may be explained by aspects of the assay design which lead to preferential binding of higher avidity antibodies (which increase after infection) or by the fact that these assays measure total immunoglobulins rather than a single isotype (26).
medRxiv preprint doi: https://doi.org/10.1101/2021.03.16.21253710
Our results come with a number of limitations. While our estimates of seroreversion with the EI assay were in line with previously published data (10,27,28), we tested baseline and follow-up samples with the EI assay at different times and with different reagent lots (inter-lot coefficient of variation of 30%, Fig. S4). Furthermore, our internal quality control (pooled COVID-19 positive patient serum) in the follow-up lot had significantly lower readout values than most lots used for baseline samples (Supplementary Appendix section S2). To explore the potential effects of this lot-to-lot variability in our assessment of EI performance over time, we conducted sensitivity analyses by matching tests run in one baseline batch that had similarly low readout estimates as the follow-up batch (Fig. S5) and found similar estimates of the proportion who sero-reverted and of time-varying sensitivity (Fig. 3a and Supplement). While this reagent inter-lot variability did not appear to have a large impact on our specific results, standardization of readout values (e.g., through the use of monoclonal antibodies) can help ensure compatibility between lots and labs. Secondly, statistics on changes in serostatus and test response may have been influenced by (re-)exposures during the period between baseline and follow-up visits. We attempted to account for the observed 17% seroconversion rate among initially seronegative participants in the model of time-varying sensitivity but were unable to do so explicitly in other analyses. Finally, the cohorts in this study did not include children, whose long-term antibody dynamics have not yet been well documented in the published literature.
By quantifying anti-SARS-CoV-2 antibody persistence in a cohort of mostly mildly symptomatic and asymptomatic infections, we confirm that antibodies remain detectable at least 8 months post-infection. Using multiple immunoassays, we illustrate that test choice matters and can greatly affect the interpretation of results from population-level serologic studies, especially as the immune landscape becomes a more complex mix of recent and old infections. While the Roche anti-RBD total Ig assay used in these analyses appears to have excellent performance across the spectrum of clinical severity over time, measurement of antibody responses targeting the spike RBD alone in partially vaccinated populations will be difficult to interpret, and multi-epitope tests may become increasingly useful. Continued multi-assay, multi-epitope characterization of post-infection kinetics over longer periods will help ensure appropriate analysis and interpretation of data from serosurveillance efforts aimed at tracking the evolution of this pandemic.

Recruitment
The results presented here correspond to the follow-up of participants recruited between April and July 2020 at one of two previous serosurveys: SEROCoV-POP (2) and SEROCoV-WORK+ (29). Selected participants from these two studies who were seropositive on the Euroimmun anti-S1 IgG test at their first study visit (referred to as 'baseline') were invited to return for another serologic test in November 2020 (referred to as 'follow-up'). To estimate the infection risk in the community over the period between the two study visits, we also selected 187 participants initially negative on the Euroimmun anti-S1 test with similar sex ratio and age range who returned for a second visit. Procedures in all cases (first two serosurveys for baseline visit and follow-up visit) were similar: all participants gave written informed consent, completed a questionnaire and provided a venous blood sample. This study was approved by the Geneva Cantonal Commission for Research Ethics (CCER project number 2020-00881).

Statistical Analyses
We determined the proportion of participants who seroconverted (negative to positive) or seroreverted (positive to negative), and tested the significance of differences in proportions between sexes and age classes using a two-sample test for equality of proportions with continuity correction (Table S1). We also compared test readout values at each visit and classified each participant's response as decreasing, increasing or stable for Roche-RBD test readouts (the only quantitative test in this study). We assessed the significance of changes in test readout values between visits taking into account the inter-lot variance of our internal positive control serum. The coefficient of variation of the Roche-RBD test readout was set to 7.6% (Supplementary Fig. S4). Significance of response changes was based on the z-score of the difference between follow-up and baseline results at a significance level of 5%.
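
The two tests described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the z-score assumes that each readout has a standard deviation equal to the coefficient of variation times its value and that the two measurement errors are independent (the exact variance formula is not given in this section), and the proportion test is a standard chi-squared test with Yates continuity correction on one degree of freedom.

```python
import math

Z_CRIT_5PCT = 1.959964  # two-sided 5% critical value of the standard normal


def classify_readout_change(baseline, follow_up, cv=0.076):
    """Return (label, z) for the change between two readouts.

    Assumption: each readout has standard deviation cv * value and the two
    measurement errors are independent.
    """
    sd_diff = math.sqrt((cv * baseline) ** 2 + (cv * follow_up) ** 2)
    if sd_diff == 0:
        return "stable", 0.0
    z = (follow_up - baseline) / sd_diff
    if z > Z_CRIT_5PCT:
        return "increasing", z
    if z < -Z_CRIT_5PCT:
        return "decreasing", z
    return "stable", z


def two_sample_prop_test(x1, n1, x2, n2):
    """Chi-squared test for equality of two proportions with continuity correction.

    Returns (statistic, p_value) on 1 degree of freedom.
    """
    pooled = (x1 + x2) / (n1 + n2)
    if pooled in (0.0, 1.0):
        return 0.0, 1.0  # no variation in outcomes: nothing to test
    deviations = [abs(x1 - n1 * pooled), abs(x2 - n2 * pooled)]
    correction = min(0.5, *deviations)  # Yates correction, capped at the deviation
    stat = sum(
        (d - correction) ** 2 / (n * pooled * (1 - pooled))
        for d, n in zip(deviations, (n1, n2))
    )
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-squared survival function, 1 df
    return stat, p_value
```

Under this error model, a Roche-RBD readout going from 100 to 150 U/mL is labelled increasing, whereas a change from 100 to 105 U/mL is labelled stable.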
We developed a statistical model to jointly infer each test's specificity and sensitivity, accounting both for changes in sensitivity with time post-infection due to antibody decay and for unknown SARS-CoV-2 infection times, including the possibility of infection between visits (Fig. S6). Inference is drawn in a Bayesian framework that allows us to incorporate multiple sources of test validation data as well as RT-PCR test results when available (details given in Supplementary Appendix, sections S3 and S4).
We then used simulations to illustrate how seroprevalence estimates could be biased if only correcting for sensitivity and specificity using the Rogan-Gladen estimator (30) with data from typical validation studies in the literature and package inserts (single time-invariant sensitivity, short follow-up times and more representative of severe infections). We considered three hypothetical scenarios of serosurveillance sampling using the infection histories associated with Geneva's epidemic curve as an example: the first occurring one month after the peak of the first wave, the second as if it had occurred five months after the single wave, and the third one month after the peak of the second wave (Fig. 1). This results in distinct distributions of time since infection: unimodal around short (~1 month) and longer (~5 months) times post infection for the first and second scenarios, and bimodal with one peak at short and one at longer times (~8 months) post infection for the third. We then used these distributions to simulate test results based on our estimates of specificity and time-varying test sensitivity. We finally estimated the seroprevalence correcting for test performance, but using the conventional approach with a single value for sensitivity, inferred in a Bayesian hierarchical framework that incorporates multiple validation sources (31). For each scenario we simulated 2000 samples with seroprevalence ranging from 10 to 90% and compared these simulated data to the seroprevalence estimates that ignore changes in sensitivity. Details on seroprevalence estimation using the "conventional" approach are described in the Supplementary Appendix section S4.
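
The logic of this simulation can be sketched in a few lines. The following is a hedged illustration only: it replaces the Bayesian hierarchical adjustment with the plain Rogan-Gladen point estimator, draws times since infection uniformly from a range of days rather than from Geneva's epidemic curve, and uses a made-up sensitivity curve (`hypothetical_sensitivity`) instead of the estimated ones. All function names and parameter values are assumptions introduced for the example.

```python
import random


def rogan_gladen(apparent_prev, sensitivity, specificity):
    """Adjust an apparent prevalence for test error, truncated to [0, 1]."""
    estimate = (apparent_prev + specificity - 1.0) / (sensitivity + specificity - 1.0)
    return min(1.0, max(0.0, estimate))


def hypothetical_sensitivity(days, peak=0.95, plateau_days=60, daily_loss=0.001):
    """Made-up sensitivity curve: flat plateau, then linear decay (illustration only)."""
    return max(0.0, peak - daily_loss * max(0, days - plateau_days))


def simulate_estimate(true_prev, days_since_infection, n=2000,
                      specificity=0.99, assumed_sensitivity=0.95, seed=1):
    """Simulate one serosurvey and adjust it with a single, time-invariant sensitivity."""
    rng = random.Random(seed)
    positives = 0
    for _ in range(n):
        if rng.random() < true_prev:
            # Infected: detection depends on the time since infection.
            days = rng.choice(days_since_infection)
            positives += int(rng.random() < hypothetical_sensitivity(days))
        else:
            # Never infected: false positive with probability 1 - specificity.
            positives += int(rng.random() > specificity)
    return rogan_gladen(positives / n, assumed_sensitivity, specificity)


shortly_after = list(range(0, 116))   # 0-115 days post-infection
long_after = list(range(180, 251))    # 180-250 days post-infection
for prev in (0.1, 0.5, 0.9):
    print(prev,
          round(simulate_estimate(prev, shortly_after), 3),
          round(simulate_estimate(prev, long_after), 3))
```

With a decaying curve of this kind, the adjusted estimate for the late survey falls below the true value, and the gap widens as the simulated seroprevalence increases. The magnitudes depend entirely on the assumed curve and should not be read as estimates for any real assay.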