Abstract
Background The SARS-CoV-2 pandemic necessitated rapid and global responses across all areas of healthcare, including an unprecedented interest in serological immunoassays to detect antibodies to the virus. The dynamics of the immune response to SARS-CoV-2 are still not well understood.
Methods We measure SARS-CoV-2 antibody levels in plasma samples from 880 people in Northern Ireland by Roche Elecsys Anti-SARS-CoV-2 IgG/IgA/IgM, Abbott SARS-CoV-2 IgG and EuroImmun IgG SARS-CoV-2 ELISA immunoassays to analyse immune dynamics over time. Using these results, we develop a ‘pseudo gold standard’ reference cohort against which to assess immunoassay performance. We report performance metrics for the UK-RTC AbC-19 rapid lateral flow immunoassay (LFIA) against a characterised panel of 304 positives established using the ‘pseudo gold standard’ system and 350 negative samples.
Results We detect persistence of SARS-CoV-2 IgG up to 140 days (20 weeks) post infection, across all three antibody immunoassays, at levels up to 4.4 times the cut-off for a positive result by Roche measurement. Using our ‘pseudo gold standard’ cohort (n=348 positive, n=510 negative) we determine the sensitivity and specificity of the three commercial immunoassays used (EuroImmun: Sens. 98.9% [97.7-99.7%], Spec. 99.2% [98.4-99.8%]; Roche: Sens. 99.4% [98.6-100%], Spec. 96.7% [95.1-98.2%]; Abbott: Sens. 86.8% [83.1-90.2%], Spec. 99.2% [98.4-99.8%]). The UK-RTC AbC-19 lateral flow immunoassay shows a sensitivity of 97.70% (95.72%-99.34%) and specificity of 100% (100.00%-100.00%).
Conclusions Through comprehensive analysis of a large cohort of pre-pandemic and pandemic individuals, we show detectable levels of IgG antibodies, lasting up to 140 days, providing insight to immunity levels at later time points. We propose an alternative to RT-PCR positive status as a standard for assessing SARS-CoV-2 antibody assays and show strong performance metrics for the AbC-19 rapid test.
Introduction
The World Health Organization declared a pandemic in March 2020 due to severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), identified late 2019 in Wuhan, China, causing COVID-19 disease1,2.
A global race ensued to develop diagnostic assays, with the most common being viral RNA detection (RT-qPCR assays), to detect acute infection3. RT-qPCR assays are labour and reagent intensive, limited by a short temporal window for positive diagnosis, and exhibit potential for false negative results4. Evidence suggests sensitivity of RT-qPCR can be as low as 70%5. Lockdown measures and “flattening the curve” strategies meant many infected individuals were instructed to self-isolate and were not offered a diagnostic RT-qPCR, with much of the testing limited to patients admitted to hospital, who perhaps reflect a more severely infected cohort. Consequently, a potentially large number of cases were unconfirmed or undetected6.
The ability to accurately detect SARS-CoV-2 specific antibodies, which develop after an immune response is evoked, is vital for building biobanks of convalescent sera for treatment, monitoring immune response to infection and assessing responses to vaccination programmes. Knowledge of antibody levels, indicative of prior virus exposure, will play an important role in public health policy and understanding SARS-CoV-2 epidemiology, including determining seroprevalence. Knowledge of the timing for when antibody levels can be measured is essential to allow accurate reporting.
Commercial serology immunoassays are mostly laboratory-based and measure IgG antibody levels in plasma or serum. Alternatively, lateral flow immunoassays (LFIAs) require a finger prick blood sample and can be used at point-of-care (POC) or in the home, which is particularly important in the context of lockdown enforcement during the pandemic. Currently, a limited number of laboratory-based chemiluminescence immunoassays are approved for use in the UK, including the Roche Elecsys Anti-SARS-CoV-2 IgG/IgA/IgM against the SARS-CoV-2 Nucleocapsid antigenic region (Roche Diagnostics, Basel, Switzerland) and the Abbott SARS-CoV-2 IgG assay against the same antigenic region (Abbott Diagnostics, Abbott Park, IL, USA).
The complexity of the humoral immune response to SARS-CoV-2 is a much-debated topic. The ‘sero-silent’, those who do not make IgG, may remain asymptomatic or conversely may become particularly unwell and unfortunately die7. Others produce IgG against different antigenic regions of the virus such as the spike or nucleocapsid protein. Patients who remain asymptomatic may mount a humoral immune response which is short-lived, with detectable levels of antibody falling after 2 months8. This, alongside potentially low sensitivity and lack of RT-PCR test availability across the UK, has hindered development of well characterised serology samples known to contain IgG antibodies to SARS-CoV-2.
Herein, we describe use of Roche and Abbott commercial immunoassays, as well as the EuroImmun Anti-SARS-CoV-2 ELISA-IgG against the S1 domain of the spike antigenic protein of SARS-CoV-2 (EuroImmun UK, London, UK) to characterise a large number (880) of pre-pandemic and pandemic COVID-19 blood samples from within Northern Ireland and report on longevity of IgG antibodies detected. Presently, there is no gold standard assay for comparison. Therefore, we describe a ‘pseudo gold-standard’ against which to evaluate assays. Furthermore, we present results of an independent validation of the UK-RTC AbC-19 POC LFIA against a cohort of 304 known positives according to this ‘pseudo gold-standard’ system and 350 known negative samples for IgG to SARS-CoV-2.
Methods
Study Design
This study was approved by Ulster University Institutional Ethics committee (REC/20/0043) and South Birmingham REC (The PANDEMIC Study IRAS Project ID: 286041, Ref 20/WM/0184), and adhered to the Declaration of Helsinki and Good Clinical Practice.
Participant samples
The flow of participant samples is summarised in Figure S1. All participants provided informed consent with no adverse events. An online recruitment strategy was employed, with the study advertised through internal Ulster University email, website and social media. A BBC Newsline feature providing the pandemic study email address also prompted interest from the general population.
The first 800 respondents who expressed interest were provided with an online patient information sheet, consent form and health questionnaire and invited to register to attend a clinic. Exclusion criteria related to blood disorder or contraindication to giving a blood sample. To enrich the cohort for positive samples, further participants were invited if they had previously tested PCR positive or had the distinctive symptom of loss of taste and smell. Blood sampling clinics were held at locations around Northern Ireland between April and July 2020 resulting in collection of 263 10ml EDTA plasma samples from 263 separate study participants. A small cohort (n=19) of anonymised plasma samples were obtained from Avellino, USA. Additional anonymised plasma samples were obtained from Southern Health and Social Care Trust (SHSCT) Biochemistry Laboratory (n=195), and Northern Ireland Blood Transfusion Service (NIBTS, n=184) through convalescent plasma programs.
Pre-pandemic samples (prior to June 2019, n=136) were obtained from Ulster University ethics committee approved studies with ongoing consent and from NIBTS (n=200, more than 3 years old). Plasma samples were used at no more than 3 freeze-thaw cycles for all analyses reported within this manuscript.
Clinical information
Basic demographic information and data about positive RT-PCR result and time from symptom onset was provided by PANDEMIC study participants through the secure online questionnaire. Anonymised participant samples from USA, SHSCT and NIBTS were provided with age, gender and time since PCR-positive, where a previous test had been carried out.
Laboratory-based immunoassays
Details of laboratory immunoassays are summarised in supplementary methods and Table S1.
UK-RTC AbC-19 LFIA
UK-RTC AbC-19 POC LFIA testing was conducted at Ulster University according to manufacturer’s instructions (details in Table S1). Assays were performed as cohorts, with samples in batches of 10, with one researcher adding 2.5μL of plasma to the assay and a second adding 100μL of buffer immediately following sample addition. After 20 minutes, the strength of each resulting test line was scored from 0-10 according to a visual score card (scored by 3 researchers; Figure S2). A score ≥1 is positive. Details of samples for analytical specificity and sensitivity analysis are available in Supplementary methods.
Statistical analysis
As per Daniel WW9, a minimum sample size based on prevalence can be calculated using the following formula: $n = \frac{Z^{2} P (1 - P)}{d^{2}}$, where n = sample size, Z = Z statistic for a chosen level of confidence, P = estimated prevalence, and d = precision. Assuming a prevalence of SARS-CoV-2 of 10% and a precision of 5%, we estimate the required sample size at 99% confidence (Z = 2.58) to be 240 individuals. If the true prevalence is lower, at 5%, the estimated required sample size given a precision of 2.5% is 506 individuals. A minimum sample size of 200 known positives and 200 known negatives was required for validation as per MHRA guidelines for SARS-CoV-2 LFIA antibody immunoassays10.
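The sample size arithmetic can be checked with a short Python sketch. It assumes the standard prevalence-based formula n = Z²P(1−P)/d², which reproduces both figures quoted above; the function name `min_sample_size` is illustrative and the result is rounded up to the next whole participant.

```python
import math


def min_sample_size(z: float, prevalence: float, precision: float) -> int:
    """Minimum sample size n = Z^2 * P * (1 - P) / d^2, rounded up."""
    n = (z ** 2) * prevalence * (1.0 - prevalence) / (precision ** 2)
    return math.ceil(n)


# 99% confidence (Z = 2.58), 10% prevalence, 5% precision
print(min_sample_size(2.58, 0.10, 0.05))   # 240

# 99% confidence (Z = 2.58), 5% prevalence, 2.5% precision
print(min_sample_size(2.58, 0.05, 0.025))  # 506
```

Halving the precision roughly quadruples the required sample size, which is why the lower-prevalence scenario needs more participants despite the smaller P(1−P) term.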
Statistical analysis was conducted in R v 4.0.211. To assess discordance between test results, data were first filtered to include individuals with an Abbott test result in the range ≥0.25 & ≤1.4, with a 2 x 2 contingency table produced that comprised all possible combinations of [concordant|discordant] test results [within|outside of] this range. A p-value was derived via a Pearson χ2 test after 2000 p-value simulations via the stats package.
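As an illustration of the discordance test, the following Python sketch computes a Pearson χ2 statistic for a 2 x 2 table and a simulated p-value from 2000 random tables with the observed row and column totals held fixed. This is analogous in spirit to a simulated-p-value χ2 test in R, but it is not the authors' code, and the counts in the example table are hypothetical placeholders rather than study data.

```python
import random


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    cells = (
        (a, a + b, a + c),
        (b, a + b, b + d),
        (c, c + d, a + c),
        (d, c + d, b + d),
    )
    stat = 0.0
    for observed, row_total, col_total in cells:
        expected = row_total * col_total / n
        stat += (observed - expected) ** 2 / expected
    return stat


def simulated_p_value(a, b, c, d, n_sim=2000, seed=0):
    """Monte Carlo p-value: shuffle column labels so both margins stay fixed."""
    rng = random.Random(seed)
    observed_stat = chi_square_2x2(a, b, c, d)
    row1, col1, n = a + b, a + c, a + b + c + d
    # One flag per sample: True if the sample falls in column 1.
    in_col1 = [True] * col1 + [False] * (n - col1)
    as_extreme = 0
    for _ in range(n_sim):
        rng.shuffle(in_col1)
        # The first row1 shuffled samples form row 1.
        sim_a = sum(in_col1[:row1])
        sim_b = row1 - sim_a
        sim_c = col1 - sim_a
        sim_d = n - row1 - sim_c
        if chi_square_2x2(sim_a, sim_b, sim_c, sim_d) >= observed_stat - 1e-9:
            as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (as_extreme + 1) / (n_sim + 1)


# HYPOTHETICAL counts: rows = Abbott value within / outside 0.25-1.4,
# columns = concordant / discordant with the comparator assay.
p = simulated_p_value(20, 30, 780, 10)
print(f"simulated p-value: {p:.4f}")
```

With 2000 simulations the smallest reportable p-value is 1/2001, so a result at that floor is best read as "p<0.001" rather than as an exact figure.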
To define a gold-standard, a positive result was determined as any individual who was positive by 2 out of 3 immunoassays, while negative was defined as negative by 2 out of 3 immunoassays. EuroImmun Borderline results were excluded. Sensitivity, specificity, and accuracy were then derived via the pROC package, with 95% confidence intervals produced after 2000 bootstraps. ROC analysis was performed via the pROC package. To compare test result (Positive|Negative) to age, a binary logistic regression model was produced with test result as outcome – a p-value was then derived via χ2 ANOVA. To compare time against test result (encoded continuously), a linear regression was performed. We calculated median per time-period and then converted these to log [base 2] ratios against the positivity cut-off for each assay. All plots were generated via ggplot2 or custom functions using base R12.
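The two-out-of-three rule can be written down compactly. The sketch below is a minimal Python rendering under the assumption that each assay result is already encoded as the string "positive", "negative" or (EuroImmun only) "borderline"; the function name and the encoding are illustrative choices, not part of the study pipeline.

```python
from typing import Optional


def pseudo_gold_standard(euroimmun: str, roche: str, abbott: str) -> Optional[str]:
    """Classify a sample by agreement of at least 2 of 3 immunoassays.

    Returns None when the EuroImmun result is borderline, since those
    samples were excluded from the reference cohort.
    """
    if euroimmun == "borderline":
        return None
    calls = [euroimmun, roche, abbott]
    positives = calls.count("positive")
    return "positive" if positives >= 2 else "negative"


print(pseudo_gold_standard("positive", "positive", "negative"))    # positive
print(pseudo_gold_standard("negative", "positive", "negative"))    # negative
print(pseudo_gold_standard("borderline", "positive", "positive"))  # None
```

Because three binary calls always produce a majority, every non-borderline sample receives exactly one label.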
Results
Antibody levels in plasma from 880 individuals were assessed using the three SARS-CoV-2 immunoassays; EuroImmun IgG, Roche Elecsys IgG/IgM/IgA and Abbott Architect IgG (Table S1). This included a negative cohort of 223 pre-pandemic plasma samples collected and stored during 2017 to end of May 2019 to determine assay specificity. Of the 657 participants whose samples were collected post-pandemic, 265 (40.33%) previously tested RT-PCR positive with a range of 7-173 days since diagnosis. A total of 225 participants gave time since self-reported COVID-19 symptoms, with a range of 5-233 days from symptom onset, whilst 198 had no symptom or PCR data available.
Laboratory based antibody immunoassays
A positive result for antibody on one or more of the three laboratory immunoassays was recorded for 385/657 (58.6%) post-pandemic participants. By EuroImmun ELISA, 346 were positive, 20 borderline and 291 negative. The Roche assay detected 380 positive and 277 negative, whilst Abbott determined 310 positive and 347 negative (Table S2). The median age across all age groups combined was lower for participants testing positive across each of the immunoassays (median [sd] for positive versus negative, respectively: EuroImmun, 41 [13.16] vs 48 [12.95]; Roche, 42 [13.08] vs 48 [13.00]; Abbott, 41 [13.18] vs 47 [13.09]; Figure S3, p<0.0001). When segregated by age group, however, differences were less apparent in certain groups (Figure S4). Excluding the pre-pandemic cohort, this gap reduced but remained statistically significant (median [sd] for positive versus negative: EuroImmun, 41 [13.18] vs 45 [12.49]; Roche, 42 [13.15] vs 45 [12.49]; Abbott, 41 [13.26] vs 44 [12.63]; p<0.01). Of note, of 265 individuals with a previous positive RT-PCR result, 14 (5.2%) did not show detectable antibodies by any of the three immunoassays, with no association with age, gender or time between test and blood draw (data not shown).
The three commercial laboratory immunoassays provide a ratio value that increases with IgG antibody titre. When correlation between these values is assessed, good overall agreement is observed between the three immunoassays (Figure 1, Figure S5). As highlighted by Rosadas et al., we also see significant disagreement in the Abbott 0.25-1.4 range when compared to EuroImmun and Roche (Figure 1a,b; chi-square p-values: EuroImmun vs Abbott, p<0.001; Roche vs Abbott, p<0.001)13.
Duration of humoral response to SARS-CoV-2
We found IgG antibodies could still be detected in individuals (excluding pre-pandemic) across all three immunoassays used up to week 20 (day 140) (Figure 2). We note a statistically significant decrease in signal with respect to time across each assay (p-value [slope]): EuroImmun, p=0.036 [-0.785]; Roche, p=0.002 [-0.125]; Abbott, p<0.0001 [-3.585]. These remained statistically significant after adjustment for age. Antibody levels (expressed as a ratio of median result per timepoint divided by positivity cut off; Figure 2d) peaked at Week 1-2 for EuroImmun (1.33) and Abbott (1.64), though reached highest levels at Week 8-12 when measured by Roche (5.45). By week 21-24, median score for all tests had dropped below the positivity cut off, though a small number of RT-PCR positive samples remained above the positive cut off at these later timepoints (Figure 2).
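The ratio described here (median result per time period divided by the positivity cut-off, optionally on a log base 2 scale) can be sketched in Python as follows. The readings and the cut-off of 1.0 are hypothetical values chosen for easy arithmetic, not study data or a manufacturer threshold.

```python
import math
from statistics import median


def median_ratio_to_cutoff(values, cutoff):
    """Return (ratio, log2 ratio) of the median reading to the positivity cut-off."""
    ratio = median(values) / cutoff
    return ratio, math.log2(ratio)


# HYPOTHETICAL assay readings grouped by time period.
readings_by_period = {
    "Week 8-12": [2.0, 4.0, 8.0],
    "Week 21-24": [0.25, 0.5, 1.0],
}

for period, values in readings_by_period.items():
    ratio, log2_ratio = median_ratio_to_cutoff(values, cutoff=1.0)
    print(period, ratio, log2_ratio)
# Week 8-12 4.0 2.0
# Week 21-24 0.5 -1.0
```

On the log scale a value of 0 sits exactly at the cut-off, so periods whose median has dropped below positivity appear as negative numbers, which makes the assays comparable despite their different units.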
Developing a ‘pseudo gold-standard’
Use of a previous positive RT-PCR for SARS-CoV-2 viral RNA as a gold standard for the presence of IgG has limitations. We therefore developed a ‘pseudo gold-standard’ based on laboratory immunoassay antibody results, in which a positive result by two or more immunoassays classes a sample as antibody positive (n=348), whilst a negative result on two or more immunoassays classes a sample as negative (n=510). EuroImmun borderline results were excluded from this analysis (n=22). When assessed against the pseudo-gold standard, EuroImmun performed with highest accuracy (Sens. 98.9% [97.7-99.7%]; Spec. 99.2% [98.4-99.8%]; Acc. 99.1% [98.4-99.7%]). Roche showed highest sensitivity and high accuracy (Sens. 99.4% [98.6-100%]; Spec. 96.7% [95.1-98.2%]; Acc. 97.8% [96.7-98.7%]), whilst Abbott performed poorly on sensitivity but had joint-highest specificity (Sens. 86.8% [83.1-90.2%]; Spec. 99.2% [98.4-99.8%]; Acc. 94.2% [92.7-95.7%]).
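For readers who wish to check the point estimates, the Python sketch below computes sensitivity, specificity and accuracy from true-positive and true-negative counts. The counts are an inference: they are back-calculated as the integers consistent with the rounded percentages and the cohort sizes (348 positive, 510 negative), and are not reported in the text itself.

```python
def performance(true_pos, n_pos, true_neg, n_neg):
    """Sensitivity, specificity and accuracy as percentages rounded to 1 d.p."""
    sensitivity = 100.0 * true_pos / n_pos
    specificity = 100.0 * true_neg / n_neg
    accuracy = 100.0 * (true_pos + true_neg) / (n_pos + n_neg)
    return round(sensitivity, 1), round(specificity, 1), round(accuracy, 1)


N_POS, N_NEG = 348, 510  # pseudo gold-standard cohort sizes

# ASSUMED counts, back-calculated from the rounded percentages.
inferred_counts = {
    "EuroImmun": (344, 506),
    "Roche": (346, 493),
    "Abbott": (302, 506),
}

for assay, (tp, tn) in inferred_counts.items():
    print(assay, performance(tp, N_POS, tn, N_NEG))
# EuroImmun (98.9, 99.2, 99.1)
# Roche (99.4, 96.7, 97.8)
# Abbott (86.8, 99.2, 94.2)
```

Under these inferred counts EuroImmun and Abbott each misclassify 4 of the 510 negatives, which is why their specificities are identical.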
Area under the curve indicates best performance by EuroImmun 0.99 (0.984; 0.997), followed by Roche 0.98 (0.972; 0.989), then Abbott 0.93 (0.912; 0.948) (Figure 3).
UK-RTC AbC-19
Using the commercial immunoassays described we established a well characterised serology sample set of ‘known positive’ and ‘known negative’ for IgG antibodies to SARS-CoV-2 to evaluate performance metrics for the UK-RTC AbC-19 Rapid POC LFIA.
This LFIA detects IgG antibodies against the spike protein antigen; we therefore required all positive samples to be positive by the EuroImmun SARS-CoV-2 IgG ELISA, which likewise detects antibodies against the S1 domain14. In line with our ‘pseudo gold standard’ system, samples were also required to be positive by a second immunoassay (Roche or Abbott). To analyse specificity, we tested on the AbC-19 LFIA 350 plasma samples from participants classed as ‘known negative’. All samples were from individuals confirmed to be negative across all three laboratory assays (Roche, EuroImmun, Abbott). Using these positive (n=304) and negative (n=350) antibody cohorts, we determined a sensitivity of 97.70% (95.72%-99.34%) and specificity of 100% (100.00%-100.00%) for the AbC-19 LFIA (Table 1).
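A percentile bootstrap of the kind described in the statistical methods can be sketched in Python as below. This is an illustrative re-implementation, not the pROC routine the authors used, and the count of 297 detected positives out of 304 is an assumption back-calculated from the reported 97.70%. Exact interval bounds depend on the random seed, so none are asserted here.

```python
import random


def bootstrap_proportion_ci(successes, n, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for a proportion."""
    rng = random.Random(seed)
    outcomes = [1] * successes + [0] * (n - successes)
    estimates = []
    for _ in range(n_boot):
        # Resample n outcomes with replacement and record the proportion.
        resample = rng.choices(outcomes, k=n)
        estimates.append(sum(resample) / n)
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# ASSUMED count: 297 of 304 known positives detected (97.70%).
sens_low, sens_high = bootstrap_proportion_ci(297, 304)
print(f"sensitivity 95% CI: {100 * sens_low:.2f}%-{100 * sens_high:.2f}%")

# All 350 known negatives correctly classified.
print(bootstrap_proportion_ci(350, 350))  # (1.0, 1.0)
```

The second call shows why the specificity interval is reported as 100.00%-100.00%: when every sample is correctly classified, every resample is too, so a percentile bootstrap cannot express any uncertainty. An exact or Wilson interval would give a lower bound below 100%.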
When used as intended in POC settings, the AbC-19 LFIA provides binary positive/negative results. However, when assessing LFIA in the laboratory, each test line was scored against the scorecard by three independent researchers (0 negative, 1-10 positive; Figure S2). Compared to quantitative outputs from the Abbott, EuroImmun and Roche assays, the AbC-19 LFIA shows strong correlation (Abbott r=0.86 [p<0.001]; EuroImmun r=0.88 [p<0.001]; Roche r=0.83 [p<0.001]; Figure 4, Figure S5).
Analytical specificity and sensitivity of AbC-19 test
We observed no cross-reactivity on the AbC-19 LFIA across samples with known H5N1 influenza, Respiratory syncytial virus, Influenza A, Influenza B, Bordetella pertussis, Haemophilus influenzae, Seasonal coronavirus NL63 and 229E (n=34 samples, n=8 distinct respiratory pathogens; Table S3). Against a panel of external reference SARS-CoV-2 serology samples, the AbC-19 LFIA detected antibodies with scores commensurate to the EuroImmun ELISA scores (Figure S6, Table S4).
Discussion
Serological antibody immunoassays are an important tool in helping combat the SARS-CoV-2 pandemic. One difficulty faced in validation of antibody diagnostic assays has been access to samples with known SARS-CoV-2 antibody status. As previously described, there is no clear gold standard for reference against which to assess SARS-CoV-2 immunoassays. A positive RT-PCR test has been used previously as a reference standard, although limited by a high rate of false negatives, failure in some cases to develop IgG antibodies (sero-silence or lack of antibody against the same antigenic component of the virus as the immunoassay uses as a capture antigen), or the lack of RT-PCR testing availability early in the pandemic3,5,15. Self-assessment of symptoms for COVID-19 disease is a poor indicator of previous infection, even among healthcare workers16. Asymptomatic individuals may be unaware of infection and others may harbour pre-existing immunity or mount a T cell response.
Our results show strong correlation between all three immunoassays, with shortcomings in the Abbott system output 0.25-1.4 range, as described previously, suggesting an overestimated positive cut-off (Figure 1)13. Our detection of antibodies 140 days after RT-PCR positive status (20 weeks, and beyond in a small number of samples) indicates persistence of IgG antibodies to both the spike protein and nucleocapsid protein, despite typical patterns of antibody decay after acute viral antigenic exposure being rapid17. Where others have reported that SARS-CoV-2 antibodies decline at 90 days, we also noted a statistically significant decline over time, but levels remained detectable at 140 days (Figure 2). We note that IgG levels reach their peak as late as Week 8-12 (Roche ratio 5.45 times threshold cut-off)18, though this may be an artefact of a lower number of participants at earlier timepoints (Figure 2d). Longitudinal studies on SARS-CoV convalescent patients suggest that detectable IgG can still be present as long as 2 years after infection20. Further studies are needed on large cohorts with sequential antibody immunoassays performed on symptomatic and non-symptomatic individuals as well as those with mild and severe COVID-19. This is vital to inform vaccine durability and so-called ‘immune passports’, and to define a protective threshold for anti-SARS-CoV-2 antibodies. This broader detection window supports the use of these immunoassays for seroprevalence screening19.
To assess sensitivity and specificity, we developed a ‘pseudo gold-standard’ against which to analyse assays, which does not rely on a single test as reference. A similar approach was used in a recent seroprevalence study in Iceland, whereby two positive antibody results were required to determine a participant positive15. The sensitivity and specificity we observed for these laboratory immunoassays did not differ greatly from Public Health England evaluations, though EuroImmun performed with higher sensitivity (98.9%) than the reported 72% and Abbott with lower (86.8%) than the reported 92.7%21–23.
Our evaluation of performance metrics for the rapid POC finger-prick AbC-19 LFIA gave 97.7% sensitivity and 100% specificity. Strong correlation was observed in quantitative score between results on all immunoassays, with the highest observed between EuroImmun and AbC-19 LFIA (Figure S5). This is to be expected, given both the AbC-19 LFIA and EuroImmun ELISA detect antibodies against spike protein. For the assessment of immunity to prior natural infection as well as to immunisation, it is important to note that IgG antibodies against SARS-CoV-2 spike protein detected by laboratory-based EuroImmun ELISA and AbC-19 POC LFIA are known to correlate with neutralizing antibodies, which may confer future immunity24,25.
The ‘pseudo gold-standard’ system may artificially raise the threshold for positive sample inclusion, by requiring a positive result by two immunoassays. However, similar issues have been raised when using previous RT-PCR result or definitive COVID-19 symptoms as inclusion criteria, given these will likely skew a cohort towards more severe disease, which may increase immune response5. In the absence of a clear gold standard test, our ‘pseudo-gold standard’ system relies on no single test and instead takes the majority result of three. We observed a high number of positive participants in our cohorts (n=348; 40.9%), a reflection of biases in the recruitment methods, enriching for positive samples.
The validation of the AbC-19 LFIA reported excellent performance metrics, though we note it used plasma from venous blood samples, as opposed to a finger prick blood sample. A matrix study comparing the result from finger prick blood versus plasma is under investigation by this PANDEMIC Study team at Ulster University. When this LFIA was used on our cohort, a number of the positive results scored low, 1/10 (using the score card under laboratory conditions), with a faint test band visible to a trained laboratory scientist but perhaps difficult to identify as positive by the general public conducting a self-test (Figure S6). This faint line may be reflective of the longer time from infection. If this AbC-19 LFIA is to be rolled out as a home-testing kit, it is important to determine if members of the public observe the same results as observed in the laboratory. Again, this is the aim of an ongoing usability study by the PANDEMIC study team at Ulster University.
Lastly, it is important to consider prevalence when interpreting an assay’s sensitivity and specificity, as in a low prevalence scenario even slightly lowered performance metrics (such as the MHRA required 98%) can result in large numbers of false negative and false positive results10,26. Given the high specificity of the AbC-19 LFIA observed in this study, a false positive result is unlikely; however, false negatives may occur. For prevalence studies, this may underestimate true prevalence (though this could be corrected statistically). At an individual level, a false negative result may cause anxiety or over-caution, particularly as we progress through this pandemic and become more informed about SARS-CoV-2 immunity. A positive antibody test may be linked to prevention of future infection following COVID-19 illness or post-vaccination.
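The effect of prevalence can be made concrete with positive and negative predictive values derived from Bayes' rule. The Python sketch below uses 98% sensitivity and 98% specificity (the MHRA figure mentioned above) with a range of prevalences that are assumptions chosen purely for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# ASSUMED prevalences, for illustration only.
for prevalence in (0.01, 0.05, 0.20):
    ppv, npv = predictive_values(0.98, 0.98, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

At 5% prevalence the positive predictive value is 0.049 / (0.049 + 0.019), roughly 72%, so more than a quarter of positive results would be false positives even though the test meets the 98% threshold on both metrics. With specificity at 100%, the false-positive term vanishes and the positive predictive value is 100% at any prevalence.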
Data Availability
Published data may be shared upon request to the corresponding author.
Declaration of interests
At the time of this study TM and JML acted as advisors to CIGA HealthCare, an industrial partner in the UK Rapid Test Consortium. No personal financial reward or remuneration was received for this advisory role. At the time of submission of this manuscript TM and JML no longer held these advisory positions.
All other authors have no potential conflict of interest to report.
Funding statement
Funding for this study was obtained from UK-RTC to cover all laboratory costs incurred.
Acknowledgements
We are extremely grateful to all the people of Northern Ireland who took part in this study and gave blood during the pandemic. We are indebted to the phlebotomists, Geraldine Horrigan, David Hunter and Pamela Taylor, who conducted all the blood draws whilst ensuring the highest possible level of safety to the participants. We are also extremely grateful to Kingsbridge Private Hospital for sponsorship and providing everything needed for blood collection including the clinical space. We acknowledge Dr Tony Byrne for use of the SafeWater laboratory and Professor Gareth Davison for laboratory space and equipment during the pandemic within a locked down University.