Comparative performance of five commercially available serologic assays to detect antibodies to SARS-CoV-2 and identify individuals with high neutralizing titers

Accurate serological assays to detect antibodies to SARS-CoV-2 are needed to characterize the epidemiology of SARS-CoV-2 infection and identify potential candidates for COVID-19 convalescent plasma (CCP) donation. This study compared the performance of commercial enzyme immunoassays (EIAs) to detect IgG or total antibodies to SARS-CoV-2 and neutralizing antibodies (nAb). The diagnostic accuracy of five commercially available EIAs (Abbott, Euroimmun, EDI, ImmunoDiagnostics, and Roche) to detect IgG or total antibodies to SARS-CoV-2 was evaluated from cross-sectional samples of potential CCP donors that had prior molecular confirmation of SARS-CoV-2 infection for sensitivity (n=214) and pre-pandemic emergency department patients for specificity (n=1,102). Of the 214 potential CCP donors, all were sampled >14 days since symptom onset and only a minority had been hospitalized due to COVID-19 (n=16 [7.5%]); 140 potential CCP donors were tested by all five EIAs and a microneutralization assay. When performed according to the manufacturers’ protocol to detect IgG or total antibodies to SARS-CoV-2, the sensitivity of each EIA ranged from 76.4% to 93.9%, and the specificity of each EIA ranged from 87.0% to 99.6%. Using a nAb titer cutoff of ≥160 as the reference positive test (n=140 CCP donors), the empirical area under receiver operating curve of each EIA ranged from 0.66 (Roche) to 0.90 (Euroimmun). Commercial EIAs with high diagnostic accuracy to detect SARS-CoV-2 antibodies did not necessarily have high diagnostic accuracy to detect high nAbs. Some but not all commercial EIAs may be useful in the identification of individuals with high nAbs in convalescent individuals.


INTRODUCTION
performance of commercially available SARS-CoV-2 serologic assays, as most previous studies have been deemed to have a high risk of bias, particularly due to the use of small sample sizes and/or exclusion of specimens from asymptomatic SARS-CoV-2 infections and mild or moderate cases of COVID-19.

Commercial SARS-CoV-2 EIAs may have an additional role in the implementation of COVID-19 convalescent plasma (CCP) therapy programs. 2,7 The FDA recently issued an EUA for CCP therapy. 8 Indeed, observational evidence suggests CCP is likely safe and efficacious, particularly when administered early in the disease process. 9-12 Higher IgG antibody titers to the S1 protein in CCP transfused to COVID-19 patients have been associated with decreased mortality. 12

Ethics statement
This study used stored samples and data from two parent studies that were approved by The Johns Hopkins University School of Medicine Institutional Review Board. All samples were de-identified prior to laboratory testing. Both studies were conducted according to the ethical standards of the Helsinki Declaration of the World Medical Association.

Study specimens
To test the clinical sensitivity of SARS-CoV-2 EIAs, we included stored plasma specimens from a convenience sample of potential CCP donors that were recruited in the Baltimore, MD and Washington, DC area (n=214). 13 Individuals were eligible for enrollment if they had a documented history of a positive molecular assay test result for SARS-CoV-2 infection (confirmed by medical chart review or shared clinical documentation) and met standard self-reported eligibility criteria for blood donation. Demographic information of included CCP donors is shown in Supplemental Table 1. Among included CCP donors, there was a median of 44 days from diagnosis until sample collection (interquartile range, 38-50 days). Although all included CCP donors were symptomatic at the time of SARS-CoV-2 infection, less than 10% had a history of hospitalization due to COVID-19. To test the clinical specificity of SARS-CoV-2 EIAs, we included stored serum specimens from an identity-unlinked serosurvey conducted in 2016 among adult patients attending the Johns Hopkins Hospital Emergency Department (n=1,102). Both parent studies were cross-sectional and no individual contributed multiple specimens. All plasma/serum samples were stored at -80°C until assays were performed.

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

SARS-CoV-2 EIAs
Plasma/serum specimens were analyzed using five commercially available EIAs: the Euroimmun Anti-SARS-CoV-2 ELISA, the Epitope Diagnostics, Inc. (EDI) Novel Coronavirus COVID-19 IgG ELISA Kit, the ImmunoDiagnostics SARS-CoV-2 NP IgG ELISA kit, the Abbott-Architect SARS-CoV-2 IgG chemiluminescent microparticle immunoassay (CMIA) and the Roche Diagnostics Elecsys® Anti-SARS-CoV-2 E-CLIA (Table 1). These commercially available EIAs were selected either because data on performance characteristics for the assay are limited and/or the assay has received an EUA by the FDA. The target antigen for each EIA is the nucleocapsid protein with the exception of the Euroimmun ELISA for which the target antigen is the S1 protein. The Roche assay measures total antibodies to SARS-CoV-2, whereas the others measure only IgG to SARS-CoV-2. EIAs were conducted according to the manufacturers' instructions. The intended use of each EIA is the qualitative detection of antibodies; however, each EIA provides semi-quantitative output normalized by a calibrator. For simplicity, we refer to the normalized continuous output of each EIA as a "ratio" value. The manufacturers' ratio cutoffs to qualitatively indicate seropositivity, indeterminate serostatus, or seronegativity for SARS-CoV-2 antibodies are provided (Table 1). Specimens were tested by each EIA based on sample volume availability and assay kit availability at the time of testing.
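As an illustration of how a normalized ratio value maps to a qualitative result, the following Python sketch applies a lower and an upper ratio cut-off. The function names and the cut-off values (0.8 and 1.1) are hypothetical placeholders rather than values taken from Table 1, and whether a ratio exactly at a boundary counts as positive may differ between kits. The default of treating indeterminate results as seronegative mirrors the primary analysis described under Statistical analysis.

```python
def classify_ratio(ratio, negative_below, positive_from):
    """Return the qualitative serostatus for a normalized EIA ratio.

    Assumption: ratio < negative_below is negative, ratio >= positive_from
    is positive, and anything in between is indeterminate.
    """
    if ratio >= positive_from:
        return "positive"
    if ratio < negative_below:
        return "negative"
    return "indeterminate"


def is_seropositive(status, indeterminate_as_positive=False):
    """Collapse the three-level status to a binary call."""
    if status == "indeterminate":
        return indeterminate_as_positive
    return status == "positive"


# Hypothetical cut-offs (0.8 and 1.1) and ratios, for illustration only.
statuses = [classify_ratio(r, 0.8, 1.1) for r in (0.3, 0.95, 4.2)]
```

Setting `indeterminate_as_positive=True` corresponds to the sensitivity analysis in which indeterminate specimens are counted as positive.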

Microneutralization assay
Plasma nAb titers were quantified against 100 fifty percent tissue culture infectious doses (TCID50) using a microneutralization (NT) assay in VeroE6-TMPRSS2 cells, which has been previously described. 13,17 In brief, plasma was diluted 1:20, followed by subsequent two-fold dilutions. Infectious virus was added to the plasma dilutions at a final concentration of 1x10^4 TCID50/ml. After a 1-hour incubation at room temperature, 100 µL of each sample dilution was added to 6 wells in a 96-well plate of VeroE6-TMPRSS2 cells, 18 and incubated for 6 hours at 37˚C. The inocula were removed from the plate, fresh media was added, and the plate was incubated at 37˚C for 48 hours. The cells were fixed with 4% formaldehyde (in each well), incubated for 4 hours at room temperature, and stained with Naphthol Blue Black (Sigma-Aldrich). We calculated a nAb titer area under the curve (AUC) value for each sample using the exact number of wells protected from infection at every dilution. Samples with no neutralizing activity were assigned a value of one-half the lowest measured AUC.

medRxiv preprint doi: https://doi.org/10.1101/2020.08.31.20184788; this version posted September 2, 2020.

Statistical analysis
The diagnostic accuracy of each EIA to detect IgG or total antibodies to SARS-CoV-2 was examined using CCP donor specimens as reference standard positive and pre-pandemic specimens as reference standard negative. For each EIA, non-parametric, empirical receiver operating characteristic (ROC) curve analysis was performed to calculate the area under the ROC curve (AUROC). This analysis was also done using the manufacturers' cut-offs. Sensitivity (%) was calculated as 100 x (True-Positive/[True-Positive + False-Negative]). Specificity (%) was calculated as 100 x (True-Negative/[True-Negative + False-Positive]). For these analyses, an available-case approach was used for each EIA and indeterminate results were considered to be seronegative. Three separate sensitivity analyses were conducted: (1) we performed head-to-head comparisons, (2) we considered indeterminate specimens as positive, and (3) we excluded indeterminate specimens. Exact binomial (Clopper-Pearson) 95% confidence intervals (CI) were calculated for estimates.

The remaining analyses were conducted in CCP donors that had data for all five EIAs and nAb titers (n=140). The correlation of EIA ratios and nAb AUC values was examined using Spearman's correlation coefficients (ρ) with 95% CIs estimated over 1000 bootstrap iterations. We evaluated four binary cut-offs of the nAb AUC value to indicate "high" nAb titers: ≥20, ≥40, ≥80, and ≥160. For each nAb AUC cut-off, we evaluated the performance of each EIA to discriminate between low and high nAb titers using empirical ROC analysis.

According to the recent EUA for CCP therapy, all CCP donors will be required to be antibody positive for SARS-CoV-2. Thus, we also calculated the positive percentage agreement and negative percentage agreement between each binary nAb threshold and each EIA using the manufacturer's cut-offs originally recommended for SARS-CoV-2 serostatus in the CCP donor population (indeterminates were considered as seronegative).

Statistical analyses were performed in Stata/MP, version 15.2 (StataCorp, College Station, TX) and R statistical software.

analysis using the manufacturers' cutoffs. For the Abbott and Roche assays, the AUROCs were similar in the empirical analysis and the analysis using the manufacturers' cut-offs.

Using the manufacturers' cut-offs, the sensitivity of each EIA to detect SARS-CoV-2 antibodies ranged from 76.4% to 93.9%, whereas the specificity of each EIA ranged from 87.0% to 99.6%. The Abbott and Roche assays had comparable characteristics to each other, with higher point estimates for sensitivity and specificity compared to the ELISAs. Considering indeterminate/borderline specimens as seropositive as opposed to seronegative decreased the specificity of EDI; however, excluding indeterminate/borderline specimens had minimal impact on estimates (Supplemental Table 2). Similar estimates were also obtained in direct comparisons (Supplemental Tables 3, 4).
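For readers who wish to reproduce this type of estimate, the sketch below shows one way to compute sensitivity, specificity, and an exact binomial (Clopper-Pearson) confidence interval using only the Python standard library. It is an independent illustration, not the Stata or R code used in the study; the function names are our own, the interval bounds are found by simple bisection on the binomial distribution function, and the counts in the final lines are hypothetical.

```python
from math import exp, lgamma, log


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    if k < 0:
        return 0.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log(1 - p))
        total += exp(log_pmf)
    return min(total, 1.0)


def _root(func, target, iters=60):
    """Bisection on (0, 1) for func(p) == target, with func increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for x successes out of n."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _root(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else _root(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


def sensitivity_pct(true_pos, false_neg):
    """100 x (True-Positive / [True-Positive + False-Negative])."""
    return 100 * true_pos / (true_pos + false_neg)


def specificity_pct(true_neg, false_pos):
    """100 x (True-Negative / [True-Negative + False-Positive])."""
    return 100 * true_neg / (true_neg + false_pos)


# Hypothetical counts (not study data): 90 of 100 reference positives detected.
sens = sensitivity_pct(90, 10)
sens_ci = clopper_pearson(90, 100)
```

The interval is returned as proportions; multiplying by 100 gives the percentage scale used for sensitivity and specificity.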
It is also notable that among the 140 CCP donor specimens that were tested by all five EIAs, there were 6 (4.3%) specimens that were seronegative (or indeterminate) for SARS-CoV-2 by all five EIAs. The median time from COVID-19 diagnosis for these 6 individuals was 46 days (range, 33-54). Interestingly, there were 2 false-positive specimens of the 500 pre-pandemic specimens tested by both Abbott and Roche (one of which was false-positive on both assays).

Among pre-pandemic samples, there was greater variation in the distribution of ratio values for ELISAs than for the Abbott and Roche assays (Supplemental Figure 1), consistent with the higher specificity observed for the Abbott and Roche assays. For the Abbott, Roche and ImmunoDiagnostics assays, the value of three times the standard deviation above the mean value from all the pre-pandemic samples was below the cutoff used to define a positive sample.

Among the 140 CCP donor specimens, the median nAb AUC value was 60 (interquartile range: 10, 150). The prevalence of nAb AUC ≥20 was 65.7% (n=92), the prevalence of nAb AUC ≥40 was 57.1% (n=80), the prevalence of nAb AUC ≥80 was 45.7% (n=64), and the prevalence of nAb AUC ≥160 was 25.0% (n=35). There were significant positive correlations between nAb AUC values and EIA ratio values for all EIAs examined (Figure 1), but the strongest correlation was observed for the Euroimmun assay (ρ=0.81 [95%CI: 0.74-0.85]) and the weakest correlation was observed for the Roche assay (ρ=0.40 [95%CI: 0.25-0.54]). With "high" nAb titers as the reference positive, there was substantial between-assay variability in the empirical AUROCs of each EIA, but changing the threshold used to define a "high" nAb titer did not substantially impact the AUROCs of a given EIA (Figure 2). For instance, for all four nAb thresholds evaluated, all empirical AUROC point estimates for the Euroimmun assay were ≥0.90, whereas all AUROC point estimates for the Roche assay were <0.75. For the Euroimmun assay and nAb test at a threshold of ≥160, the EIA ratio cut-off with the highest overall percent agreement (86%) was 6.0 (positive percent agreement was 77% and negative percent agreement was 89%).

Table 3 shows the positive percentage agreement (sensitivity) and negative percentage agreement (specificity) of each assay with the four nAb test thresholds when using the EIA manufacturers' cut-offs for seropositivity. All EIAs had a positive percent agreement with "high" nAbs exceeding 90%, regardless of the threshold for high nAbs. However, there was poor negative percentage agreement between each EIA and nAbs. For all EIAs, the negative percentage agreement decreased with increasing threshold for high nAbs.
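The two summary measures used above can be sketched in a few lines of Python. The empirical AUROC is computed here as the proportion of (high nAb, low nAb) specimen pairs in which the high-nAb specimen has the larger EIA ratio, with ties counted as one half, which is equivalent to the Mann-Whitney statistic divided by the number of pairs. The function names, the six ratio and nAb AUC values, and the 1.1 ratio cut-off are hypothetical and chosen only to illustrate the calculation; they are not study data.

```python
def empirical_auroc(ratios_high, ratios_low):
    """Share of (high, low) pairs ranked correctly by the ratio; ties count 1/2."""
    score = 0.0
    for hi in ratios_high:
        for lo in ratios_low:
            if hi > lo:
                score += 1.0
            elif hi == lo:
                score += 0.5
    return score / (len(ratios_high) * len(ratios_low))


def percent_agreement(eia_positive, nab_high):
    """Return (positive, negative, overall) percent agreement.

    nab_high is the reference; both arguments are equal-length lists of bools
    and must contain at least one reference positive and one reference negative.
    """
    pairs = list(zip(eia_positive, nab_high))
    tp = sum(1 for e, r in pairs if e and r)
    fn = sum(1 for e, r in pairs if not e and r)
    tn = sum(1 for e, r in pairs if not e and not r)
    fp = sum(1 for e, r in pairs if e and not r)
    ppa = 100 * tp / (tp + fn)
    npa = 100 * tn / (tn + fp)
    overall = 100 * (tp + tn) / len(pairs)
    return ppa, npa, overall


# Hypothetical EIA ratios and nAb AUC values for six donors (not study data).
ratios = [0.4, 1.5, 2.0, 3.1, 6.5, 8.0]
nab_auc = [10, 20, 40, 160, 320, 160]
high = [v >= 160 for v in nab_auc]  # "high" nAb defined as AUC >= 160

auroc = empirical_auroc([r for r, h in zip(ratios, high) if h],
                        [r for r, h in zip(ratios, high) if not h])
# Hypothetical seropositivity cut-off of 1.1 applied to the ratio.
ppa, npa, overall = percent_agreement([r >= 1.1 for r in ratios], high)
```

In this toy example the ratio separates high from low nAb specimens perfectly, yet a seropositivity cut-off placed well below the high-nAb range yields full positive agreement and low negative agreement, the same qualitative pattern reported in Table 3.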

DISCUSSION
We observed substantial variability in the performance characteristics of five commercially available EIAs for the detection of antibodies to SARS-CoV-2 and detection of high nAb titers in convalescent individuals. The Roche and Abbott assays had high diagnostic accuracy for the detection of antibodies against SARS-CoV-2. However, the Roche assay ratios weakly correlated with nAb titers and poorly identified persons with high nAb titers. In contrast, the Euroimmun assay ratios had the highest correlations with nAb titers and high discriminative capacity for detecting high nAbs. This variability in assay performance should be considered when selecting an EIA to detect antibodies against SARS-CoV-2 and/or high nAbs among recovered persons.

Consistent with our findings, there is growing evidence that both the Abbott and Roche assays have comparable performance characteristics that are often superior to many other commercially available ELISAs to detect IgG or total antibodies against SARS-CoV-2 in convalescent individuals. 19,20 Although we did not include "challenge" specimens to examine potential cross-reactivity of antibodies to other pathogens, others have shown limited evidence of cross-reactivity for the Euroimmun, EDI, Roche, and Abbott assays. 19,21-25 Data on the performance of the ImmunoDiagnostics ELISA to detect SARS-CoV-2 antibodies are limited.

Large public health laboratories and large blood collection centers often rely on automated serological platforms, such as those by Roche and Abbott, for screening of multiple pathogens including SARS-CoV-2. While our data support the use of Roche and Abbott to detect SARS-CoV-2 antibodies, their utility to detect high nAbs in CCP donors is less clear. Similar to prior reports, we observed varying degrees of positive correlations between commercial EIA ratios and neutralizing titers. 14,26 It is perhaps unsurprising that the Euroimmun ELISA ratios correlated best with nAb titers since it detects S1-specific antibodies (a subset of which are responsible for virus neutralization), while the other assays we assessed detect N-specific antibodies which lack virus neutralization activity. Accordingly, our empirical ROC analysis also indicates Euroimmun may have better performance in discriminating high nAb titers, as compared to the Abbott and Roche assays. Interestingly, using the manufacturer's cut-off, Jaaskelainen et al. found the Abbott assay had greater positive and negative percent agreement with nAb activity than the Euroimmun assay. 27 In our study, the Abbott assay was also better able to discriminate high nAbs than the Roche assay, which is in contrast to a study by Tang et al. that found similar performance between the Abbott and Roche assays. 28 However, similar to Tang et al., we found that applying the manufacturer's cutoffs for the commercial EIAs (including Euroimmun) led to suboptimal negative percentage agreement with high nAbs near the FDA recommended nAb titer cut-off of 1:160. 28 Larger comparative studies are needed to determine the optimal EIA and cut-off to discriminate nAb levels in convalescent donors, including other promising EIAs that were not included in these evaluations. 29

This study has limitations. First, the data were cross-sectional, so we were unable to capture the influence of longitudinal antibody dynamics on diagnostic accuracy.
Second, there were several types of specimens that were not included in the evaluation, such as samples from early in SARS-CoV-2 infection (e.g., <14 days post-symptom onset), samples from individuals who were asymptomatic when infected with SARS-CoV-2, and samples from convalescent individuals who were infected >6 months ago, all of which could potentially influence our estimates of assay sensitivity. Third, the samples used to examine assay specificity were not well-characterized due to the identity-unlinked design of the JHHED serosurvey. However, given that we used samples from patients in an inner-city emergency department that delivers primary care to the local underserved community, several included patients were likely seeking care for viral respiratory illnesses. Finally, the samples evaluated were primarily from the Baltimore-Washington D.C. region, and results may not be generalizable elsewhere.

Implementation of the appropriate EIAs to detect SARS-CoV-2 antibodies will require careful consideration of the inferential purpose (e.g., individual- vs. population-level inference), context (e.g., prevalence in target population), operational feasibility (e.g., high-throughput platform vs. manual ELISA) and the underlying test performance characteristics of the assays. Although the output ratio results for commercially available EIAs correlate with nAb titers, EIA ratios should not be universally considered a surrogate for nAb titers. This is particularly relevant for programs that are currently scaling CCP therapy per new FDA guidelines. Ratios from some commercial EIAs, however, may help inform prediction models that can also incorporate other predictors of high nAb titers. These models could prove useful in the identification of optimal CCP donors in the absence of accurate and reliable high-throughput tests for nAb titers.

Four thresholds for a high nAb AUC value were examined as the reference positive test.
