Estimating the false positive rate of highly automated SARS-CoV-2 nucleic acid amplification testing

Molecular testing for infectious diseases is generally both very sensitive and specific. Well-designed PCR primers rarely cross-react with other analytes, and specificities seen during test validation are often 100%. However, analytical specificities measured during validation may not reflect real-world performance across the entire testing process. Here, we use the unique environment of SARS-CoV-2 screening among otherwise well individuals to examine the false positivity rate of high-throughput, so-called 'sample-to-answer' nucleic acid amplification testing (NAAT) on three commercial assays: the Hologic Panther Fusion®, Hologic Aptima® transcription mediated amplification (TMA), and Roche cobas® 6800. We used repetitive sampling of the same person, rather than retesting of the same sample, as the gold standard to determine test specificity. We examined 451 people repetitively sampled over 7 months via nasal swab, comprising 7,242 results. During the study period there were 12 positive tests (0.17%) from 9 people. Eight positive tests (0.11%, five individuals) were considered bona fide true positives based on repeat positives or outside testing and epidemiological data. One positive test had no follow-up testing or metadata and could not be adjudicated. Three positive tests (three individuals) did not repeat as positive on a subsequent collection, nor did the original positive specimen test positive on an orthogonal platform. We consider these three tests false positives and estimate the overall false positive rate of high-throughput automated, sample-to-answer NAAT to be approximately 0.041% (3/7,242). These data help laboratorians, epidemiologists, and regulators understand the specificity and positive predictive value associated with high-throughput NAAT.


During assay validation, clinical tests are specifically interrogated for their sensitivity and specificity. (1) Sensitivity is defined by the ability of the test to return a positive result in the presence of an analyte, while specificity indicates the test's ability to return a negative result in the absence of that analyte. While both are important, test specificity can be an especially critical parameter for screening tests, where the vast majority of persons are negative. This fact has been well appreciated in infectious disease serological testing for viruses such as human immunodeficiency virus (HIV) or hepatitis C virus (HCV), but has not been interrogated in great detail for infectious disease nucleic acid amplification testing (NAAT), as the cost of molecular testing has restricted its use in screening. In addition, there is little commercial incentive to advertise a laboratory's false positivity rate, even though both laboratorians and physicians understand that false positives can occur at any point in the testing process.
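The reason specificity matters so much in screening can be illustrated with a short calculation of positive predictive value (PPV) from sensitivity, specificity, and prevalence. The sketch below is illustrative only: the function name and the example inputs (95% sensitivity, 99.9% specificity, and prevalences of 0.1% and 10%) are assumptions chosen for demonstration and are not values measured in this study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Return the fraction of positive results that are true positives.

    All three arguments are proportions between 0 and 1.
    """
    true_positive_fraction = sensitivity * prevalence
    false_positive_fraction = (1.0 - specificity) * (1.0 - prevalence)
    total_positive_fraction = true_positive_fraction + false_positive_fraction
    if total_positive_fraction == 0:
        raise ValueError("No positive results are expected with these inputs.")
    return true_positive_fraction / total_positive_fraction


if __name__ == "__main__":
    # Hypothetical inputs, not taken from the study.
    for prevalence in (0.001, 0.10):
        ppv = positive_predictive_value(0.95, 0.999, prevalence)
        print(f"prevalence={prevalence:.3f}  PPV={ppv:.3f}")
```

With the same hypothetical test, the PPV is far lower in the low-prevalence setting than in the high-prevalence one, which is why even a small false positive rate becomes important when nearly everyone screened is negative.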

The performance of molecular infectious disease testing has generally been evaluated based on analytical sensitivity, since analytical specificities of most NAAT, including polymerase chain reaction (PCR) tests, during validation are nearly always 100%. With the widespread availability of genomic data today, it is generally facile to specifically design PCR primer and probe sets to avoid cross-reaction with other analytes. While several SARS-CoV-2 primer sets were initially designed to specifically cross-react with SARS-CoV or potentially other

Retesting a positive specimen on an orthogonal testing platform may help determine true analytical positivity if the second test is positive, but a negative test may not be specifically informative if the viral load is at the analytical limit of detection (LoD). An alternative method to determine false positivity would be to collect a new specimen from the individual, though this approach is still subject to sampling stochasticity, especially at the LoD.

medRxiv preprint, doi: https://doi.org/10.1101/2021.04.25.21254890; this version posted April 27, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

The SARS-CoV-2 pandemic creates an intriguing use case to examine the overall false positivity rate associated with large-scale NAAT, as millions of tests were performed daily across the United States in 2020-21. Many employers, such as technology companies or sports teams, specifically tested their employees every day or every other day to prevent widespread transmission within their organization. Our clinical laboratory performed almost half of testing within Washington State and offered repeat longitudinal testing to several employer groups that had exceptionally low positivity rates. Here, we use these data to specifically interrogate the false positivity rate associated with the high-throughput, so-called "sample-to-answer" assays employed in our laboratory: the Hologic Panther Fusion, the Hologic Aptima transcription mediated amplification (TMA), and the Roche cobas 6800.

Description of cohort and testing

This study was approved by the University of Washington Institutional Review Board.

Three different, high-throughput sample-to-answer platforms for SARS-CoV-2 testing were used

A total of 451 people repetitively tested over a seven-month period from May to November 2020 were included in the analysis. The median age of those tested was 27 years (interquartile range [IQR], 23-33 years). A total of 7,242 SARS-CoV-2 tests were performed on the high-throughput assays during the study period; all specimens were nasal swabs. The median number of tests per individual was 10 (IQR 6-16) and the median interval between consecutive tests was 2.1 days (IQR 2-5 days). There were 12 positive (detected) results during the study period (0.17%) from 9 people (Figure 1 and Table 2). There was one [...] mean time to subsequent collection was 3 days (range 1 to 7 days). Two of the false positive cases (Cases 2 and 3) did not have a preceding result in our system, but did have repetitive testing at an outside laboratory that was negative per report. We estimate the overall false positive rate of high-throughput automated molecular testing to be approximately 0.041% (3/7,242).
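The rates quoted above follow directly from the reported counts, and the arithmetic can be reproduced with a short sketch. The counts (7,242 tests, 12 positives, 3 false positives) come from the text; the Wilson score interval is an addition for illustration and is not an analysis reported by the authors, and the function names are hypothetical. Note that the estimate uses all 7,242 tests as the denominator, mirroring the 3/7,242 calculation in the text.

```python
import math


def rate(events, trials):
    """Return events / trials as a proportion."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    return events / trials


def wilson_interval(events, trials, z=1.96):
    """Return an approximate 95% Wilson score interval for a proportion.

    This interval is an illustrative assumption; the source reports only
    point estimates.
    """
    p_hat = rate(events, trials)
    denominator = 1.0 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denominator
    half_width = (
        z
        * math.sqrt(p_hat * (1.0 - p_hat) / trials + z**2 / (4 * trials**2))
        / denominator
    )
    return max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    total_tests = 7242      # from the text
    positives = 12          # from the text
    false_positives = 3     # from the text

    print(f"positivity rate:     {100 * rate(positives, total_tests):.2f}%")
    print(f"false positive rate: {100 * rate(false_positives, total_tests):.3f}%")
    low, high = wilson_interval(false_positives, total_tests)
    print(f"Wilson interval:     {100 * low:.3f}% to {100 * high:.3f}%")
```

Because only three false positives were observed, any interval around the point estimate is wide relative to the estimate itself, which is worth keeping in mind when applying the figure to other laboratories.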

A total of five cases were interpreted as true positives. Two positive cases had static or decreasing Ct values on consecutive tests performed in our laboratory (Cases 7 and 8, Figure 2). One case (Case 1) had a positive test result that was followed by a repeat positive test performed at an outside laboratory a day later (no Ct value available). The fourth case (Case 4) was interpreted as a clinical true positive, as the patient developed symptoms within one day of testing and had close contact with a person known to have COVID-19; no subsequent testing was performed on this individual. Lastly, Case 5 had a positive result following an inconclusive result, which is consistent with detecting an early infection and which we count as a true positive. All of the true positive results were positive using the same specimen on an orthogonal platform.

The average Ct value of the five true positives for which these data were available on the primary testing platform was 29.5 and the mean time to subsequent collection was 2.97 days (range 1 to

Repeat collection and testing of another specimen, ideally within a short (several day) timeframe, can also serve to identify false positive results.

There are several limitations to the work presented here. This study was performed at a single site and our measured false positivity rate is likely dependent on our own testing processes. This analysis only covers high-throughput automated platforms. Other PCR methods that use a 96-well plate format, which involves sample transfers and manipulation, will likely experience a higher false positivity rate. The individuals tested here were not uniformly sampled