Abstract
Large-scale testing for SARS-CoV-2 by the reverse transcription polymerase chain reaction is a key part of the response to the COVID-19 pandemic, but little attention has been paid to the potential frequency and impacts of false positive results. In the absence of data on the clinical specificity of SARS-CoV-2 assays, we estimate a conservative false positive rate from external quality assessments of similar viral assays, and show that this rate may have large impacts on the reliability of positive test results. This has clinical and case management implications, affects an array of epidemiological statistics, and should inform the scale of testing and the allocation of testing resources. Measures to raise awareness of false positives, reduce their frequency, and mitigate their effects should be considered.
Testing for SARS-CoV-2 employs the reverse transcription polymerase chain reaction (RT-PCR) to detect diagnostic sequences in the virus's RNA genome. The results are used to diagnose and determine the treatment of patients, to sequester individuals, to allocate the use of personal protective equipment, and to direct contact tracing. At the policy level, test results are used to model rates of spread, calculate hospitalization and death rates, assess the number and significance of asymptomatic carriers, shape public policy on social distancing, and inform decisions about the spatial and temporal allocation of medical resources. Test accuracy is thus of paramount importance, yet little attention has been paid to the potential frequency and impacts of false positive results (Supplementary Text). Although SARS-CoV-2 RT-PCR assays are widely reported to have 100% specificity (Table S1), this refers to the absence of cross-reactivity with non-target genetic material (analytical specificity), not to the potential for incorrect results in the real-world implementation of testing (clinical specificity), where contamination or human error can generate false positives during sample collection, transport, or analysis.
External quality assessments (EQAs) test the implementation of medical diagnostic assays by providing participating laboratories with blind panels of positive and negative samples. The laboratories assay these panels using their normal procedures and report the results to the EQA manager, who compiles and analyzes the results. Since external quality assessments have not yet been performed on SARS-CoV-2 assays, and in the absence of any other data on the clinical specificity of these assays, we conducted a meta-analysis to determine the range of false positive rates (FPRs) in EQAs of similar assays (see Methods). We compiled data and calculated FPRs for 43 EQAs of RT-PCR assays of RNA viruses, conducted between 2004 and 2019. Each EQA involved between three and 174 laboratories, which together provided results for 4,113 blind panels containing 10,538 negative samples (Table 1). Though the reported information on the laboratories' methods was incomplete, it appears that 99.8% of the panels were assayed with RT-PCR and 0.2% with other nucleic acid amplification-based methods (RT-loop-mediated isothermal amplification or RT-recombinase polymerase amplification). In EQAs where the information was provided, 17% of the RT-PCR assays were conventional and 83% were real-time.
In all, 336 of the 10,538 negative samples (3.2%) were reported as positive. We considered two data sets: all 43 EQAs (full data set), and the 37 EQAs that analyzed at least 100 negative samples (subset). FPRs in individual EQAs ranged from 0-16.7% for the full data set and 0-8.1% for the subset. There was no significant correlation between FPR and year for the full data set (n=43, r=0.147, p=0.346), and a weak downward correlation for the subset (n=37, r=0.327, p=0.056) (Figure S1). The median and interquartile range were lower for the full data set (median=2.3%, interquartile range=0.8-4.0%) than for the subset (median=2.5%, interquartile range=1.2-4.0%) (Figure S2).
We conservatively used the lower of the 25th percentile FPRs from the two data sets in our models. This FPR value (0.8%) is conservative in two further respects: it does not include false positives produced during sampling (1), and it does not account for any increased error rate stemming from the rapid expansion of SARS-CoV-2 testing and the use of novel diagnostic assays (2). Published analyses of SARS-CoV-2 testing indicate a range of false negative rates (FNRs) from 0% to 52.2% (Table S2). We used an FNR of 25% as input to the model; sensitivity analyses performed over the range of 0-50% showed little effect on the reliability of positive results (Figures S2-S6).
Even a low FPR reduces the reliability of positive results when prevalence is low. This is apparent in model results based on test data for 50 U.S. states and 77 countries (Figures 1A, 2A), which suggest that some testing is targeted too broadly to be useful in regions with low test positivities (toward the right side of the panels). However, some elements of a testing program may be of value even if the average reliability of positive results in a region is extremely low. Modeling disaggregated data (for example, separately modeling data for diagnostic, screening and surveillance testing) could guide the reallocation of test resources from program elements whose positive results are too unreliable to be useful into more focused and reliable testing.
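The dependence of reliability on test positivity can be illustrated with a minimal sketch. It is not the model described in the Methods; it assumes only that observed test positivity is the sum of true positives and false positives, so that positivity = prevalence x (1 - FNR) + (1 - prevalence) x FPR. The function names are hypothetical, and the defaults are the FPR (0.8%) and FNR (25%) used in the text.

```python
def implied_prevalence(positivity, fpr=0.008, fnr=0.25):
    """Prevalence among those tested that is implied by the observed positivity.

    Assumes positivity = prev * (1 - fnr) + (1 - prev) * fpr (an illustrative
    assumption, not necessarily the authors' exact model).
    """
    if not 0.0 < positivity <= 1.0:
        raise ValueError("positivity must be in (0, 1]")
    if 1.0 - fnr - fpr <= 0.0:
        raise ValueError("fnr + fpr must be less than 1")
    prev = (positivity - fpr) / (1.0 - fnr - fpr)
    # Positivity below the FPR implies no true infections under this model.
    return min(max(prev, 0.0), 1.0)


def positive_reliability(positivity, fpr=0.008, fnr=0.25):
    """Share of positive results that are true positives (positive predictive value)."""
    prev = implied_prevalence(positivity, fpr, fnr)
    return min(prev * (1.0 - fnr) / positivity, 1.0)


if __name__ == "__main__":
    # Reliability falls toward zero as positivity approaches the FPR.
    for positivity in (0.20, 0.10, 0.05, 0.02, 0.01, 0.008):
        print(f"positivity {positivity:.3f} -> reliability {positive_reliability(positivity):.3f}")
```

Under these assumptions, reliability is high when positivity is many times the FPR and reaches zero when positivity equals the FPR.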
Modeling daily test data reveals the trajectories of test program reliability (Figures 1B-D, 2B-C). Some regions continue to test broadly even as test positivity drops to low levels and the modeled reliability of positive results approaches or reaches zero. In Montana, for example, the model suggests that after around April 25 most positive results, especially for asymptomatic individuals, were probably false positives, but an average of nearly 3,000 tests continued to be conducted in the state each day. South Korea arrived at a similar point by April 7 but continued to conduct over 6,000 tests a day.
The reliability of positive results dropped to near zero in these cases when test positivity approached the estimated FPR. However, even at positivities up to around four times the FPR, over 20% of positive results are likely to be false positives. Unless other respiratory diseases are pervasive in a test population, most of these false positive individuals would likely be asymptomatic, which could at least partially explain reports of large numbers of asymptomatic carriers of SARS-CoV-2.
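The relationship between positivity and the share of false positives can be made explicit with a short worked calculation. It assumes, as an illustration rather than a restatement of the Methods, that test positivity p is the sum of true and false positives among those tested, with prevalence π among those tested, and it uses the FPR (0.8%) and FNR (25%) given above. The symbols p, π and k are introduced here for convenience.

```latex
\begin{align*}
p &= \pi\,(1-\mathrm{FNR}) + (1-\pi)\,\mathrm{FPR}, \\
\pi &= \frac{p-\mathrm{FPR}}{1-\mathrm{FNR}-\mathrm{FPR}}, \\
\text{false share} &= \frac{(1-\pi)\,\mathrm{FPR}}{p} = \frac{1-\pi}{k}
  \quad\text{when } p = k\,\mathrm{FPR}, \\
k=4:\quad \pi &= \frac{0.032-0.008}{1-0.25-0.008} \approx 0.032,
  \qquad \text{false share} \approx \frac{0.968}{4} \approx 0.24 .
\end{align*}
```

Because π is small at low positivity, the false share is close to 1/k: it equals 1 when positivity equals the FPR (k = 1), and under these assumptions it falls below 20% only as positivity approaches five times the FPR.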
False positives also affect the interpretation of individual test results. Statements from health agencies and officials suggest that positive results from SARS-CoV-2 tests are more trustworthy than negative results (3, 4). However, over a wide range of likely scenarios, the opposite is true (see Figures 1, 2, S3-S6, wherever positive predictive value is less than negative predictive value). The FPR acts on samples from the uninfected fraction of the population, returning positive results for some uninfected individuals, while the FNR acts on the infected fraction, returning negative results for some infected individuals. When prevalence is low, the uninfected fraction is much greater than the infected fraction, so that even a low FPR can have a larger effect than a high FNR.
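The asymmetry between the two error rates can be sketched numerically. The following minimal example applies the standard definitions of positive and negative predictive value to a given prevalence; the function name is hypothetical, the default FPR and FNR are those used in the text, and the prevalence values are arbitrary illustrations rather than estimates for any region.

```python
def predictive_values(prevalence, fpr=0.008, fnr=0.25):
    """Return (PPV, NPV) for a test population with the given prevalence."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be in [0, 1]")
    true_pos = prevalence * (1.0 - fnr)           # infected, test positive
    false_neg = prevalence * fnr                  # infected, test negative
    false_pos = (1.0 - prevalence) * fpr          # uninfected, test positive
    true_neg = (1.0 - prevalence) * (1.0 - fpr)   # uninfected, test negative
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Illustrative prevalence values only.
    for prevalence in (0.005, 0.01, 0.05, 0.20, 0.50):
        ppv, npv = predictive_values(prevalence)
        print(f"prevalence {prevalence:.3f}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

With these inputs, a positive result is less trustworthy than a negative result at low prevalence, and the ordering reverses only when a substantial fraction of those tested are infected.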
The false positives reported in the EQAs were probably due to contamination either from a positive sample analyzed at the same time (cross-contamination), or more likely from genes amplified from prior positive samples or positive controls (carryover contamination). The amplification of nucleic acids makes PCR-based assays highly sensitive, but also highly vulnerable to minute levels of sample contamination, which can produce false positives that are indistinguishable from true positives. False positives can also be produced by sample mix-ups (5) or data entry errors.
Some non-EQA studies have reported false positives in RT-PCR assays for other coronaviruses (6, 7), and there is some evidence for false positives specifically in SARS-CoV-2 assays. Four studies conducting sensitivity or cross-reactivity assessments on SARS-CoV-2 RT-PCR assays reported false positives when negative samples were tested, presumably due to contamination in those laboratories (Tables S3, S4). Several cases of false positives in regular SARS-CoV-2 testing have been reported in the media (Supplementary Text).
Though numerous studies have addressed FNRs in SARS-CoV-2 RT-PCR testing (Table S2), we found virtually no discussion of FPRs in these tests in the scientific or medical literature (Supplementary Text). False positive results, especially when unanticipated, have clinical and case management consequences, including waste of personal protective equipment, waste of human resources in contact tracing, and unnecessary sequestering of uninfected individuals, sometimes along with infected individuals (8). A false positive test result could impede a correct diagnosis, delaying appropriate treatment or depriving patients of it. False-positive patients introduce noise into clinical observations, which may hinder the development of improved COVID-19 medical care based on clinical experience. If antibody or antiviral treatments become available for COVID-19 patients, or prophylactic treatments for asymptomatic or mildly symptomatic individuals with positive test results, false-positive individuals could be subjected to medically inappropriate treatments (9). Individuals who have falsely tested positive may be less likely to avoid future exposure to infected individuals, believing they have immunity, and for the same reason may not seek vaccination when it becomes available. Clinical trials could lose statistical power by unwittingly enrolling false-positive individuals, who would be exposed to potentially harmful side effects without any mitigating potential for benefit. False positives also affect an array of epidemiological statistics, including the asymptomatic ratio.
The impact of false positives in SARS-CoV-2 testing could be mitigated by increasing awareness of the probability of false positives; by improving estimates of false positive rates with appropriately designed external quality assessments, or assessing results retrospectively with serological tests; and by reducing the frequency of false positives by requiring two independent positive tests to classify an individual as positive. In previous disease outbreaks the World Health Organization and the U.S. Centers for Disease Control and Prevention warned about the potential for false positives in RT-PCR testing, restricted testing to individuals most likely to have the disease to avoid generating excessive false positives, and required confirmation of positive results by a second test (10, 11). These warnings and requirements are absent from these organizations' guidance on SARS-CoV-2 (12, 13). Perhaps they should be reinstated.
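The effect of requiring a confirmatory test can be sketched under a strong simplifying assumption: that the errors of the two tests are statistically independent. This is a best case, since contamination within one laboratory or a persistently low viral load could produce correlated errors. The function names are hypothetical, the default FPR and FNR are those used in the text, and the 1% prevalence is an arbitrary illustration.

```python
def confirmed_error_rates(fpr=0.008, fnr=0.25):
    """Error rates when an individual is classified positive only if two
    tests are both positive, assuming independent errors (illustrative)."""
    combined_fpr = fpr ** 2                    # both tests falsely positive
    combined_fnr = 1.0 - (1.0 - fnr) ** 2      # at least one test misses the infection
    return combined_fpr, combined_fnr


def ppv(prevalence, fpr, fnr):
    """Positive predictive value for a given prevalence and error rates."""
    true_pos = prevalence * (1.0 - fnr)
    false_pos = (1.0 - prevalence) * fpr
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    prevalence = 0.01  # arbitrary illustrative value
    single = ppv(prevalence, 0.008, 0.25)
    fpr2, fnr2 = confirmed_error_rates()
    double = ppv(prevalence, fpr2, fnr2)
    print(f"single test:    PPV {single:.3f}")
    print(f"two positives:  PPV {double:.3f} (combined FNR {fnr2:.4f})")
```

Under the independence assumption, confirmation sharply reduces the false positive rate at the cost of a higher false negative rate, a trade-off that would need to be weighed for each element of a testing program.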
Data Availability
All relevant data are included in the text or supplementary material.
Supplementary Materials
Methods
Supplementary Text
Figures S1-S6
Tables S1-S5
References
Acknowledgements
We thank Michael Milgroom, Elisa Liberti, Dominic Chow and Thomas Taylor for their assistance. Author contributions: AC designed and conducted the analyses. AC and BK collaborated on the review of clinical implications and writing the report. Competing interests: The authors have no competing interests. Data and materials availability: All data are available in the manuscript or supplementary materials.