ABSTRACT
Background Cases with negative reverse transcription-polymerase chain reaction (RT-PCR) results at initial testing for suspicion of SARS-CoV-2 infection, and found to be positive in a subsequent test, are considered as RT-PCR false-negative cases. False-negative cases have important implications for COVID-19 management, isolation, and risk of transmission. We aimed to review and critically appraise evidence about the proportion of RT-PCR false-negatives at initial testing for COVID-19.
Methods We performed a systematic review and critical appraisal of the literature with high involvement of stakeholders in the review process. We searched MEDLINE, EMBASE, LILACS, the WHO database of COVID-19 publications, the EPPI-Centre living systematic map of evidence about COVID-19, and the living systematic review developed by the University of Bern (ISPM). Two authors screened and selected studies according to the eligibility criteria and collected data from included studies (non-independent verification). Risk of bias was assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. We calculated the false-negative proportion with the corresponding 95% CI using a multilevel mixed-effect logistic regression model in Stata 16®. Certainty of the evidence about false-negative cases was rated using the GRADE approach for tests and strategies. The information is current up to 6 April 2020.
Findings Five studies enrolling 957 patients were included. All studies were affected by several biases and applicability concerns. The pooled estimate of the false-negative proportion was 0.085 (95% CI 0.034 to 0.196; tau-squared = 1.08, 95% CI 0.27 to 8.28; p<0.001); however, this estimate is highly affected by unexplained heterogeneity, and its interpretation should be avoided. The certainty of the evidence was judged as very low, due to the risk of bias, indirectness, and inconsistency issues.
Conclusions The collected evidence has several limitations, including risk of bias issues, high heterogeneity, and concerns about its applicability. Nonetheless, our findings reinforce the need for repeated testing in patients with suspicion of SARS-CoV-2 infection given that up to 29% of patients could have an initial RT-PCR false-negative result.
Systematic review registration Protocol available on OSF website: https://osf.io/gp38w/
BACKGROUND
On December 31, 2019, the World Health Organization (WHO) was alerted about a cluster of pneumonia patients in the city of Wuhan, in China’s Hubei province [1]. A week later, Chinese authorities confirmed the outbreak of a novel coronavirus currently called Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) [2]. This new virus is the underlying cause of Coronavirus Disease 2019 (COVID-19), which has become a worldwide public health emergency and reached pandemic status [3]. At the time of writing, the virus has spread to 212 countries and territories and has caused over 85,837 deaths worldwide [4].
Patients with COVID-19 exhibit respiratory symptoms such as fever, cough, and shortness of breath as primary manifestations [5, 6]. Although most cases present mild symptoms, some have developed pneumonia, severe respiratory disease, kidney failure and even death [7-9]. SARS-CoV-2 mainly spreads through person-to-person contact via respiratory droplets from coughing and sneezing, and through surfaces that have been contaminated with these droplets [10]. Recent studies have suggested the presence of asymptomatic cases in family clusters, with possible transmission of the virus before the carrier displays any symptoms [11].
Because the signs of infection mentioned above are non-specific, confirmation of cases is currently based on the detection of a viral sequence by reverse transcription-polymerase chain reaction (RT-PCR). Different RT-PCR schemes have been proposed; all of them include the N gene that codes for the viral nucleocapsid. Alternative targets are the E gene (viral envelope), the S gene (spike), and the Hel gene (RNA polymerase, RdRp/Helicase) [12, 13]. Molecular criteria for in vitro diagnosis of COVID-19 are heterogeneous, and usually require the detection of two or more genes of SARS-CoV-2 [14].
Repeated RT-PCR testing might be required to confirm a clinical diagnosis, especially in the presence of symptoms closely related to COVID-19 [15]. Cases with negative RT-PCR results at initial testing that are later found to be positive in a subsequent test are commonly considered cases with an initial false-negative result. Some researchers have suggested that these failures in SARS-CoV-2 detection are related to multiple pre-analytical and analytical factors, such as lack of standardisation in specimen collection, the time and conservation of samples until they are received in the laboratory, the use of inadequately validated assays, contamination during the procedure, insufficient viral specimens and load, the incubation period of the disease, and the risk of active recombination and mutation [14, 16].
The availability of accurate laboratory tools for COVID-19 is essential for case identification, contact tracing, and optimization of infection control measures, as shown by previous epidemics caused by SARS-CoV and Middle East respiratory syndrome coronavirus (MERS-CoV) [17-19]. Because the COVID-19 pandemic is placing an important burden on health systems around the globe, and considering that a missed COVID-19 case might have severe consequences at several levels, we aimed to estimate, through a systematic review of the literature, the proportion of false-negatives related to the detection of SARS-CoV-2 using RT-PCR assays at the initial laboratory test.
METHODS
We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) for diagnostic test accuracy (DTA) to report this review [20]. For the development of this systematic review of the literature, we used selected methods from rapid reviews, such as a high involvement of stakeholders in the review process (including the definition of the review question), a non-independent verification of data selection and extraction, and parallelisation of tasks (that is, performing selected activities simultaneously instead of consecutively). Other review shortcuts and omission of review tasks were not applied. A protocol of this review was published in the Open Science Framework repository for public consultation (https://osf.io/gp38w/).
Criteria for considering studies for this review
We included observational studies (including accuracy studies, cohorts, and case series) reporting the initial use of RT-PCR for the detection of SARS-CoV-2 RNA in patients under suspicion of infection by clinical or epidemiological criteria. Specifically, we prioritised studies enrolling consecutive patients who were receiving RT-PCR as initial testing with further confirmation of SARS-CoV-2 infection and/or COVID-19 diagnosis (positive/negative). We did not impose limits by age, gender, or study location.
We aimed to include all types of RT-PCR kits, regardless of the brand/manufacturer, the RNA extraction method used, the number of target gene assays assessed and cycle threshold value for positivity. Studies comparing the accuracy of two or more tests for COVID-19 diagnosis were also considered if we could abstract the fraction of negative test results as defined by an initial RT-PCR assay.
We excluded studies without clear information about false-negative cases, the number of final confirmed cases, or an unclear verification of negative cases. Case reports, studies based on laboratory samples, and literature reviews were also excluded.
Search methods for identification of studies
We carried out a comprehensive and sensitive search strategy based on the proposal for the living systematic review developed by the University of Bern’s Institute of Social and Preventive Medicine-ISPM in the following databases:
MEDLINE (Ovid SP, 1946 to April 6th, 2020)
Embase (Ovid SP, 1982 to April 6th, 2020)
LILACS (iAH English) (BIREME, 1982 to April 6th, 2020)
We did not apply any language restrictions to electronic searches (S1 Appendix). As additional sources of potential studies, we searched repositories of preprint articles (such as medRxiv), clinical trials registries for ongoing or recently completed trials (clinicaltrials.gov, the World Health Organization’s International Clinical Trials Registry Platform, and the ISRCTN Registry), and the reference lists of all relevant papers. Finally, we also screened the following resources for additional information:
The WHO Database of publications on coronavirus disease (COVID-19). Available on https://www.who.int/emergencies/diseases/novel-coronavirus-2019/global-research-on-novel-coronavirus-2019-ncov.
The Living systematic map of the evidence about COVID-19 produced by EPPI-Centre.
The Living systematic review developed by the Institute of Social and Preventive Medicine-ISPM from the University of Bern available on https://ispmbern.github.io/covid-19/
Data collection and analysis
For the selection of potential studies, one reviewer screened the search results based on the title and abstract, with additional verification by a second reviewer (non-independent verification). We retrieved the full-text copy of each study assessed as potentially eligible, and pairs of reviewers confirmed eligibility according to the selection criteria (non-independent verification). In case of disagreement, we reached consensus by discussion. For data extraction, one reviewer extracted qualitative and quantitative data from eligible studies. An additional reviewer checked all the extracted information for accuracy (non-independent verification of data extraction).
Assessment of methodological quality
We assessed the methodological quality of accuracy studies using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool [21]. Due to the lack of tools to assess the risk of bias associated with case series, we decided to apply the QUADAS-2 tool in case of inclusion of this type of report.
Statistical analysis and data synthesis
For all included studies, we extracted data about the number of cases initially considered as negative (i.e. false-negative cases) as well as the total number of cases confirmed in further investigations. We presented the estimated proportions (with 95% CIs) in a forest plot in order to assess the between-study variability. We aimed to calculate the false-negative proportion with the corresponding 95% CI using a multilevel mixed-effect logistic regression model implemented in Stata 16®’s metaprop_one command. This allowed us to estimate the between-study heterogeneity from the variance of study-specific random intercepts. We assessed the heterogeneity between the results of the primary studies using the tau-squared statistic. A probability value less than 0.1 (p<0.1) was considered to suggest statistically significant heterogeneity and to preclude a pooled result of numerical data.
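The random-intercept model described above can be written compactly as follows. This is only a sketch of the standard formulation of such a model; the notation is ours and is not taken from the protocol: r_i is the number of initially negative cases in study i, n_i the number of confirmed cases, p_i the study-specific false-negative proportion, μ the mean log-odds, and τ² the between-study variance.

```latex
\begin{align*}
r_i \mid p_i &\sim \operatorname{Binomial}(n_i,\, p_i), \\
\operatorname{logit}(p_i) &= \mu + u_i, \qquad u_i \sim \mathcal{N}(0,\, \tau^2), \\
\hat{p}_{\text{pooled}} &= \frac{1}{1 + e^{-\hat{\mu}}}.
\end{align*}
```

Under this formulation, the pooled proportion and its confidence limits are obtained by back-transforming the estimate of μ from the logit scale, and τ² is the tau-squared statistic used to judge heterogeneity.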
We planned to investigate the potential sources of heterogeneity using a descriptive approach and performing a random-effects meta-regression analysis. Anticipated sources of heterogeneity included the type of specimen collected, the presence or not of clinical findings, the number of RNA targets genes under assessment, and the time of symptom evolution.
Summary of findings and certainty of the evidence
We rated the certainty of the evidence about false-negative cases following the GRADE approach for tests and strategies [22, 23]. We assessed the quality of evidence as high, moderate, low or very low, depending on several factors including risk of bias, imprecision, inconsistency, indirectness, and publication bias. We illustrate the consequences of the numerical findings in a population of 100 tested, according to three different prevalence estimates of the disease provided by the stakeholders involved in this review.
Patient and public involvement
We involved several stakeholders in the design, conduct, and reporting of our research, including general and family physicians, specialists in infectious diseases, and microbiologists currently attending patients under suspicion of COVID-19. The study protocol and preliminary results are publicly available at https://osf.io/gp38w/.
RESULTS
Electronic searches yielded 662 references from selected databases. In addition, we obtained 186 references by searching other resources (Figure 1). Our initial screening of titles and abstracts identified 61 references to assess in full text. We excluded 54 studies due to: a) case reports including fewer than five patients; b) unclear information about the results of initial RT-PCR and/or false-negatives; c) reviews and state-of-the-art articles; d) other reasons (S2 Appendix). Two studies were not available in full text despite requests to their authors. We included five studies in the qualitative and quantitative synthesis [24-28], which included 957 patients.
The sample size ranged from 36 to 601 confirmed cases (median 102 patients). All included studies were in pre-print status. Three studies focused on accuracy estimations [24, 26, 27], while two additional studies reported information from case series [25, 28]. Data collection of cases ranged from January 6 to February 8, 2020. All studies were performed in institutions based in China (Figure 2). The age of participants ranged from 44 to 51 years (information derived from three studies) [24, 25, 27]. A total of 577 men and 213 women were included (Table 1). Three studies included patients under suspicion of COVID-19 due to clinical findings and/or epidemiological criteria [24, 26, 27]. Confirmation of infection was performed after isolation of SARS-CoV-2 in any real-time RT-PCR assay for 2019-nCoV, including repeated RT-PCR after negative results (two or more). Three studies provided information about the proportion of confirmed cases with positive chest CT findings, ranging from 74 to 98%. One study provided information about the time from symptom onset to CT scan as a proxy for the duration of disease [25], and a second one reported duration of fever [27].
Characteristics of included studies
Regarding RT-PCR testing, the RT-PCR brand/manufacturer was reported by two studies [24, 25]. No studies reported criteria for positivity. Most of the studies based their assessment on throat samples, such as pharyngeal, nasal and oropharyngeal swabs. Four studies provided information about the time from the initial RT-PCR to repeated testing (Table 1).
Quality of included studies
We applied the QUADAS-2 tool to all included studies to reflect critical limitations in the validity of the findings (Figure 3 and S3 Appendix). The reference standard domain was the most affected by the potential risk of bias due to the lack of independence between the index test and the confirmation of cases (repeated RT-PCR testing). Details about the criteria for positivity were not provided by any included study, and this domain was judged as having unclear risk of bias and unclear applicability concerns. In addition, the applicability of patient selection was judged as raising great concern because most of the studies selected patients who underwent both RT-PCR and chest CT, excluding patients who could be candidates to receive the index test in current clinical practice.
Findings
We analysed information from five studies collecting information from 957 patients confirmed to have SARS-CoV-2 infection and 53 cases with RT-PCR negative findings in their initial assessment (Figure 4). False-negative proportions ranged from 0.02 [24] to 0.29 [26]. Only one study provided subgroup information about time from illness onset to CT scan [25], as a proxy for the time of symptom evolution, with proportions ranging from 0.15 (≤ 2 days) to 0.08 (3 or more days) (Figure 4).
The pooled estimate of the false-negative proportion was 0.085 (95% CI 0.034 to 0.196), estimated with a mixed-effects logistic regression model. However, the pooled data are affected by considerable between-study heterogeneity (tau-squared = 1.08; 95% CI 0.27 to 8.28; p<0.001). Since we cannot guarantee that the average estimate provided by the meta-analysis is a valid and representative estimate of the true false-negative proportion in current practice, we instead used the range of proportions in the analysis of the certainty of the evidence using the GRADE approach. A full exploration of heterogeneity was not possible given that: a) most of the studies collected upper or lower respiratory specimens; b) all studies included patients with clinical findings suggestive of COVID-19; c) subgroup information by time of symptom evolution was provided by only one study; and d) key information about the characteristics of the index test, such as positivity criteria, was not reported. The high variability of the pooled estimate was not reduced by separate estimation of the false-negative proportion by type of study (accuracy versus case series).
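To make the scale of the reported heterogeneity more concrete, the following minimal Python sketch back-transforms the reported figures on the logit scale. It uses only the numbers quoted above (0.085, 0.034 to 0.196, tau-squared = 1.08). The variable names are ours, and the "range of study-level proportions" is a rough normal approximation that ignores the uncertainty in the pooled mean; it is an illustrative assumption, not an analysis performed in the review.

```python
import math


def logit(p):
    """Log-odds of a proportion."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of the logit transformation."""
    return 1.0 / (1.0 + math.exp(-x))


# Figures reported in the text
pooled, ci_low, ci_high = 0.085, 0.034, 0.196
tau_squared = 1.08

Z = 1.959963984540054  # 97.5th percentile of the standard normal distribution

# The confidence interval is asymmetric on the proportion scale
# but close to symmetric on the logit scale.
mu = logit(pooled)
midpoint = (logit(ci_low) + logit(ci_high)) / 2.0
se = (logit(ci_high) - logit(ci_low)) / (2.0 * Z)  # implied standard error of mu

# Rough 95% range of study-level true proportions implied by tau-squared
# (assumption: normal random intercepts, uncertainty in mu ignored).
tau = math.sqrt(tau_squared)
range_low = expit(mu - Z * tau)
range_high = expit(mu + Z * tau)

print(f"logit(pooled) = {mu:.3f}, midpoint of CI on logit scale = {midpoint:.3f}")
print(f"implied standard error on logit scale = {se:.3f}")
print(f"approximate range of study-level proportions: {range_low:.3f} to {range_high:.3f}")
```

Under these assumptions, the implied study-level false-negative proportions span roughly 1% to 42%, which illustrates why a single averaged value is difficult to interpret.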
Certainty of the evidence
We used the range of false-negative proportions to develop a summary of findings following the GRADE approach. The quality of the evidence was judged to be very low due to issues related to the risk of bias, indirectness, and inconsistency (Figure 5). We illustrate the consequences of the range of false-negative proportions in a population of 100 tested, according to three different prevalences seen in current clinical practice by participant stakeholders (30%, 50%, and 80%) (Figure 5). Using a prevalence of 50%, we found that 1 to 14 cases would be misdiagnosed; these patients might not receive adequate clinical management, and they might require repeated testing at some point during their hospitalization or further investigations for competing diagnoses. This numerical approach should be interpreted with caution due to the multiple limitations of the evidence described above (Figure 5).
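The arithmetic behind this illustration can be sketched in a few lines of Python. The function and variable names are ours (hypothetical), and the inputs are the figures stated in the text: a population of 100 tested, prevalences of 30%, 50% and 80%, and a false-negative range of 2% to 29%.

```python
def missed_cases(population, prevalence_pct, fn_pct):
    """Expected number of infected people with a negative initial RT-PCR."""
    infected = population * prevalence_pct / 100
    return infected * fn_pct / 100


POPULATION = 100
PREVALENCES_PCT = (30, 50, 80)  # prevalences provided by the stakeholders
FN_RANGE_PCT = (2, 29)          # lowest and highest false-negative proportions

# For each prevalence: (missed cases at 2%, missed cases at 29%)
table = {
    prev: tuple(missed_cases(POPULATION, prev, fn) for fn in FN_RANGE_PCT)
    for prev in PREVALENCES_PCT
}

for prev, (low, high) in table.items():
    print(f"prevalence {prev}%: {low:.1f} to {high:.1f} missed cases per {POPULATION} tested")
```

At a prevalence of 50% this gives 1.0 to 14.5 missed cases, and at 80% it gives 1.6 to 23.2, in line with the whole-patient figures of 1 to 14 and 2 to 23 quoted in the text; how the fractional values were rounded is not stated in the source.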
DISCUSSION
Our systematic review included five studies and 957 participants providing information about the proportion of false-negative cases related to the detection of SARS-CoV-2 by RT-PCR assays at first use. The included studies enrolled patients under suspicion of COVID-19 [24, 26, 27] or confirmed COVID-19 cases [25, 28]. Almost all studies enrolled a selected sample of patients (i.e. patients with findings for RT-PCR and chest CT) from several provinces of China, with data collected between January and February 2020. We considered all studies to be affected by several sources of bias, especially related to the independence between the index test and the reference standard and the unclear reporting of key RT-PCR characteristics. A meta-analysis of the proportions using Stata® showed considerable heterogeneity not explained by the collected data, and this variability limits the interpretation of the averaged proportion. As an alternative, we preferred to provide an analysis of the range of false-negative proportions derived from included studies in a cohort of 100 patients tested, using three different prevalences of the disease derived from the current clinical practice of our participant stakeholders. Using a prevalence of 80%, we found that 2 to 23 cases would be misdiagnosed and might therefore not receive adequate clinical management. However, we emphasise that this numerical approach should be interpreted with caution due to the multiple limitations of the evidence described above (Quality of evidence: Very low).
Although we did not impose restrictions on population characteristics such as age, setting or publication status, our findings are limited because all the studies were performed in one country (China) and reported data only for the beginning of the pandemic (January to February 2020), in addition to the lack of reporting about the index test previously mentioned. The RT-PCR kits used in the included studies were likely the first kits developed for the detection of SARS-CoV-2, so the tests currently in use might have evolved considerably and have different characteristics from those of the initial tools.
Despite the scarcity of information to answer the review question, our study carried out a comprehensive literature search to identify all relevant studies, including several sources of unpublished literature such as pre-print repositories. Our review also includes a rigorous assessment of potential sources of bias, a formal statistical analysis of results and a final assessment of the certainty of the evidence under a well-known system (GRADE). We applied selected methods associated with rapid reviews to streamline the review process, such as the involvement of stakeholders in the development of the review, a non-independent verification of data selection and extraction, and parallelisation of tasks (that is, conducting selected activities simultaneously instead of consecutively) [29]. We avoided the use of methods that might affect the quality of the review process, such as those related to limiting the search strategies, the omission of quality assessment of the collected evidence and the narrative synthesis of results [29, 30].
Due to the permanent involvement of clinicians managing COVID-19 patients in the development of this review, we were able to define a review question that responds to a clinical inquiry relevant to current clinical practice [31-33]. In fact, the number of cases misdiagnosed as not having the target condition is a critical figure due to the severe consequences of not treating missed patients. This estimate can also help to estimate the additional resources needed in current clinical practice to confirm a suspected case.
Implications for practice
Our findings reinforce the need for repeated testing in patients with suspicion of being infected, due to either clinical or epidemiological reasons, given that up to 29% of patients may have an initial negative RT-PCR (certainty of evidence: very low). The collected evidence has several limitations in terms of risk of bias and applicability; in addition, lack of reporting of several key factors remains a significant constraint for analysis of the collected data. A false-negative result during the recovery phase could have important implications for isolation and risk of transmission, although this risk is reduced by the documentation of at least two negative samples before discharge. A subsequent positive result could also be erroneously considered as reinfection. An update of this review is warranted when new studies become available.
Implications for research
Due to the multiple difficulties associated with the lack of reporting in the included studies, and due to the high probability of new studies being published in the short term, we provide some recommendations for future studies that could be included in an update of this review:
Inclusion of a series of consecutive patients instead of selected groups, to avoid spectrum bias.
If samples/specimens are analysed, reporting of information by patient
Description of RT-PCR scheme in use, including target genes under assessment and positivity criteria
Description of pre-analytical steps (conservation of samples, time until being sent to the laboratory, training of personnel)
Clear reporting of the time since the onset of symptoms, especially for those patients with clinical findings at admission
Reporting of the number of additional RT-PCR assays performed
Details about the application of the reference standard, including the time of administration after the index test (initial RT-PCR)
If possible, database sharing, which could allow re-analyses by independent researchers, including individual-patient data (IPD) meta-analysis, and thus increase confidence in the new evidence
Add serological samples to a cohort of individuals with compatible symptoms and negative PCR to warrant an independent verification of infection.
Data Availability
The study protocol is available online at https://tinyurl.com/vvbgqya. Most included studies are publicly available. Additional data are available upon reasonable request.
DECLARATIONS
Contributors
IAR, DBG, DS and JZ conceived the study. IAR, DBG, DSR, RDC, JAPM and JZ designed the study. IAR, DBG, DSR and PZA screened titles and abstracts for inclusion. IAR, DBG, DSR, PZA, AR and JZ extracted and analysed data. RDC, JAPM, AC, OS and NL assisted in the interpretation from a clinical viewpoint. IAR, DBG, DSR and JZ wrote the first draft, which all authors revised for critical content. All authors approved the final manuscript. IAR and JZ are the guarantors. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.
Funding
Ingrid Arevalo-Rodriguez is funded by the Instituto de Salud Carlos III through the “Acción Estrategica en Salud 2013-2016 / Contratos Sara Borrell convocatoria 2017/CD17/00219” (Co-funded by European Social Fund 2014-2020, “Investing in your future”).
Competing interests
All authors declare: no support from any organisation for the submitted work; no competing interests with regards to the submitted work.
Ethical approval
Not required.