Saliva is less sensitive than nasopharyngeal swabs for COVID-19 detection in the community setting

The use of saliva collection for SARS-CoV-2 diagnostics in the ambulatory setting provides several advantages when compared to nasopharyngeal swabs (NPS), including ease of self-collection and reduced use of personal protective equipment (PPE). In addition saliva collection could be advantageous in advising if a convalescent patient is able to return to work after a period of self-quarantine. We investigated the utility of saliva collection in the community setting at Renown Health in a prospective Diagnostic Cohort of 88 patients and in a Convalescent Cohort of 24 patients. In the Diagnostic Cohort, we find that saliva collection has reduced sensitivity (~15% less) relative than NPS. And in our convalescent cohort of patients greater than 8 days and less than 21 days from first symptom, we find that saliva has ~ 50% sensitivity relative to NPS. Our results suggest that rigorous studies in the intended populations should be performed before large-scale screening using saliva as the test matrix is initiated.


Introduction
Strict activity restrictions have been demonstrated to dramatically reduce the spread of COVID-19 and decrease morbidity and mortality. However, such restrictions have had marked negative economic impacts on communities around the world. It has become clear that accurate broad-based testing for SARS-CoV-2 infection is a critical element in any strategy that will allow communities to safely return to normal activity levels [1,2] . Such testing is important for both detection of new infections, and for determining whether an infected individual is no longer infectious and thus safe to come in contact with uninfected members of the community.

RNA extraction
RNA extractions for NPS VTM and saliva samples (400 uL input volume per sample) were performed at Helix using the MagMax Viral/Pathogen RNA purification kit (ThermoFisher CAT: A42352), with an elution volume of 50 uL.

RT-PCR assays
RT-PCR assays were set up according to the manufacturers' protocols with the exception of modifications to the RNA input and reaction volumes, as noted in the main text. RT-PCR assays at Helix were set up manually. RT-PCR assays at the UCSD lab were set up using a Mosquito HV liquid handler (STP Labtech).

Estimate of Technical Performance with Bayesian Latent Class Models
In scenarios where disease diagnosis is often performed using competing methods, in which neither is a true gold standard, it is possible to estimate test performance (sensitivity, specificity and disease prevalence) using latent class models [7] . Test performance can be parameterized and estimated using bayesian models, where conjugate beta priors can be multiplied with binomial likelihood functions to derive posterior estimates for test performance. We modified an already implemented Gibbs sampler for [8] latent class models in R and used the following hyperparameters (ɑ,β) for the Beta priors for prevalence (ɑ=1,β=4), sensitivity (ɑ=1, β=1) and specificity (ɑ=2.5, β=1) to create weak but reasonable priors.

Limit of Detection of RT-PCR assays
The Limits of Detection of two sets of RT-PCR assays, the TaqPath Multiplex RT-PCR COVID-19 Kit (Thermo) and the PrimerDesign COVID-19 assay, were determined using different viral RNA and reaction volumes ( Supplementary Table 1 ).
The limit of detection of the TaqPath Multiplex RT-PCR COVID-19 Kit (Thermo) was tested in two laboratories using the COVID-19 control RNA from the TaqPath COVID-19 Control Kit (Thermo). In the Helix lab, a miniaturized 5 uL input RNA (10 uL total reaction volume) TaqPath assay was performed on a Quantstudio 7 qRT-PCR instrument (Thermo), and the limit of detection was determined to be 6.25 viral copies. In the UCSD lab, a miniaturized 2 uL input RNA (3 uL total reaction volume) TaqPath assay was performed on a Quantstudio 5 qRT-PCR instrument (Thermo), and the limit of detection was determined to be 3.125 viral copies.
The limit of detection of the PrimerDesign COVID-19 assay was determined at Helix using the Twist SARS-COV-2 synthetic RNA control (Twist Bioscience, Cat. # 102019 and 102024). First, . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 15, 2020. . the standard 8ul input RNA (20 uL total reaction volume) PrimerDesign assay was performed using a Roche LightCycler 480 II, and the limit of detection was determined to be 12.5 viral copies (PrimerDesign v1, Supplementary Table 1 ). Next, a miniaturized 2 uL input RNA (5 uL total reaction volume) PrimerDesign assay was performed using a Thermofisher Quantstudio 7, and the limit of detection was determined to be 3.125 viral copies (PrimerDesign v2, Supplementary Table 1 ).

Detection of SARS-CoV-2 using NPS vs saliva in the Diagnosis Cohort
We compared the performance of saliva-based detection of SARS-CoV-2 to standard NPS-based detection in symptomatic patients who met CDC criteria for standard-of-care clinical testing. A total of 88 individuals who presented and qualified for testing were consented and enrolled in the Diagnosis Cohort. The first 50 subjects in this cohort were randomized to saliva collection using the OM-505 or OGD-610 kits, and the last 18 subjects collected saliva using the OGD-610 kit. NPS samples were sent to the Nevada State Health Lab, which used the CDC RT-qPCR assay for diagnostic testing. Saliva samples were sent to Helix, where RNA was extracted and evaluated using the PrimerDesign COVID-19 assay performed at Helix and theTaqPath Multiplex RT-PCR COVID-19 assay performed at UCSD ( Table 1 ).
Using the Helix PrimerDesign v1 results we attempted to estimate the relative performance of saliva to NPS with Bayesian Latent Class Models [7,9] . The NPS sensitivity is estimated to be 98.9% (95% CI: 67.6%-99.7%) and saliva sensitivity is estimated to be 69.2% (95% CI: 38.6%-97.6%), with median reduction in sensitivity of 29.7%.   Figure 1 ). Based on these . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 15, 2020. . findings, we decided to use only the OGD-610 device for saliva collection in the Convalescent Cohort.

Detection of SARS-CoV-2 using NPS vs saliva in the Convalescent Cohort
To compare the performance of saliva-based testing to standard NPS-based testing in convalescent cases, Renown recalled known COVID-19 positive cases for paired collection. RNA was extracted from the NPS VTM and Saliva samples and analyzed using the PrimerDesign and TaqPath assays at the sites shown in Table 2 (full dataset with Ct values for each viral target sequence and internal control sequences are shown in Supplementary Table  2 ). Concordance in positive and negative results for the TaqPath assay between the two sites, UCSD and Helix, was excellent. There was only one sample, the NPS VTM sample, from Subject 11, which was positive for the TaqPath assay at one site (UCSD) and negative at the other site (Helix). The two assays, TaqPath and PrimerDesign, were discordant for four samples, with the PrimerDesign assay being generally less sensitive: the NPS VTM samples were negative using the PrimerDesign assay and positive for the TaqPath assay for Subjects 13, 18, and 8; and for the NPS VTM sample for Subject 11, the Helix PrimerDesign and TaqPath assays were negative, but the UCSD TaqPath assay was positive.
Overall, the sensitivity using the NPS VTM samples was higher than for the Saliva samples. Looking at the results from the more sensitive Helix TaqPath assay, both sample types yielded positive results for 4 Subjects, the NPS VTM only was positive for 5 Subjects, and the Saliva only was positive for 1 Subject. As expected, the positive rate decreased for both sample types over time. Of the 15 Subjects who were tested using the Helix TaqPath assay within three weeks of their first symptom, 9 had positive results from the NPS VTM sample and 4 had positive results from the Saliva sample. For the 9 Subjects who were tested more than three weeks after their first symptom, none had a positive result from the NPS VTM and only 1 had a positive result from the Saliva sample. We note that several Subjects had Inconclusive results for one or both samples; this call results from one out of the three viral target sequences yielding a detectable results for the TaqPath assay and may reflect low viral titer; in nearly every case, the positive target was the N gene, which usually yields the lowest Ct value of the three targets. There were 2 subjects for which the number of days from the first symptom was unknown or could not be corroborated.

Discussion
Self-collected saliva for SARS-CoV-2 testing could reduce patient discomfort and risk of viral transmission to healthcare workers, and also alleviate supply shortages for NPS and viral transport media. Saliva collection has been reported to perform as well or better than NPS for detection of SARS-CoV-2 infection [4,5] . The results of this study do not corroborate these previously reported findings.
We believe there are several plausible explanations for these different outcomes. First, our study focused on enrolling subjects from the community who were diagnosed as outpatients; only 2 patients in the entire study were inpatients, and neither was admitted to the ICU. Prior studies largely focused on inpatient populations, of which a large proportion were ICU patients.
Outpatients are likely to have milder symptoms, and have been shown to have reduced viral titers relative to more acute patients [10,11] . Our findings are lower than a large community study of 622 patients with paired NPS and saliva samples, where saliva samples were used to confirm NPS findings. In this study, only 84.9% of NPS samples were confirmed with saliva. This reduced sensitivity seems to be related to reduced viral titer in saliva samples [12] as well . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 15, 2020. . as differences in temporal dynamics in shedding in upper respiratory locations versus saliva [13,14] . Lastly, in convalescent patients, viral titers remain at detectable levels up to 21 days after initial testing [15,16] . Unfortunately, in this population, our saliva results had only ~50% of the sensitivity of NPS.
All reported studies have used different saliva collection methodologies. Different preservation solutions may differ in their ability to protect viral RNA from degradation, and can also interact with extraction chemistry, potentially impacting the efficiency of RNA extraction or causing inhibition of the qPCR reaction. To evaluate whether the particular preservation solution we used may have impacted the results, we tested the Spectrum S-1000, which was used in one of the previous studies [4], in saliva matrix negative samples spiked with synthetic viral RNA sequences. We did not observe decreased extraction efficiencies or inhibition between the OGD-610, which was used in our study, and the Spectrum S-1000 ( Supplementary Table 3 ).
The limitations of this study include the lack of detailed clinical information about participating patients; we did not attempt to correlate clinical characteristics with test outcomes. Further investigation and confirmation of this study is warranted. We acknowledge that alternative saliva collection methods with different preservation solutions, different extraction chemistries, or use of a more sensitive COVID-19 assay may yield better results in mildly symptomatic patients in the community setting. We are hopeful that further studies of these variables will provide an alternative collection system for testing mildly symptomatic or convalescent patients in the community setting. Our results suggest that rigorous studies in the intended populations should be performed before large-scale screening using saliva as the test matrix is initiated.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 15, 2020.  . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 15, 2020. . https://doi.org/10.1101/2020.05.11.20092338 doi: medRxiv preprint

Supplementary Table 3: Limit of Detection comparison of VTM, OGD-610, and Spectrum S-1000 in Contrived Samples
The Spectrum S-1000 saliva collection kit was used in a previous study that showed good correlation between NPS and saliva specimens. We evaluated the performance of both saliva collection tubes side by side, in comparison with VTM using Twist RNA control spike-in prior. Values represent Mean Cp values, using a Cq confidence cut off of >0. 8 is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 15, 2020. . https://doi.org/10.1101/2020.05.11.20092338 doi: medRxiv preprint