McQ - An open-source multiplexed SARS-CoV-2 quantification platform

Implementation of cost-efficient high-throughput methods for detection of RNA viruses such as SARS-CoV-2 is a potent strategy to curb ongoing and future pandemics. Here we describe Multiplexed SARS-CoV-2 Quantification platform (McQ), an inexpensive scalable framework for SARS-CoV-2 quantification in saliva samples. McQ is based on the parallel sequencing of barcoded amplicons generated from SARS-CoV-2 genomic RNA. McQ uses indexed, target- specific reverse transcription (RT) to generate barcoded cDNA for amplifying viral- and human-specific regions. The barcoding system enables early sample pooling to reduce hands-on time and makes the approach scalable to thousands of samples per sequencing run. Robust and accurate quantification of viral load is achieved by measuring the abundance of Unique Molecular Identifiers (UMIs) introduced during reverse transcription. The use of homemade reverse transcriptase and polymerase enzymes and non-proprietary buffers reduces RNA to library reagent costs to ~92 cents/sample and circumvents potential supply chain shortages. We demonstrate the ability of McQ to robustly quantify various levels of viral RNA in 838 clinical samples and accurately diagnose positive and negative control samples in a testing workflow entailing self-sampling and automated RNA extraction from saliva. The implementation of McQ is modular, scalable and could be extended to other pathogenic targets in future.


Introduction
The coronavirus disease 2019 (COVID-19) has created a worldwide pandemic with over 59 million reported cases and 1.4 million recorded deaths worldwide 1 . While populationwide lockdown measures help to contain the rapid spread of the virus, thereby alleviating the burden on the health care system, they come at a severe economic and societal cost. Thus, lockdowns are not sustainable and alternative measures have been proposed to enable social and economic life to resume in the absence of an effective vaccine. These include social distancing measures and wearing of face protection in public spaces. Another important leverage in the fight to control the pandemic is increasing testing capacity to enable systematic surveillance testing, which combined with contact tracing and rapid isolation of infectious individuals can curtail viral spread 2,3 .
One of the challenges in the control of the COVID-19 pandemic is the high frequency of pre-or asymptomatic individuals that nevertheless display high levels of infectivity, making containment of transmission solely by symptomatic testing impossible [4][5][6] . This is the case for a majority of patients during the 2-7 days of the incubation phase 7 and lasts during the entire duration of the infection for approximately one fifth of individuals 8 . Detection of these asymptomatic spreaders could be achieved by the implementation of highfrequency population-scale surveillance testing. This would imply testing all members of a community repeatedly and regardless of symptoms. Achieving such a goal requires the development of scalable testing strategies that are accessible and cost efficient. sensitive as it can detect very low abundances of viral particles. Current challenges include the use of commercial reagents that are subject to shortage, poor scalability and the high cost of the test, which preclude its adoption for population-scale testing of asymptomatic individuals. To circumvent these limitations, several alternative technologies have been developed 9 . These are based on the detection of antigens [10][11][12][13] , colorimetric assays [14][15][16][17][18] , CRISPR-based detection [19][20][21][22] or multiplexed detection of nucleic acids using NGS [23][24][25][26][27][28][29] .
Here we describe the development of an integrated highthroughput testing workflow for multiplexed SARS-CoV-2 quantification (McQ). The strategy entails parallel sequencing of barcoded amplicons generated from SARS-CoV-2 genomic RNA. Samples are barcoded by targeted reverse transcription against viral and human target regions to label each sample uniquely and enable pooling for barcoded amplification. Short read Illumina sequencing is used to quantify viral abundance in individual samples, using unique molecular identifiers added during reverse transcription for improved robustness and reduced variability in viral load quantification. Quality control, data analysis and diagnostics are performed using public open source code. To circumvent reagent shortages McQ largely relies on homemade enzymes and non-commercial buffers. We benchmarked McQ against existing technologies, and validated its performance through the parallel testing of >800 clinical samples. We combined McQ with automated RNA extraction from saliva samples from healthy volunteers and control samples. We confirm McQ diagnostics of all saliva samples as negative with LAMP and RT-qPCR, and show accurate diagnostics of positive and negative controls.

Results
A key feature to enable scalability of SARS-CoV-2 genomic RNA quantification resides in lowering the per sample hands-on time. To that end, we adapted a reverse transcription barcoding strategy 25 to enable early sample pooling and parallelised processing. This enables scaling to thousands of samples per sequencing run. McQ uses Unique Molecular Identifiers (UMIs) to correct for PCR amplification bias, enabling more accurate quantification of viral load. The McQ sample preparation workflow relies mostly on homemade enzymes and reagents to circumvent supply chain shortages and to lower testing cost.

McQ experimental design.
McQ is based on the parallel sequencing of barcoded amplicons generated from SARS-CoV-2 genomic RNA ( Figure 1). McQ was established using RNA purified from clinical swab samples or saliva samples provided by healthy volunteers. We implemented an automated extraction workflow based on magnetic beads and non-proprietary buffers leading to good RNA yield and purity (Supplementary Results and Supplementary File 1). Note that McQ is compatible in principle with any upstream extraction protocol.
The McQ workflow starts with the distribution of extracted RNA into 96-well plates. McQ is designed to quantify the abundance of two viral targets, the E gene and nsp14 (ORF1) regions of the SARS-CoV-2 genome (using primers established by Charite 30 and HKU 31 ); and a region of the human RNAse P (RP) gene (recommended by the CDC 32 ). The two SARS-CoV-2 targets serve as internal replicates for viral load. The human target ensures the generation of amplicons even in negative samples to guarantee pool complexity at the level of sample barcodes. The protocol uses a homemade reverse transcriptase as well as DNA polymerase that were benchmarked in the context of this study (see purification protocol in Supplementary File 2 and benchmarking in Supplementary Results).
Reverse transcription (RT) is performed using targetspecific primers containing a barcode that uniquely labels each sample. Additionally, the RT primers add a UMI to each cDNA molecule, enabling to identify duplicates that may arise at the following PCR amplification step. After RT, excess primer is digested using Exonuclease I, and equivalent volumes of samples are pooled in batches of 96 for bead purification. The pooled cDNA is used as template in a PCR with three forward primers against individual targets (E, nsp14, RP) and a common reverse primer that anneals to the universal sequence introduced at the RT step. The PCR primers contain overhangs that allow adding indexed sequencing adapters compatible with Illumina sequencing in a second PCR. This enables direct multiplexing of tens of thousands of samples in one sequencing run. Our approach relies on sequences that are compatible with standard Illumina indexing and sequencing (Illumina read1 and 2, and P5 and P7 primers).
Samples are typically sequenced on a MiSeq or NextSeq platform using 75bp single end reads. Samples were demultiplexed using the plate and sample indexes. Several filters were implemented to remove primer dimers and other sources of noise. Then UMIs were counted for SARS-CoV-2 (E and nsp14) and human (RP) targets (Methods). The sample processing protocol is provided in Supplementary File 4. It can be completed within 4 hours, is easily scalable, and cost-efficient at <1 Euro/sample (from RNA to sequencingready library). We describe implementation of the full McQ workflow, including RNA extraction and production and integration of homemade enzymes in the Supplementary Results section.
Establishing McQ assay sensitivity using synthetic SARS-CoV-2 RNA template. To determine the sensitivity of McQ we processed 96 samples spiked with 0 to 100,000 copies of synthetic viral RNA (0-5000 cp/µl RNA sample). Swab or saliva samples carry heterogeneous amounts of human DNA and RNA. To test the effect of this heterogeneity, we added variable amounts of commercial human RNA or RNA/gDNA mix to each sample.
We retrieved all the 96 sample barcodes used to tag these samples, demonstrating the robustness of our approach to sample drop-off during pooled amplification. We observed a good correlation in the UMI counts for the E and nsp14 Target multiplex RT reaction (nsp14, E gene, RP) with primers containing N15 UMI, well barcodes (BC) and adding common reverse primmer ( We observed a good correlation between viral UMI counts and the number of SARS-CoV-2 molecules spiked into the sample (Figure 2), and high consistency between replicates, demonstrating that McQ accurately quantifies SARS-CoV-2 RNA. We observed background counts between 4 and 47 viral UMIs in control samples where no SARS-CoV-2 template was present. Similar viral counts were observed for samples containing up to 250 SARS-CoV-2 template molecules (12.5 cp/µl RNA sample), indicating that McQ sensitivity is in the range of 500 copies. This corresponds to 50 copies/µl in the extracted RNA sample.
Detection of SARS-CoV-2 in clinical samples. Having established McQ robustness and sensitivity, we benchmarked its ability to quantify viral RNA in a panel of 838 clinical samples previously characterised by qPCR in a clinical diagnostics facility. We observed a good concordance between UMI counts for viral targets and Ct values determined by the RT-qPCR assay ( Figure 3A). As for synthetic samples we detected all expected barcodes in each pool and consistent UMI counts for E and nsp14 regions. Viral UMI counts showed a linear correlation with Ct values for Ct values below 30, after which the counts flattened out to background levels. This suggests a limit of detection of our assay corresponding to a Ct of around 30 in RT-qPCR. This corresponds to approximately 50 copies/µl (estimates from Robert Koch Institute 33,34 ) and is in line with our results with synthetic RNA (Figure 2). UMI counts for the RP amplicon were relatively constant across samples ( Figure 3). We could not detect the RP target in 10% of the samples (85 out of 838). Sample to sample variability in RP counts likely reflects heterogeneity in sampling, extraction and RNA quality.
Compared to our experiment with synthetic control samples, we observed higher background UMI counts in clinical samples previously diagnosed as negative (Ct of 40) by RT-qPCR ( Figure 3). Samples with Ct 40 had acceptable median UMI counts of 9 and 11 for the nsp14 and E targets respectively. However, the background UMI counts reached up to 2452 for some samples, that is two orders of magni- . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted December 4, 2020. tude higher than the maximum of 47 observed in synthetic samples. The high background not only caused false positive sample assignment but also prevented distinction of samples with low viral load from negative samples, limiting sensitivity. We suspect that the presence of many samples with high viral load on these plates, combined with repeated usage of these samples has increased the chance of sample crosscontamination and is a major source of high background. Importantly, this issue was resolved when repeating the procedure on freshly collected samples ( Figure 4). Moreover, in the Supplementary Results section we show that inclusion of an Exonuclease I step prior to sequencing lowered viral UMI counts in control samples. A unique feature of the McQ protocol is the introduction of UMIs to correct for PCR duplicates in target quantification. We evaluated the importance of this feature by comparing the accuracy of target quantification using simple read counts ( Figure 3B) and upon UMI correction ( Figure 3A). While the linear correlation between Ct and counts for viral targets was observable with both metrics, the data were substantially more noisy using read counts only ( Figure 3B,C). Moreover, background counts in negative samples were significantly higher with up to 1.5 million counts ( Figure 3B). We conclude that UMI-based quantification is important for higher accuracy in quantification of viral load by correcting for amplification bias. These results demonstrate that McQ is able to identify positive samples and UMI counts quantitatively reflect viral load.
Proof of concept of McQ in routine testing. We performed a small pilot study to demonstrate the applicability of McQ for surveillance testing. Saliva samples were selfcollected by 136 healthy volunteers, registered online and dropped off at a central collection point. We performed automated RNA extraction from these samples (Supplementary File 1) and applied the McQ analysis framework. In parallel, we processed the same RNA samples with a commercially available loop-mediated isothermal amplification (LAMP) rapid colorimetric assay kit. We included several negative (water and previously tested negative swab samples) and positive controls (100,000 copies of synthetic RNA, dilutions of titrated swab samples, RNA of positive samples). RNA extraction time on the robot was about one hour per 96 well plate. The complete turnaround time was 48h. With these fresh samples, viral UMI counts for negative controls were low (between 4 and 23 counts). This argues for low levels of cross-contamination inherent in our procedure and suggests the high levels of background we observed before were a consequence of contamination already present in the RNA samples. Importantly, the viral UMI counts in positive controls were more than an order of magnitude higher than in negative controls and we could detect positive control samples with low viral load (Ct 29.56) ( Figure 4A). We diagnosed all of the saliva samples as negative based on the abundance of E and nsp14 UMIs, which were at comparable levels to negative controls (1-88 UMIs). The results were confirmed independently using LAMP ( Figure 4B), yielding 100% agreement between the two approaches. Negative status of three samples with McQ or LAMP values just below the positive threshold was confirmed via qPCR.
As McQ is based on sequencing, a significant part of the costs per sample is coming from Illumina sequencing. To address the impact of sequencing depth on diagnostic accuracy we subsampled the data. With 2600 reads per sample we detected either the E or nsp14 target in all sixteen positive controls, and 15/16 with 260 reads per sample ( Figure S10). This suggests that a sequencing depth of 3000 reads per sample could be sufficient to detect positive samples. Together these data demonstrate that McQ has the sensitivity and the throughput to test for presence of SARS-CoV-2 in saliva or swab samples.

Discussion
We have developed McQ, a SARS-CoV-2 quantification assay that couples early-stage barcoding with next-generation sequencing for multiplexed processing of thousands of samples at low cost, enabling testing at the scales required for routine surveillance screening. Barcoding at the reverse transcription step enables early sample pooling to reduce handson time and cost. Integration of homemade enzymes and reagents circumvents potential bottlenecks due to reagent shortages and facilitates implementation of the workflow in settings with limited financial resources. We have demonstrated that McQ can detect SARS-CoV-2 RNA in extracted RNA from clinical samples. Compared with read-based quantification, the UMI-based quantification implemented in McQ enables higher accuracy in determination of viral load by correcting for amplification bias. Viral UMI counts were predictive of Ct values obtained on the same samples using a validated RT-qPCR method in a clinical diagnostic facil- ity. Moreover, McQ accurately diagnosed samples in an integrated testing workflow entailing self-sampling and automated RNA extraction from saliva. McQ is highly modular and can easily be extended to other human or SARS-CoV-2 targets, or to multiplexed detection of several pathogens in one sample.
McQ analytical sensitivity is lower than with gold standard RT-qPCR based tests at 50 copies/µl, corresponding to samples with a Ct of approximately 30. Nevertheless, this may be sufficient for efficiently limiting viral spread as epi-demiological studies have identified high testing frequency and fast turnaround times as the most critical variables for controlling transmission 35,36 . Sensitivity was only secondary to these factors, with little difference in preventing transmission with high (10 3 cp/ml, RT-PCR) or lower sensitivity (10 5 cp/ml, e.g. LAMP) tests 35 . In addition, samples with low viral load (Ct >30 or <50 copies/µl RNA) have been associated with a loss of cultivation of the virus 33 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted December 4, 2020. to clearly distinguish positive from negative samples, with viral UMI counts at least an order of magnitude higher in positive samples even for a sample with low viral load (Ct 29.56). McQ increases testing capacity at low cost and could complement highly sensitive symptomatic testing, which on its own is not sufficient to control viral spread due to high infectivity of pre-and asymptomatic individuals 5,7 .
McQ is based on homemade reverse transcriptase and polymerase enzymes. We provide detailed purification protocols and recipes for non-proprietary buffers for use of our enzymes by the community. Our homemade MMLV_mut reverse transcriptase and Pfu-Sso7d polymerase enzymes showed comparable performance to commercial enzymes. Our in-house solutions lower reagent costs and circumvent the dependence on commercial kits that are subject to shortage when highly demanded. This will facilitate the implementation of widespread testing. Furthermore, they can be readily deployed in other testing frameworks and complement open-source solutions for one-and two-step RT-qPCR diagnostics [41][42][43][44] .
Reducing molecular contamination, shortening turnaround time, and investing in alternative methods for swab or saliva sample processing that are more automation friendly are key areas for improvement towards . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted December 4, 2020. ; implementation of McQ in a surveillance testing framework. In the first set of RNA samples we processed with McQ we detected high numbers of viral UMIs in a significant number of negative (Ct 40) samples, causing false positive sample assignment and limiting sensitivity by preventing distinction of samples with low viral load from negative samples. With freshly collected samples viral UMI counts for negative controls and in negative saliva samples were reduced significantly, suggesting that the high levels of background we observed before were a consequence of contamination already present in these RNA samples. Moreover, we have shown that inclusion of an Exonuclease I step prior to sequencing reduces viral UMI counts in control samples. Combining this improvement with strategies to reduce molecular contamination (according to clinical testing guidelines) should further reduce background levels of viral UMI counts. These measures will improve sensitivity, and reduce the potential for false positives that require re-testing.

Supplemental results
Automated paramagnetic bead-based RNA extraction from swabs and saliva. RNA purification is a major bottleneck in scaling up testing capacity. To overcome this we implemented the SARS-CoV-2 RNA purification protocol from BOMB-Bio 45 on the Biomek i7 liquid-handling platform at the EMBL Genomics Core Facility. A detailed description of the automation workflow, reagents and the source code for sample processing on the Biomek i7 platform is provided in Supplementary File 1.
We first implemented bead-based extraction manually using swab samples. Nose and throat swab samples were collected in Amies transport medium from COVID-19 patients at Uni Klinikum Heidelberg, Frankfurt and Oldenburg (Methods). 200 µl of each sample was manually transferred to a 1.2 ml deep-well plate and 100 µl GITC lysis buffer was added. Samples were subsequently purified using 40 µl of paramagnetic beads, washed with ethanol and resuspended in 25 µl of MilliQ water. To evaluate extraction efficiency we performed RT-qPCR (using TaqMan Fast Virus 1-Step Master Mix, Thermo) on a subset of extracted samples (Figure S1). These included extractions of Expi293F cell suspensions (10 6 cells/ml) in a 0.9% NaCl saline solution with different amounts of synthetic SARS-CoV-2 RNA template spiked-in (2.5 -500 copies/µl corresponding to 500-100,000 copies total), dilution series of two swab samples previously tested positive for SARS-CoV-2 by RT-qPCR (swab 1 Ct = 16.37, swab 2 Ct = 16.61), and an undiluted SARS-CoV-2 negative swab sample (swab 3). We obtained similar Ct values for identical amounts of synthetic viral template regardless of whether it was spiked in prior to extraction or immediately prior to RT, indicating high extraction efficiency ( Figure S1A). Bioanalyzer traces confirmed high RNA quality ( Figure S1B). We detected viral RNA by qPCR in all swab dilutions ( Figure S1C) and observed an increase in Ct with increasing dilution, as expected. RNA quality was generally lower in extracted swab samples ( Figure S1D) compared to human cell control or synthetic viral RNA spiked into Expi293F cells. The differences in quality could be due to repeated freeze-thawing of swab samples, or inhibitory effects of the swab collection medium during RNA extraction.
Having demonstrated that bead-based RNA extraction is highly efficient from manually processed swab samples, we proceeded to implement automated extraction. At this point we turned to saliva samples as sample collection is easier and can be performed by individuals, without the need for trained personnel. Saliva samples provided by healthy volunteers were collected in a 0.9% NaCl solution. Samples were inactivated in bulk by the addition of boiling water previously heated to 95°C to a 1L beaker containing all sample tubes, and incubated for 5 minutes. Water temperature was required to be >80°C after the 5 min incubation. After inactivation, 200 µl of each sample was manually transferred to a 1.2 ml deep-well plate and 100 µl GITC lysis buffer was added. Samples were subsequently purified using 40 µl of paramagnetic beads, washed with ethanol and resuspended in 25 µl of MilliQ water. RNA quality was similar to manually extracted swab samples ( Figure S1E).
The complete extraction procedure can be achieved in 145-175 min /plate ( 40 min for sample collection, boiling and scanning of barcodes; 60-90 min for manual transfer to 96-well plates; 45 min for robotic extraction). Automated extraction typically yielded 10 -75 ng of RNA for saliva samples that were subsequently used for reverse transcription.

Expression and purification of homemade enzymes.
To develop a workflow that is independent of supply-limited and expensive commercial enzymes and reagents we expressed and purified a Moloney Murine Leukemia Virusbased reverse transcriptase (MMLV_mut) and a Pfu-Sso7d polymerase (described in detail in Supplementary File 2). Both enzymes were purified using a multi-step purification to obtain highly pure MMLV_mut (76 kDa) and Pfu-Sso7d (98 kDA) protein. MMLV_mut was expressed in E. coli BL21 (DE3) cells, with a yield of 120 mg MMLV_mut from two litres of expression culture. Pfu-Sso7d was expressed in E. coli Rosetta (DE3) cells, with a yield of 17 mg Pfu-Sso7d from one litre of expression culture.
Optimising MMLV_mut reaction conditions. To establish reaction conditions for homemade MMLV_mut reverse transcriptase we performed cDNA synthesis from synthetic SARS-CoV-2 viral RNA template with the N1 primer set recommended by the CDC, and evaluated efficiency via qPCR using SYBR Green master mix (Table S1, Figure S2A). In a first attempt to test its activity we used MMLV_mut at the suggested concentration of commercial SSIII (1 µl of 1 mg/ml stock in 20 µl reaction) with a buffer similar in composition to commercial SSIII buffer (Buffer V1). Following recommended reaction conditions for SSIII we obtained cycle threshold (Ct) values of 28 and 31 (for 100,000 and 10,000 spike-in molecules) after reverse transcription with MMLV_mut, outperforming the commercial SSIII enzyme (Ct 30 and undetermined, respectively) and demonstrating MMLV_mut was active and produced cDNA suitable for SARS-CoV-2 detection. To identify robust con- . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020. ; ditions for efficient reverse transcription using homemade MMLV_mut enzyme we extensively varied different reaction parameters ( Figure S2A). We swapped the target-specific N1 reverse transcription primer for random hexamers to avoid optimizing conditions for a specific target. We found a Tris and MOPS based buffer with reduced salt concentrations and glycerol (for thermostabilisation) and a non-ionic detergent resulted in robust reverse transcription (Buffer V5, developed by Alexander Klenov 46 ). Reverse transcription with 10x less enzyme (stock of 0.1 mg/ml) substantially improved efficiency (Ct 24.76 compared to Ct 31 with 1 mg/ml enzyme stock for 10,000 spike-in molecules). Further dilution of the enzyme stock (to 0.02 mg/ml) tended to result in decreased Ct but with considerably more variable outcomes ( Figure S2A). We applied the optimized conditions (0.02 mg/ml enzyme and V5 buffer) to determine SARS-CoV-2 load in four previously tested clinical samples (kindly provided by M. Knop, ZMBH + DKFZ) and obtained very similar Ct values (Fig.  S2B). The previous RT-qPCR assay was run in a clinical diagnostics facility, using a commercially available test kit that uses altered versions of the E Sarbeco 30 primer set and using 10 µl RNA as input, which could explain the small discrepancies. Taken together we identified robust reaction conditions for homemade MMLV_mut reverse transcriptase compatible with efficient production of cDNA, yielding similar viral load quantification as a validated qPCR kit.

McQ amplicon and primer design.
We designed primers targeting the E Sarbeco and nsp14 (ORF1) regions in the SARS-CoV-2 genome for next-generation sequencing based multiplexed detection. These regions were used in RT-PCR tests developed by the Charite 30 and HKU 31 , and both sets of primers resulted in high analytical efficiency and sensitivity 47,48 . We modified priming sequences to increase melting temperatures and ensured specificity via blast 49 against the human genome (hg38). To ensure primers are uniquely targeting and capable to amplify SARS-CoV-2 variants we aligned primers to a reference SARS-CoV-2 genome (NC_045512.2) and a panel of 1775 sequenced SARS-CoV-2 strains obtained from GISAID. We based RP primer design on the sequences recommended by the CDC, for amplification of a human control target, and only modified primers to match melting temperatures of viral primers. It is important to note that while these primers can be used to ensure pool complexity at the level of sample barcodes, they cannot distinguish negative samples from failed extractions as human genomic DNA contamination would lead to the presence of RP amplicons 50 . Designing the RP reverse transcription primer to target another exon will address this problem. All primers were additionally checked for (self)-dimerization potential. To barcode samples at the cDNA step we incorporated 15 nucleotide sample barcodes (described in Hossain et al. 25 ) in reverse transcription primers, to allow pooling of 96 cDNA samples into a single tube for PCR amplification. Reverse transcription primers additionally contained a N15 UMI (for viral load quantification) and a partial Illumina read1 primer sequence. This sequence serves as a common reverse primer for all three targets in the first round of PCR amplification, and is used in the second PCR to add indexed sequencing adapters. PCR1 forward primers were correspondingly designed with a partial Illumina read 2 primer sequence in their overhangs. This setup allows the direct sequencing of amplicons and readout of sample and sequencing indexes using standard Illumina reagents and kits. Samples are sequenced in 75SE mode and demultiplexed into 96-well plates using 8nt i5 and i7 indexes. The first 15 cycles of read 1 contain the sample barcode to assign samples to plate wells, followed by 15 cycles of UMI that identify unique RNA template molecules and allow correcting for amplification bias during PCR, the target-specific priming sequence (18 to 24 nt) and 21 to 27 nt of target sequence (depending on the target) that allows distinguishing between actual reads and primer dimer. The 27 nt of human RP sequence are too short and do not contain sufficient genomic variation to identify individuals from sequencing data.

Implementation of McQ, an NGS-based assay for multiplexed SARS-CoV-2 detection.
For initial protocol implementation we used 1 µl of the 1 mg/ml MMLV_mut stock with buffer V1 and performed reverse transcription of defined mixtures of synthetic SARS-CoV-2 viral RNA and synthetic human liver RNA as template. To maximize Pfu-Sso7d amplification efficiency we tested amplification of the E amplicon from cDNA generated from 5000 copies of synthetic viral spike-in with four reaction buffers (HF-V2 to HF-V5, described by Alexander Klenov) 51 in addition to HF-V1, our customary in-house Pfu-Sso7d buffer. Two buffers, HF-V4 and HF-V5 enabled strong and specific amplification at levels similar to commercial Phusion enzyme ( Figure S3A). In contrast to other buffers HF-V4 and HF-V5 contained Arginine and Trehalose, which could aid the amplification of low amounts of template. We proceeded with buffer HF-V5, which enabled reduced primer dimer formation compared to HF-V4, as evidenced by the substantially weaker primer dimer band. We determined optimal primer concentrations and viral to human primer ratios during reverse transcription and PCR1 to minimise the formation of primer dimers and favour amplification of viral targets, settling on a 2:1 ratio of viral over RP primers in both reverse transcription (200 nM and 100 nM) and PCR1 (50 nM and 25 nM). Sequencing adapters were added in PCR2 using commercial KAPA polymerase. We applied the resulting protocol to twelve samples containing defined amounts of synthetic SARS-CoV-2 spike-in molecules (10 to 100,000) mixed with 10 ng total human RNA. To address whether pooled amplification leads to sample dropout or reduces quantification accuracy we performed both PCRs on individual samples ( Figure S4A) and on pooled cDNA of the twelve samples ( Figure S4B), using 35 and 30 cycles in PCR1, respectively. We obtained similar UMI counts (labelling unique RNA templates) for E and nsp14 primer sets within a sample, indicating comparable sensitivity, and UMI counts between replicates showed good agreement. Viral UMI counts showed a linear increase with the amount of synthetic SARS-CoV-2 template added to samples. UMI counts for the RP amplicon were relatively similar across samples, as expected from the fixed amount of . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020. ; human RNA spiked into each sample. We detected all twelve sample barcodes in individually amplified samples, as well as in the sample pool. While the total UMI counts for a given template amount were lower in the sample pool than in individually amplified samples, the linear, positive correlation of viral UMI counts with the amount of synthetic SARS-CoV-2 template added to samples was preserved ( Figure S4B).
We next replaced commercial KAPA with Pfu-Sso7d in PCR2 and identified optimal buffer and annealing conditions ( Figure S3B). Using cDNA from the pool of twelve samples described above we performed PCR1 with Pfu-Sso7d and compared performance of KAPA and Pfu-Sso7d in PCR2. Both polymerases yielded similar numbers of UMIs (Figure S5A). KAPA polymerase is favored in library applications due to its low GC-amplification bias compared to other polymerases. We quantified amplification bias by counting the numbers of reads per UMI and found very similar distributions for Pfu-Sso7d and KAPA, with a tendency for less variation in read numbers with Pfu-Sso7d ( Figure S5B). We could not compare KAPA polymerase against Pfu-Sso7d for PCR1 as KAPA consistently produced non-specific higher molecular weight bands for the RP and nsp14 targets. As PCR1 entails more cycles than PCR2 it is possible that a larger difference in efficiency and bias between the two polymerases is masked when comparing them in the second PCR only.

Sources of and strategies to improve incomplete UMI
recovery. E and nsp14 UMI counts in samples with synthetic SARS-CoV-2 RNA were 5 to 10-fold lower than expected based on the number of viral template added to these samples ( Figure S4, S5). Individually amplified samples showed the same tendency, particularly for samples with high amounts of template added ( Figure S4A). This indicates that sample pooling was not by itself the cause for reduced UMI recovery. Subsampling of reads revealed that higher sequencing depth only incrementally increased UMI numbers. Taking into account the moderate amplification bias ( Figure S5B) this led us to believe that steps during library preparation, such as inefficient or biased reverse transcription and the first step of PCR, presented bottlenecks for UMI recovery. We first addressed the impact of optimizing reverse transcription. As for random hexamers, buffer V5 and a 10-fold dilution of MMLV_mut (0.1 mg/ml stock) improved reverse transcription efficiency with McQ target-specific viral primers (Figure S6A), demonstrated by the higher number of viral UMIs detected with the diluted enzyme in buffer V5. Improved viral UMI recovery with 0.1 mg/ml compared to 1 mg/ml MMLV_mut was more pronounced in buffer V1 than V5, suggesting that multiple interacting factors can lead to improved efficiency. The improvement in efficiency was target specific as UMI counts for RP dropped with lower enzyme concentration in both buffers. After a series of additional experiments we converged on an optimal MMLV_mutbased reverse transcription protocol using buffer V5 46 with trehalose (buffer V7) and 0.1 mg/ml MMLV_mut (see Figure  S6C for comparison to buffer V1 and 1 mg/ml MMLV_mut in RT-qPCR). We decided against using a further dilution of MMLV_mut (to 0.02 mg/ml stock) due to the higher variability in results ( Figure S2A). We also compared MMLV_mut to commercial SSIV enzyme (Thermo), which promises enhanced yield and performance, at different concentrations and incubation temperatures but did not observe any improvement in UMI recovery for viral targets with SSIV (Figure S6B). The RP target was amplified better with SSIV in all tested conditions. PCR jackpotting (the selection of templates in the first round of PCR) is another well-known bottleneck in library preparation. To estimate the relative contribution of bottlenecks in reverse transcription and PCR on reduced UMI recovery, and address whether a split amplification design aimed at randomizing effects of PCR jackpotting would improve UMI recovery we amplified two pools of cDNA, each consisting of twelve samples, using four different PCR designs ( Figure S7A). Each PCR design used the same pool of cDNA as input. The twelve samples contained defined amounts of synthetic SARS-CoV-2 spike-in molecules (0 to 100,000, in duplicate) mixed with 10 ng total human RNA. For pool 1 we performed reverse transcription using a modified buffer V5 and 0.1 mg/ml MMLV_mut (optimized conditions) while pool 2 samples were reverse transcribed using the initial conditions (buffer V1 and 1 mg/ml MMLV_mut). As the cDNA input in all four designs was the same, calculating the fraction of UMIs that are shared between the resulting samples can address whether PCR substantially biases the subset of UMIs amplified from a cDNA pool ( Figure S7B). We observed reasonable overlap (44%-57%) between samples derived from the same PCR1 reaction (AA and AB, as well as BA and BB), but almost no overlap between samples deriving from separate PCR1 reactions (AA or AB to BA and BB), indicating almost entirely distinct subsets of UMIs were amplified in each PCR1. This strongly suggests PCR jackpotting as a major source limiting UMI recovery. To address whether the number of UMIs could be improved by splitting a single PCR into four reactions, randomizing PCR jackpotting, we compared the number of UMIs recovered with each design ( Figure S7C). Interestingly, we observed no difference between designs for the pool of samples reverse transcribed with less optimal conditions (pool 2), but improved UMI recovery with designs using PCR splitting in one or both PCRs in the pool of samples reverse transcribed with optimized conditions (pool 1). No UMIs should be shared between pools 1 and 2 as they result from distinct reverse transcription reactions and UMI complexity is sufficiently high (N15). UMIs that are shared between pools 1 and 2 (1%-6%) therefore represent the level of contamination accrued from molecular contamination at any step after reverse transcription or from sample mis-assignment. Based on these results we suggest further optimisation of the protocol towards higher UMI recovery should address the PCR step. The split amplification design we used here can be readily implemented without further experimental optimisation, with the disadvantage of an increase in PCR sample numbers. Alternatively, evaluation of other polymerases or buffer conditions that ameliorate jackpotting, such as adding Betaine or . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Sources of and strategies to improve background sig-
nal. Reads mapping to viral sequences in samples containing no SARS-CoV-2 RNA are a source of false positive test outcomes and limit sensitivity of detection assays. Non-zero background has been observed in other sequencing-based assays for SARS-CoV-2 detection 23,24,53 , and molecular contamination and mis-assignment of sequencing reads were identified as main underlying sources 23 . Leveraging the UMI information we have shown that between 1% and 6% of the UMI counts in a pool are contaminants of another pool (Figure S7B), resulting from events taking place after reverse transcription (during library amplification or sequencing). Index hopping during sequencing due to free index primers that were not used up in PCR is a major source for sample misassignment [54][55][56] . To specifically address contribution of this variable to the fraction of NTC UMIs (viral UMIs in samples without SARS-CoV-2) we compared the number of viral UMIs in 63 NTC samples prior to and after an additional Exonuclease I digestion step after PCR2 that removes excess index primer. The NTC samples were part of pools of 12 samples that contained 2 to 12 NTC samples. Exonuclease I treatment reduced NTC UMIs at least partially in 72% of samples, and completely eradicated NTC UMIs in 58% of samples ( Figure S8). The extent of reduction was pool specific suggesting that molecular contamination prior to the reverse transcription step varied between pools. In 25% of samples the number of NTC UMIs increased, indicating additional sources of NTC UMIs during sequencing (e.g. contamination in the sequencers). These data suggest that index hopping during sequencing resulting from free-floating indexed primers accounts for a substantial fraction of background that can be reduced by including an Exonuclease I treatment step after PCR2 and prior to sequencing.

Automation of the McQ reverse transcription step.
To further reduce hands-on time in the McQ workflow we implemented a protocol for automated reverse transcription on the Beckman i7 automation system at the EMBL Genomics Core Facility. The method can easily be implemented on any liquid-handling system, using the original manual protocol as a guideline. A detailed description and step-by-step protocol is provided in Supplementary File 3. To compare performance to manually processed samples we performed reverse transcription using 10 µl aliquots of the same 96 samples (containing varying levels of synthetic viral RNA spikein and human input material) used for Figure 2. As for manually processed samples, we detected reads for all 96 sample barcodes ( Figure S9A), and observed very good concordance between UMI counts from manually processed samples and samples processed on the i7 ( Figure S9A, B). The protocol is currently implemented for parallel processing of two plates and takes approximately 106 min (20 min handling time and 86 min off-deck incubation time). It can easily be scaled as all incubation steps are performed off-deck on thermal cyclers.

Samples used in the study.
To benchmark McQ we used pseudo-anonymized surplus RNA sample material that had been collected for clinical diagnosis of SARS-CoV-2 infection by RT-qPCR carried out by the diagnostic laboratory of Heidelberg University Hospital. Such reuse of material is in accordance with German regulations, which allow development and improvement of diagnostic assays using patient samples collected specifically to perform the testing in question. Pharyngeal swab specimens provided to us were either collected through the nose (nasopharyngeal) or the mouth (oropharyngeal), or sometimes one swab was used to collect both.
Clinical sample handling. Specimens were collected as nasopharyngeal and oropharyngeal flocked swabs in Amies medium (eSwab, Copan Italia). The sample collection happened as part of the routine operation of Heidelberg University Hospital and at public testing stations set up by the City of Heidelberg. Collected samples were transported in sterile containers, delivered to the diagnostic laboratory within a few hours, and then examined directly or stored at 4°C until further processing. Samples were processed in a biosafety level 2 cabinet until inactivation by heat or mixing with a lysis buffer.

RNA isolation and RT-qPCR diagnostics for clinical
samples. The standard diagnostic pipeline of the hospital laboratory was as follows: RNA was isolated from nasopharyngeal and oropharyngeal swab specimens using QIAGEN kits (QIAGEN, Hilden, Germany); either automated on the QIASymphony (DSP Virus/Pathogen Mini Kits) or QIAcube (QIAamp Viral RNA Mini Kits) devices or manually (QI-Aamp Viral RNA Mini Kits). Please note that the QiaCube uses a sample volume of 140 µl and an elution volume of 100 µl, whereas the QiaSymphony uses a sample volume of 200 µl and an elution volume of 115 µl. RT-qPCR for the quantification of the SARS-CoV-2 viral genome was performed using kits and reagents from TIB MOLBIO Syntheselabor, Berlin, Germany. The kits were used according to the manufacturer's instruction and contained the primer/probe sets developed based on the published Sarbeco primer set 30  . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020. ; LightMix Modular SARS and Wuhan CoV N gene (cat. no. 53-0775-96; TIB MOLBIOL), 0.5 µl of LightMix Modular EAV RNA Extraction Control (cat. no. 66-0909-96; TIB MOLBIOL), and 0.1 µl of reverse transcriptase enzyme (LightCycler Multiplex RNA Virus Master, Roche, Basel, Switzerland). The master mix (10 µl) was distributed per reaction into 96-well plates, and 10 µl of purified RNA was added per well. The performance of the RT-qPCR was validated using a positive control for the E gene. A total of 1000 molecules of E gene RNA per RT-qPCR reaction correspond to a CT 30.
Optimisation of MMLV_mut reaction conditions using the N1 primer set. To test homemade MMLV_mut synthetic RNA template (Twist Biosciences, spike-in 1 or 2) was used in the indicated amounts. To each reverse transcription (RT) reaction we added 1 µl 10mM dNTPs, 1µl 2µM gene-specific primer or 50 ng random hexamers (Thermo) in a total volume of 13µl. Following a 5 minute incubation at 65°C reactions were placed on ice.
The indicated homemade RT buffer, 1µl 10mM DTT and 1 µl MMLV_mut (at the indicated stock concentration, diluted from 2 mg/ml stock using MMLV_mut dilution buffer) were added as a master mix and cDNA synthesis was performed at 55°C (for target specific RT primer) or 50°C (for random hexamers) for 30 to 60 min (see Table S1) followed by 15 min inactivation at 70°C. Buffer recipes are provided in Supplementary File 2. For determination of RT efficiency, SYBR green PCR master mix (Thermo) was used according to the manufacturer's protocol with primers at 100 nM. Nuclease free water was used as a non-template control. Reactions were run on an ABI flex with default SYBR green cycling conditions.

McQ reverse transcription, cleanup and PCR amplification.
A detailed step-by-step protocol is provided in Supplementary File 4. 10 µl of sample (either 10 µl extracted RNA or a mixture of indicated amounts of Twist synthetic viral RNA and total human RNA filled up with nuclease-free water) was used as input to reverse transcription. Mixtures of the three indexed primers and dNTPs were prepared in a 96well format index master plate as described in Supplementary File 4. The final primer concentrations in reverse transcription reactions were 200 nM for E and nsp14, and 100 nM for RP. Primers, dNTPs and template were incubated at 65°C for 5 min, and immediately placed on ice to cool down for 1 min prior to addition of the enzyme master mix containing MMLV_mut at the indicated concentration (1 µl of indicated stock in a total of 20 µl reaction), the respective buffer, DTT, and RNAse inhibitor (Takara Bio). Reverse transcription was run in a thermocycler (55°C 30 min, 70°C 10 min). Samples were then incubated with thermolabile Exonuclease I (NEB) (37°C 30 min, 85°C 5 min) to digest excess barcoded RT primer. For manually processed samples, samples were processed in 96-well format using multichannel pipettes. For reverse transcription on the Biomek i7 we prepared the index plate as described in Supplementary File 4. RT enzyme and Exonuclease I master mixes were prepared in four 1.5 ml Lo-bind Eppendorf tubes as described in Supplementary Files 3 and 4. The procedure after ExoI digest was the same for manually and robotically processed samples. All samples from a plate (up to 96) were pooled using a multichannel pipette. 10 µl of each well were transferred into 8-well PCR strips, and combined into 1.5 ml Lo-bind Eppendorf tubes and mixed well. Samples were purified using 1X SPRI beads (Beckman Coulter Life Sciences) according to manufacturer's instructions, and eluted in a total volume of 30 µl of nuclease-free water (NFW). To avoid sample loss due to elution in a small volume of NFW relative to beads we used only 30 µl of beads for cleanup and added 930 µl of SPRI buffer only (for a total volume of 1X); see Supplementary File 4 for details. 29 µl of the cleaned, indexed cDNA pool was used as template for multiplexed amplification with two virus-specific forward primers (E and nsp14, 50 nM each), human RP-specific forward primer (25 nM), and a common reverse primer binding to the constant sequence added with the RT primer (125 nM). Each pool was amplified for 30 cycles with homemade Pfu-Sso7d (0.07 mg/ml) and buffer HF-V5. The master mix and samples were kept on ice at all times during the sample preparation, and reactions transferred directly to the pre-warmed PCR block (98°C). The following cycling conditions were used: 98°C 30 sec; [98°C 10 sec, 62°C 20 sec, 72°C 20 sec] X 30 cycles; 72°C 5 min. PCR products were then cleaned using 1.8 volumes of SPRI magnetic beads (Beckman Coulter, Life Sciences) according to manufacturer's instructions, and eluted in 15 µl of nuclease-free water (see Supplementary File 4 for detailed procedure).
Sequencing adapters and sample indexes were added in PCR2, using 3 µl of cleaned PCR1 product as input. Indexed Illumina P5 and P7 primers were used at 400 µM each, using unique i5-i7 index pairs for each 96-well plate pool. Each pool was amplified for 10 cycles with homemade Pfu-Sso7d (0.07 mg/ml) and buffer HF-V5. The master mix and samples were kept on ice at all times during the sample preparation, and samples transferred directly to the pre-warmed PCR block (98°C). The following cycling conditions were used: 98°C 2 min; [98°C 10 sec, 62°C 30 sec, 72°C 15 sec] X 10 cycles; 72°C 2 min. Libraries were cleaned using 1 volume of SPRI magnetic beads (Beckman Coulter, Life Sciences) according to manufacturer's instructions and eluted in 15 µl of nuclease-free water. Sample concentrations were determined using Qubit high-sensitivity DNA assay, and quality confirmed on an Agilent high-sensitivity Bioanalyzer chip. Samples were pooled equally (according to pool size and concentration) for sequencing. We amplified samples individually for data in Supplementary Figures S3, S4 (upper  panel), and S6. Procedure was the same as for 96-well pooled amplification, except that samples were cleaned up individually after RT, eluted in 7 µl NFW, which was used as input for PCR1 with 35 cycles. The entire cleaned PCR1 reaction was used as input for PCR2. We processed samples in pools of 12 for Supplementary Figures S4, S7 and S8. To this end 10 µl of each well was pooled, cleaned with 1X SPRI beads, and eluted in 40 µl of NFW. Half of the cDNA pool was used as template for PCR1, which was run identically as for the 96- . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020. ; https://doi.org/10.1101/2020.12.02.20242628 doi: medRxiv preprint well pools. PCR1 products were cleaned using 1.8X SPRI beads and eluted in 12 µl NFW. Half of the cleaned PCR1 product was used as template for PCR2.

Exonuclease I digest to remove excess indexed P5
and P7 primers. To reduce contamination during sequencing caused by leftover indexed primers, the PCR2 products were optionally treated with thermolabile Exonuclease I (NEB) for 30 min at 37°C followed by 5 min at 85°C.

SARS-CoV-2 detection in saliva samples.
Healthy volunteers collected and registered their own samples and dropped them off at a central collection point. Volunteers were provided with a self-testing kit consisting of a 50 ml falcon tube containing 5 ml of 0.9% saline (0.9% NaCl solution, sterile), a barcoded 2 ml tube with screw lid and a disposable pipette inside a plastic bag. The set further contained an information sheet containing the code of the test set and a link to an instruction video for correct sample collection. Volunteers used the saline solution for deep-throat gargling (30 sec) for sample collection, which was spit back into the Falcon tube. The Falcon tube was shaked vigorously and 1-1.5 ml of the mixture was transferred into the 2 ml tube using the plastic disposable pipette. Volunteers were instructed to collect samples before breakfast or brushing their teeth, on the same morning the RNA was extracted. Volunteers dropped their 2 ml sample tube at the established collection points before 9:30am. Sample inactivation proceeded as described above in a safety Level S2 laboratory. For transfer to 96-well plates, samples were processed in a laminar flow hood. The plate was then sealed with a foil and placed on the robot for RNA extraction. RNA extraction and library preparation were performed as described above, with the following deviations: We included an additional 2.2X SPRI cleanup step after RNA extraction, samples were incubated with ExoI after PCR2 to digest excess indexed primer, reverse transcription was performed with commercial SSIII and we used commercial NEB Phusion ready mix in PCR2. Saliva samples were distributed into two 96-well plates, along with positive and negative controls. Positive controls were added to each plate at the same location, and included a highly diluted negative saliva sample spiked with RNA from a sample tested positive for SARS-CoV-2 (spiked into two wells), 100,000 molecules of Twist synthetic RNA (in water, spiked into two wells prior to extraction) both plotted as positive control, and extracted RNA from four samples tested positive for SARS-CoV-2 (added after RNA extraction) plotted as positive sample. We included two kinds of negative controls: Empty wells (water added before RNA extraction), to measure background contamination, and saliva from a previously tested negative sample (spiked into two wells). An index master mix plate was prepared in a PCR hood after UV exposure from newly ordered oligos and in a room where no SARS-CoV-2 related reagents had been handled. For comparison we processed 1 µl RNA from each sample with the SARS-CoV-2 Rapid Colorimetric LAMP Assay kit (NEB) following manufacturers instructions except that reaction volumes were halved and reactions performed in 96-well format. LAMP delta OD value (D) was calculated as the difference between absorbance at 432 nM and 560 nM (D = OD432 -OD560). The kit included actin as internal control (IC), and samples with IC D values of <0.4 were classified as failed. The LAMP primer mix contained primers targeting the E and N gene for viral detection, and samples with D>0.4 were considered positive. Samples tested via qPCR (using 2 µl RNA as input) were processed using TaqMan Fast Virus 1-step Master mix, which uses the N1 primer set.
Data processing. Demultiplexed FASTQ files were processed using umi_tools 57 (umi_tools extract) to only preserve reads with expected sample well barcodes (maximum 96). Reads were mapped to the three expected amplicon sequences with bwa 58 (bwa mem). Sorted BAM files were filtered to remove primer dimer reads and potential other noise. The number of UMIs as well as the number of reads per UMI was counted and merged for all the samples for downstream analysis. For downsampled data, we sampled a fraction of reads per plate at random with samtools 59 command samtools view -s. Using the EMBL cluster (run on Intel Xeon Gold 6136 CPU and 32 Gb of RAM), this pipeline allows generating count matrices in under 10 minutes for a sequencing run. Exact running time depends on available hardware and sequencing depth. For instance, while it takes about eight minutes for a single plate (FASTQ file) with 15 million reads to be processed, analysis time is reduced to 9 seconds when reads are downsampled to 15,000. Downstream analysis of count tables was performed with the R programming language 60 using tidyverse 61 for data handling and ggplot2 62 for visualization. Scripts for processing of sequencing data have been deposited at http://github.com/gtca/McQ. *corresponding authors Correspondence should be addressed to sibylle.vonesch@embl.de; larsms@embl.de; benes@embl.de; kim.remans@embl.de; arnaud.krebs@embl.de . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020. ; https://doi.org/10.1101/2020.12.02.20242628 doi: medRxiv preprint  Number of synthetic viral spike-in molecules . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020. ; https://doi.org/10.1101/2020.12.02.20242628 doi: medRxiv preprint Figure S1. RNA extraction benchmarking. (A) Scatter plot depicts Ct values obtained in RT-qPCR for defined amounts of synthetic SARS-CoV-2 template added to samples before RNA extraction (spiked into Expi293F cell suspension at 106 cells/ml in saline solution) or directly added prior to RT. (B) Bioanalyzer traces and RNA integrity numbers (RIN) for extracted Expi29 3F cell suspension (200 µl) spiked with indicated amounts of synthetic SARS-CoV-2 template. (C) Scatter plot depicts Ct values obtained in RT-qPCR for dilutions of two swab samples extracted manually with our magnetic bead-based RNA extraction workflow that were previously tested positive via qPCR. (D) Bioanalyzer traces and RNA integrity numbers (RIN) for extracted swab samples diluted prior to extraction. NTC = no template control. Human cell control is a suspension of cells without viral RNA spiked in. (E) Bioanalyzer traces and RNA integrity numbers (RIN) for saliva samples extracted with our automated workflow.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020. ; https://doi.org/10.1101/2020.12.02.20242628 doi: medRxiv preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020. . PCRs were performed on half of the cDNA reverse transcribed from 10,000 Twist synthetic RNA templates (using E-specific RT primer, RT buffer V1 and 1 mg/ml MMLV_mut) (+). Non-template controls (-) contain nuclease-free water instead of cDNA. Arrows indicate E amplicon and primer dimer bands. (B) Buffer comparison and annealing gradient for PCR2 with Pfu-Sso7d and commercial KAPA polymerase. PCRs were performed on PCR1 products of mixtures of 10,000 synthetic viral RNA molecules and 100 ng human total RNA reverse transcribed with a mixture of E, nsp14 and RP RT primers (using RT buffer V5 + Tre and 0.1 mg/ml MMLV_mut) (+). Non-template controls (-) received nuclease-free water instead of cDNA.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020. ; https://doi.org/10.1101/2020.12.02.20242628 doi: medRxiv preprint Scatter plots depict the number of Unique Molecular Identifiers (UMI) labeling viral targets (E gene: pink, nsp14: purple) or human control (RP: green) as a function of the number of synthetic viral RNA molecules spiked into 10 ng total human RNA. Each viral spike-in amount was processed in duplicate. UMI counts reflect unique RNA templates present in the sample.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020.  . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020.  . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020.  . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020.  . Empty = water, added before RNA extraction, negative control = saliva from a previously tested negative sample, positive control = highly diluted negative saliva sample spiked with RNA from a sample tested positive for SARS-CoV-2 and100,000 molecules of Twist synthetic RNA (in water, spiked in prior to extraction), positive RNA = extracted RNA from samples tested positive for SARS-CoV-2 (added after RNA extraction). Boxplots are overlaid to visualize distributions. Data is the same as Figure 4 except only 1% of the data (corresponding to approximately 2600 reads/sample) is plotted.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The protocol follows the original protocol by BOMB.Bio 1 adapted for an automated implementation on the Biomek i7. This protocol can in principle be implemented on any liquid handling robot equipped to work with magnetic beads. The input of the automated procedure is a heat-inactivated sample in GITC lysis buffer, manually formatted in 96 deep-well plates from Abgene. Individual steps as programmed on the Biomek i7 are described in Figure 1 and deck layout at start is shown in Figure 2. Table 1 lists required consumables. The figure shows a detailed step-by-step workflow of the protocol used for extraction on the i7. We start the extraction protocol with a plate containing 300ul (sample + lysis-buffer). The sample in the end is then eluted in 25ul into a PCR-plate.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020.  Pink boxes are 50ul Beckman tip-boxes, orange boxes are 1000 ul Beckman tip-boxes. "Isoprop" and "H20dd" are upside-down tip-box lids used as reservoirs. Waste and "lysSamples" are AB-0932 plates. "MagMix" and "EtOH" are AB-1127 plates. The Alpaqua Magnum-EX is used as magnet. Samples are eluted in "purRNA", which is a barcoded PCR-plate (Thermo). The blue position is a metal PCR-lid used for storing the samples in the cycler after extraction. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020. ; . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020. ; https://doi.org/10.1101/2020.12.02.20242628 doi: medRxiv preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020. ; https://doi.org/10.1101/2020.12.02.20242628 doi: medRxiv preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted December 4, 2020. implemented an automated RT protocol on the Beckman i7 automation system at the EMBL Genomics Core Facility. We believe that the method can easily be implemented on any liquid-handling system, using the original manual protocol as a guideline. Individual steps as programmed on the Biomek i7 are described in Figure 1, deck layout at start is shown in Figure 2, and chemistry setup in Figure 3. Table 1 lists required consumables for processing two plates. Master mixes are prepared in advance in 1.5 ml tubes and kept on ice until used. Lids have to be cut off tubes immediately prior to start of the protocol. The index master plate (containing the three target-specific barcoded primers and dNTPs in each well) is prepared in a PCR plate as for manual processing. All incubation steps requiring a thermal cycler are run off-deck to allow for parallel processing of several plates. The protocol starts with aliquoting the master-mixes from 1.5 ml tubes into chilled PCR plates.
Reaction plates are set up using 10µl extracted RNA and 3µl index master mix, followed by off-deck incubation for RT primer annealing.
After incubation plates are placed back on the deck and RT master mix (containing enzyme, buffer and additives) is added. Reverse transcription is performed off-deck, plates are placed back on deck for addition of Exonuclease I, followed by off-deck incubation for ExoI digest, after which the samples are ready for pooling and cleanup.
The system prompts the user to clean the deck and shut down the instrument at the end of the protocol. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted December 4, 2020. ; https://doi.org/10.1101/2020.12.02.20242628 doi: medRxiv preprint    Setup of the chemistry in the cooling-rack, as well as input criteria for sample and index plate. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted December 4, 2020. ; https://doi.org/10.1101/2020.12.02.20242628 doi: medRxiv preprint

McQ step by step protocol for 96 samples
Critical notes before starting: • Pipet everything on ice! • Prepare the index master mix plate prior to starting, this plate can be re-used for processing several 96-well plates. We recommend making a large amount of stock once, in an area/room free of potential virus contamination (no synthetic virus material or swab samples handled, and using no equipment or reagents that have been used for processing such samples), and aliquoting the master mix according to anticipated number of plates that are processed per run. Discard the remainder of the aliquot to avoid carrying molecular contamination into following runs.
• Trehalose in RT Buffer V7 will fall out of solution at low temperatures. Trehalose can be re-solubilized by incubation at 40°C in a water bath for 30 minutes (intermittent shaking).
Preparation step -Make an index master mix plate: Make an index master mix plate in a PCR plate. This plate can be used for processing several 96-well plates. If oligos are ordered at 100µM first make dilutions of the oligo plate to 10µM. See comments above for aliquoting index master mix.

Reagent
Pipet everything on ice! Prepare RT enzyme mix.
Add 3 ul index master mix into 96-well plates Add 10 ul sample Incubate 65°C 5 min . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted December 4, 2020. ; https://doi.org/10.1101/2020. 12 Dilute Exonuclease I (ExoI) 1:1 in ExoI dilution buffer (to reduce viscosity), and add 2 µl diluted ExoI to each well (helps with viscosity)
Step 2 -pool and clean cDNA: • Combine 10 µl of each RT reaction into a single 1.5 ml tube (960µl total) • Perform 1X (v/v) SPRI cleanup: Add 30 ul SPRI beads to pool, then add 930 ul SPRI buffer (without beads) for a total buffer to sample volume of 1X. Extra buffer can be prepared or obtained from taking the supernatant of 930µl of beads bound on a magnet. • Mix well and incubate 5 min at room temperature • Transfer tube to magnet and let bind for 5 minutes (until the solution is completely clear). This can be aided by pipetting up and down several times while the tube is bound on the magnet. • Remove the buffer and do two washes with 80% ethanol (make sure beads are completely covered), completely remove residual ethanol, and take the tube off the magnet for drying (~1 minute) • Add 30 µl NFW and resuspend beads, transfer the sample to a PCR tube and incubate for 1-2 minutes • Transfer to a PCR-tube sized magnet, let beads bind for 1-2 minutes, and transfer eluate to a fresh tube Step 3 -PCR1: Keep mastermix and samples on ice during pipetting as this helps reduce primer dimer. Keep reactions on ice and put directly into pre-warmed block (98°C) when using Pfu-Sso7d.

Total 50
Add 21 ul master mix to cDNA . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted December 4, 2020. ; https://doi.org/10.1101/2020. 12 Step 4 -SPRI cleanup PCR1: 1.8X SPRI bead cleanup. Add 30 ul beads and 60 ul SPRI buffer to sample (total buffer volume 90 µl), and perform cleanup in PCR tubes, following protocol described in Step 2. Elute in 15 µl NFW.
Step 5 -PCR2: Keep mastermix and samples on ice during pipetting as this helps reduce primer dimer. Keep reactions on ice and put directly into pre-warmed block (98°C). . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted December 4, 2020. ; https://doi.org/10.1101/2020.12.02.20242628 doi: medRxiv preprint