Prospective evaluation of genome sequencing versus standard-of-care as a first molecular diagnostic test

Abstract Purpose: To evaluate the diagnostic yield and clinical utility of clinical genome sequencing (cWGS) as a first genetic test for patients with suspected monogenic disorders. Methods: We conducted a prospective randomized study with pediatric and adult patients recruited from genetics clinics at Massachusetts General Hospital who were undergoing planned genetic testing. Participants were randomized into two groups: standard-of-care genetic testing (SOC) only or SOC and cWGS. Results: 204 participants were enrolled and 99 received cWGS. cWGS returned 23 molecular diagnoses in 20 individuals, a diagnostic yield of 20% (20/99, 95% CI 12.3-28.1%), which was not significantly different from SOC (17%, 95% CI 9.7-24.6%, P=0.584). 19/23 cWGS diagnoses provided an explanation for clinical features or were considered worthy of additional workup by referring providers. While cWGS detected all variants reported by SOC, SOC failed to capture 9/23 cWGS diagnoses, primarily because the relevant genes were not included in SOC tests. Turnaround time was significantly shorter for SOC than for cWGS (33.9 days vs 87.2 days, P<0.05). Conclusions: cWGS is technically suitable as a first genetic test and identified clinically relevant variants not captured by SOC. However, further studies addressing other variant types and implementation challenges are needed to support the feasibility of its broad-scale adoption.


Introduction
Currently, diagnostic standard-of-care (SOC) genetic testing practices are guided by specialty-based practice guidelines and clinical judgement [1][2][3]. These practices may consist of a combination of methods such as karyotyping, array-based comparative genomic hybridization, single gene analysis, and multigene panels [4]. While high-coverage targeted sequencing technology has broadened the ability to assess and interpret the human genome, this approach has three key limitations. First, it requires that a set of genes be prespecified for each disease area; second, it limits the ability to reanalyze the data after new gene-disease associations are made; and third, it requires provider awareness and commercial availability of numerous disease-specific testing options.
In contrast to disease-focused genetic analysis, exome (WES) and genome sequencing (WGS) have the potential to overcome the limitations of SOC and serve as effective diagnostic tools for rare genetic disorders in children [5][6][7][8][9][10]. Furthermore, WGS provides more uniform coverage of the genome, expands the scope of variants that can be identified based on documented medical and family history, and can reduce the number of genetic tests necessary to reach a diagnosis [11].
In this prospective randomized study, we aimed to 1) assess diagnostic yield and clinical utility of clinical genome sequencing (cWGS) across various disease phenotypes and ages at diagnostic evaluation, and 2) explore the challenges associated with implementing cWGS as a diagnostic tool for patients with suspected genetic conditions. Here we report on diagnostic yield, clinical utility, and turnaround time (TAT) of cWGS as compared to SOC.
medRxiv preprint doi: https://doi.org/10.1101/2020.09.03.20181073; this version posted September 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Study design and participants
To be eligible for the study, patients were required to be pursuing a diagnostic genetic test at the time of enrollment; individuals were not eligible if they had previously pursued genetic testing for the same indication. Potential participants were identified through medical record review by a research study coordinator, and eligibility was confirmed by a study genetic counselor and the referring clinician. Given prior data on the utility of sequencing pediatric patients and their parents [12], patients under the age of 18 were offered enrollment as a family trio. Eligibility criteria are further described in Table S1. Consent sessions with a genetic counselor involved a discussion of study logistics, an overview of cWGS, and potential results, which included both primary and non-primary findings. Participants were allowed to opt out of receiving results in the American College of Medical Genetics minimum list of genes to be reported as secondary findings (ACMG-59) [13]. After enrollment, patient features were abstracted from electronic medical records (EMR) and recorded as Human Phenotype Ontology (HPO) terms using PhenoTips [14].
Randomization was used as a strategy to avoid influencing the referring provider's SOC approach and biasing patient choices for reflex testing. Enrolled participants were randomized 1:1 to receive only SOC or to receive both SOC and cWGS. Referring clinical providers, study staff members with patient interaction, and patients were blinded to randomization status until the cWGS report was available, or until three months after enrollment if randomized to the control arm. Block randomization stratified by clinic was implemented to ensure that a comparable proportion of individuals from each clinic received cWGS.
Participants enrolled as a family trio were randomized independent of the clinic in which they were enrolled.
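As an illustration of block randomization stratified by clinic, the following is a minimal sketch rather than the procedure actually used in this study. The block size of 4, the seed, and the class and function names are hypothetical; the study does not report these details.

```python
import random
from collections import defaultdict

# The two study arms, allocated 1:1.
ARMS = ("SOC", "SOC+cWGS")


def make_block(block_size, rng):
    """Return one shuffled block containing each arm equally often."""
    if block_size % len(ARMS) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    block = list(ARMS) * (block_size // len(ARMS))
    rng.shuffle(block)
    return block


class StratifiedBlockRandomizer:
    """Permuted-block randomization with one independent block sequence per clinic."""

    def __init__(self, block_size=4, seed=None):  # block_size is an assumed value
        self.block_size = block_size
        self.rng = random.Random(seed)
        self._pending = defaultdict(list)  # clinic -> assignments left in current block

    def assign(self, clinic):
        """Assign the next participant enrolled at `clinic` to a study arm."""
        if not self._pending[clinic]:
            # Current block for this clinic is exhausted; start a new one.
            self._pending[clinic] = make_block(self.block_size, self.rng)
        return self._pending[clinic].pop()


if __name__ == "__main__":
    randomizer = StratifiedBlockRandomizer(block_size=4, seed=2018)
    for _ in range(8):
        print(randomizer.assign("Cardiovascular Genetics Program"))
```

Because every completed block holds each arm equally often, the allocation within a clinic never drifts more than half a block from 1:1, which is the balancing property the stratification is meant to provide.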
All participants were asked to complete two surveys: one at the time of enrollment and one after learning their randomization status and, if enrolled into the cWGS arm, receiving cWGS results. This study was completed as a demonstration project in the MGH Center for Genomic Medicine and was approved by the Mass General Brigham Institutional Review Board.

Genome Sequencing, Analysis, and Reporting
Genome sequencing was performed in a Clinical Laboratory Improvement Amendments (CLIA)-certified laboratory. The Partners Laboratory for Molecular Medicine (CLIA#22D1005307) performed sequence realignment, variant calling, annotation, and report generation. Reads were aligned to the human reference sequence (GRCh37) using Burrows-Wheeler Aligner (BWA), and variant calls were made using the Genome Analysis ToolKit (GATK). cWGS analysis methods and reporting criteria are described in the Supplementary Methods and Figure S1.
The cWGS data were retrospectively screened for SOC-reported copy number variants as described in the Supplementary Methods.

Molecular Diagnosis and Clinical Utility
In this study, sequencing results were categorized as a 'molecular diagnosis' if they met all of the following criteria: (1) variant(s) classified as Pathogenic (P) or Likely Pathogenic (LP), (2) variant(s) in genes with known disease association, and (3) variant(s) in allele states consistent with the inheritance pattern of the associated disorder. Further, molecular diagnostic findings were categorized as 'primary' if they explained or partially explained the indication for SOC. 'Non-primary' molecular diagnoses were those that were unrelated to the patient's indication for testing but were related to the patient's family history, explained an additional phenotype identified upon EMR review, or were indicated on the ACMG-59 secondary findings list.
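The categorization rules above can be expressed as a small decision function. This is a sketch for illustration only: the `Finding` record and its field names are hypothetical and do not correspond to any data structure described in the study.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical summary of one reported cWGS result."""
    classification: str                       # e.g. "P", "LP", "VUS"
    gene_has_known_disease_association: bool  # criterion 2
    allele_state_matches_inheritance: bool    # criterion 3
    explains_indication: bool                 # fully or partially explains the SOC indication
    non_primary_basis: bool                   # family history, additional EMR phenotype, or ACMG-59


def is_molecular_diagnosis(f):
    """All three study criteria must hold."""
    return (
        f.classification in {"P", "LP"}
        and f.gene_has_known_disease_association
        and f.allele_state_matches_inheritance
    )


def categorize(f):
    """Return 'primary', 'non-primary', or a label for results outside both categories."""
    if not is_molecular_diagnosis(f):
        return "not a molecular diagnosis"
    if f.explains_indication:
        return "primary"
    if f.non_primary_basis:
        return "non-primary"
    return "uncategorized"


if __name__ == "__main__":
    example = Finding("LP", True, True, False, True)
    print(categorize(example))
```

A VUS never qualifies under this rule, regardless of how clinically suspicious it is, which is why clinically relevant VUS findings are handled separately in the Results.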
The molecular diagnostic yield of SOC was compared to that of cWGS for all patients who received both SOC and cWGS reports. All molecular diagnoses on cWGS were evaluated for 'clinical utility'. To assess clinical utility, we evaluated if the result provided a diagnosis consistent with the patient's reported phenotype and if the result informed medical management; clinical utility was confirmed by the referring clinician.
Turnaround time was determined by calculating the number of days from the test order date to the date the report was generated. When multiple tests were performed as part of the SOC process (e.g., microarray plus WES), turnaround time was defined as the number of days from the first test order date to the date of the last report.
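The turnaround time definition can be sketched as follows. The function name and the example dates are hypothetical and chosen only to illustrate the multi-test case.

```python
from datetime import date


def turnaround_days(tests):
    """Days from the earliest order date to the latest report date.

    `tests` is a list of (order_date, report_date) pairs for one participant.
    With a single test this reduces to report_date - order_date.
    """
    if not tests:
        raise ValueError("at least one test is required")
    first_order = min(order for order, _ in tests)
    last_report = max(report for _, report in tests)
    return (last_report - first_order).days


if __name__ == "__main__":
    # Hypothetical SOC workup: a microarray followed by reflex WES.
    soc_tests = [
        (date(2018, 3, 1), date(2018, 3, 20)),  # microarray
        (date(2018, 4, 1), date(2018, 6, 1)),   # WES
    ]
    print(turnaround_days(soc_tests))  # 92
```

Note that under this definition any gap between sequential SOC tests, such as a delayed return visit, counts toward the SOC turnaround time.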

Statistical analyses
Mean values between groups were compared using the two-sample t-test. Comparison of multiple values between the two study arms was performed using two-way analysis of variance (ANOVA).
Diagnostic yields were compared using the two-sample test of proportions. Statistical significance threshold was set at alpha = 0.05. All analyses were performed in Stata/IC 14.2.
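For readers who wish to check the yield comparison, the sketch below computes a Wald confidence interval and a pooled two-sample z-test of proportions using only the Python standard library. It is not the Stata code used in the study. The cWGS count (20/99) is reported in the text; the SOC count of 17/99 is an assumption inferred from the reported 17% yield and its confidence interval.

```python
from math import sqrt
from statistics import NormalDist


def wald_ci(successes, n, level=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for equality of two proportions, pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    cwgs_low, cwgs_high = wald_ci(20, 99)       # reported: 12.3-28.1%
    soc_low, soc_high = wald_ci(17, 99)         # 17/99 is an inferred count
    z, p = two_proportion_z_test(20, 99, 17, 99)
    print(f"cWGS 95% CI: {cwgs_low:.3f}-{cwgs_high:.3f}")
    print(f"SOC 95% CI: {soc_low:.3f}-{soc_high:.3f}")
    print(f"z = {z:.3f}, P = {p:.3f}")
```

Under these assumptions the intervals and P-value come out close to the values reported in the abstract. The test treats the two yields as independent samples, as the reported comparison appears to, even though both tests were run on the same 99 participants.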

Participant demographics, clinics of enrollment, and genetic test indications
Between March 2018 and July 2019, 3,771 patients were evaluated by one of the six participating MGH genetics clinics; 204 patients were enrolled and 100 were randomized to receive cWGS (Fig. 1, Figure S2). One participant did not receive SOC due to insurance challenges and was removed from subsequent analysis; this resulted in 99 participants who received both SOC and cWGS. The highest volume enrollment sites were the Cardiovascular Genetics Program (n=69, 34%) and Medical Genetics Program (n=60, 29%) (Table 1).
The average age of the total cohort was 40.1 years, with 82% (n=168) age 18 years or older. The majority of all participants (82%) were White (Table 1). 17/36 pediatric probands were enrolled as a trio with both biological parents. The most common SOC test ordered was a multi-gene panel (n = 137, 65%) (Table 1, Figure S3). The average number of HPO terms per participant was 6.14 (Table S2). No statistically significant differences in age, sex, race, ethnicity, insurance, HPO terms, or number of SOC tests ordered were observed between the control (SOC only) and intervention (SOC + cWGS) groups (P-values > 0.05, Table 1).

Comparison of variants detected on SOC vs. cWGS
All 23 of the P/LP small (<20bp) sequence variants reported by SOC were technically detected by cWGS and filtered appropriately, corresponding to a sensitivity of 100% (Table S3). In addition to sequence variants, 3 P/LP copy number variants (CNVs) were reported by SOC (Table S3, Cases 204CGS, 152CGS, 170CGS). Although CNV calling had not yet been validated in the cWGS test used for this study, retrospective analysis using a research-grade WGS CNV calling algorithm detected all 3 CNVs identified by SOC (Table S4).
Among the participants who received both SOC and cWGS, nine molecular diagnoses were detected only by cWGS (Fig. 2, Table S5, S6): four diagnoses met study criteria for primary diagnosis and five met study criteria for non-primary phenotype diagnosis. For two of the cWGS primary molecular diagnoses, reported variants were in genes not included in the SOC analysis (Patients 32CGS, 65CGS; see case vignette below; Table S5, S6). The other two cWGS primary molecular diagnoses were attributed to variants that were detectable by the SOC method (WES) but were not reported on SOC due to differences in laboratory reporting practices (Table S5, S6). Non-primary phenotypes and family history were not the focus of the SOC approaches; as a result, the genes underlying the non-primary diagnoses were not included in the SOC tests ordered.
It should be noted that five molecular diagnoses were made by SOC but not cWGS (Fig. 2). Three of the diagnoses made only by SOC were the result of variant classification differences (63CGS-TOR1A, Table S5, S6). For the remaining two cases (Cases 204CGS, 152CGS), the molecular diagnostic discrepancy can be attributed to the identification of a CNV by SOC; CNVs were only retrospectively examined in this first pilot of cWGS.
cWGS and SOC reports also differed in reporting of non-diagnostic variants. Among 99 participants who received both SOC and cWGS, 57 VUSs were reported on SOC and/or cWGS (Table S6). Five VUSs identified exclusively by cWGS in five participants prompted additional clinical workup (Fig. 3). Two case examples are described below; in both cases, familial testing was recommended to determine the phase of the identified variants, and this testing was still pending at the time of this publication.
To further explore the medical importance of cWGS results, we reviewed the utility of clinically suspicious VUS findings. Despite uncertain variant pathogenicity, referring clinicians reported that they planned to change medical management and/or pursue additional workup for five patients with VUSs based on cWGS results (Fig. 3). To date, a diagnosis of Niemann Pick Type C was confirmed based on additional workup for one patient (Case 80CGS).
cWGS also confirmed one clinical diagnosis of hemochromatosis in a parent enrolled in this study as a part of a family trio. In total, 15 cWGS molecular diagnoses were confirmed by clinical workup; two (170CGS parent, 32CGS) would not have been made by standard-of-care genetic test approaches.

Diagnostic workup timeline
On average, 1.16 genetic tests per patient were ordered during SOC workup (Fig. 1, Table 1). No differences were observed in the number of SOC genetic tests ordered between study arms (P-value > 0.05), suggesting that enrollment in the study did not impact the SOC approach. Among participants who received cWGS (n=99), TAT data were available for 98 participants for at least 11 months post-enrollment. The average TAT was 33.9 days for SOC, measured from the first SOC test order date to the last available SOC report (minimum = 7, maximum = 293, 95% CI, 24.5-43.3), and 87.2 days for cWGS (minimum = 49, maximum = 162, 95% CI, 82.5-91.9). The SOC workup timeline exceeded that of cWGS for 6 cases, all of which involved exome or genome sequencing; 5 of these 6 were recruited from the Ataxia clinic. Notably, for the case with the longest SOC TAT (293 days), the patient was unable to return to care for 8 months between tests, inflating the TAT. When the comparison was restricted to cases that reached a diagnosis, TAT for cWGS was still significantly longer than for SOC (SOC 51.2 days, cWGS 92.3 days, P-value < 0.05). However, it should be noted that the cWGS TAT may not be reflective of commercially available WGS tests given the limited WGS analysis staff available for this study.

Discussion
Previous studies suggest that exome/genome sequencing be utilized as the first genetic test for individuals with a suspected genetic disorder, citing increased diagnostic yield, reduced time to reach a diagnosis, and economic advantages over the SOC step-wise approach to genetic testing [15,16]. These studies predominantly enrolled pediatric patients or focused on a specific disease area. To our knowledge, this is the first prospective study comparing the diagnostic yield and clinical utility of singleton and family-trio clinical genome sequencing to that of SOC practices across age groups and medical specialties. This study was unique in that genome analysis and interpretation were conducted within an integrated healthcare setting, allowing for collegial discussions about the significance of cWGS results with referring providers. cWGS resulted in a molecular diagnostic yield of 20% (26.3% in pediatrics and 18.9% in adults); this yield was consistent with other studies that report diagnostic yields ranging from 14% to 76% [15,17,18]. Of note, the clinic with the highest cWGS diagnostic yield in this study was the adult ataxia unit (neurology) at 30% (3/10); all diagnoses were due to the identification of sequence variants. This was an unanticipated finding, as 74% of the SOC genetic tests were ordered on the basis of concern for a triplet repeat expansion disorder (Figure S3). Previous studies have suggested the use of WES/WGS as a way to improve diagnostic yield for adults with clinically heterogeneous cerebellar ataxias, with yields ranging from 21-46% [19][20][21][22]. Despite equivalent SOC and cWGS diagnostic yields observed in this clinic, cWGS identified clinically suspicious VUS results not assessed by SOC in two additional participants from the ataxia unit (cases 9CGS, 163CGS), further supporting this recommendation.
The TAT for SOC was shorter than that for cWGS in 93.9% (92/98) of cases. However, this largely reflected the limited staff dedicated to case analysis. Recently, optimized sample preparation, sequencing, and data processing steps and artificial intelligence-assisted analyses have been reported to reduce cWGS TAT to less than 30 hours [23]. If cWGS is to be implemented as a first-line test, it is plausible that investment in infrastructure to support rapid analyses could make TAT comparable to or significantly quicker than existing SOC testing options.
This study also revealed multiple sources of reporting differences between SOC and cWGS that will require consideration before cWGS is broadly implemented. The identification of diagnostic findings that partially explained participant phenotypes in genes that were omitted from the ordering provider's SOC workup highlights the advantages of an unbiased approach to genetic testing such as cWGS. However, cWGS also revealed diagnoses that were unrelated to the patient's primary indication for testing, which may be undesirable for some patients. A second source of reporting differences was discrepancies in variant classification [24], highlighting the importance of ongoing efforts to refine the ACMG/AMP classification criteria and support data sharing (ClinGen, https://clinicalgenome.org/; ClinVar, https://www.ncbi.nlm.nih.gov/clinvar/). Laboratory reporting practices represented a key third source of discordance between cWGS and SOC reports in this study. It is common practice for targeted sequencing tests to include all variants classified as P, LP, or VUS on the report. Given the drastically increased scope of genomic sequencing, current guidelines for genomic sequencing suggest that VUSs should only be reported in genes highly relevant to the patient phenotype [25]. In keeping with this practice, several VUSs included on SOC reports were reviewed and excluded from reporting by cWGS due to perceived lack of relevance to the patient phenotype. The observation of fewer reported VUSs, together with a numerically higher (though not statistically significantly different) diagnostic yield for cWGS as compared to SOC, suggests that more targeted genetic testing reports may be one benefit of widespread implementation of cWGS.
Prospective comprehensive CNV analysis was not performed on the cWGS data in this study. The detection by SOC of clinically significant CNVs suggests that the full potential of cWGS as a diagnostic tool was not realized in this study. However, it is encouraging that 3/3 SOC-reported diagnostic CNVs were detectable in the WGS data using CNV calling algorithms. Future studies will include clinical validation of these algorithms and a comprehensive evaluation of the impact of structural variant calling on the diagnostic yield of cWGS.
We would be remiss not to note that this study was limited by multiple systemic barriers that impact access to and uptake of genetic services and testing within a healthcare system. A 2015 systematic review identified obstacles to genetic services, including lack of awareness of personal/patient risk factors, lack of knowledge of family medical history or failure to obtain an adequate family history, and lack of knowledge of genetic services [26]. These factors influenced the patients identified and recruited for this study and negatively impacted participant diversity. Beyond access to genetic services, uptake of SOC appointments and testing was a barrier to participation. To participate in this study, individuals were required to attend an in-person appointment and pursue SOC at the time of enrollment.
Given that 189 eligible patients did not attend their SOC appointment and a portion of eligible patients deferred SOC genetic testing due to insurance coverage concerns, patients were likely excluded from the study due to challenges preventing them from traveling to Boston for an appointment as well as underlying insurance challenges imposed by the United States healthcare system (Figure S2). Further, 176 eligible patients were excluded because they were not English speaking, emphasizing the need for dedicated resources to support diverse populations in clinical care and research. Given that cWGS requires a blood sample, we were also limited by our inability to collect parental samples for trio-WGS when both parents were unable to come to clinic, often due to work, travel, and family-related obstacles.
In order to equitably offer the most comprehensive cWGS evaluation, effort is needed to develop methods that allow cWGS to be run on saliva or buccal samples, which can be submitted remotely. Lastly, this was a hospital-sponsored clinical research study. Given that most payers consider cWGS to be investigational, efforts must be made to contract with insurance companies and to conduct the cost-effectiveness analyses needed to improve payer coverage of this test; doing so will make cWGS accessible to more patients.
This study provides evidence that cWGS is suitable as a first-line diagnostic genetic test, regardless of patient age or clinical specialty. However, metrics beyond diagnostic yield and turnaround time need to be considered prior to broad-scale implementation. Capturing the full scope of utility and feasibility, with a particular focus on payer coverage, will allow us to move towards equitable and scalable delivery models of genomic medicine.

Conflict of Interest
The authors declare no conflict of interest.

Table 1 notes: P-value is for t-test comparing single values between the two study arms and ANOVA for comparing multiple values. * Protocol deviation to enroll participant less than 3 months of age. ‡ Two enrolled individuals withdrew before being randomized and are not included in this table. Probands in a trio were randomized separately, but included in this table based on clinic of enrollment. Parents in a trio were not included in this table.

Fig. 1 Proband participant enrollment flowchart. Note: 2 additional cWGS reports were produced for parents in a trio, but were not included in this diagram. Flowchart labels recovered from the figure: reports produced (n=100); excluded from analysis (n=1, did not get SOC report).

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 5, 2020. .  is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 5, 2020. is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)

Figure caption: Specific examples when a cWGS report was available before the SOC diagnostic pipeline was complete. Note: One case was removed from this analysis because the report date was unknown. In another case, turnaround time (TAT) was adjusted due to insurance concerns resulting in a delay in the start of the SOC genetic testing pipeline.
