Abstract
The current diagnostic rate for patients with suspected Mendelian genetic disorders is only 25-58%, even though whole exome sequencing (WES) is part of the standard of care. One reason for the low diagnostic rate is that traditional WES analysis methods struggle to detect RNA splicing aberrations. It is estimated that 15-50% of human pathogenic variants alter splicing, with numerous splice-altering variants being causal for known Mendelian disorders. Developing reliable diagnostic tools to detect, quantify, prioritize, and visualize RNA splicing aberrations from patient RNA sequencing is therefore crucial. We present MAJIQ-CLIN, a method that addresses this need by augmenting clinical diagnostics with RNA-Seq, and compare it to existing tools. We include the first systematic evaluation of the accuracy of such tools using synthetic data across several aberration types and transcript inclusion levels; we also evaluate accuracy on several datasets of biologically validated, solved test cases. We show that MAJIQ-CLIN compares favorably to existing tools in both accuracy and efficiency, and then use MAJIQ-CLIN to investigate several unsolved patient cases from the Undiagnosed Diseases Network.
Competing Interest Statement
The MAJIQ software used in this study is available under a free license for academic use and for a fee for commercial use. Some of the licensing revenue goes to Yoseph Barash and members of the Barash lab.
Funding Statement
This research was supported by National Institutes of Health Grant R01 LM013437. UDN research reported in this publication was supported by the National Institute of Neurological Disorders and Stroke of the National Institutes of Health under Award Numbers U01HG007709 and U01HG007942. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
https://www.gtexportal.org/home/ https://undiagnosed.hms.harvard.edu https://github.com/carojoquendo/RNA_splicing_and_disease
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
The code for MAJIQ and VOILA is available for academic/non-commercial use at majiq.biociphers.org. Licensing information for commercial use can be found at majiq.biociphers.org/commercial.php. All processed data and code to reproduce figures will be deposited on Zenodo before publication. The MAJIQ-CLIN and matching VOILA updates will be added to the MAJIQ and VOILA repository upon publication. Raw GTEx data used for the analyses in this manuscript are available in dbGaP under accession phs000424. UDN data referenced in this manuscript are available in dbGaP under accession phs001232.v5.p2. The Baralle dataset is available at https://github.com/carojoquendo/RNA_splicing_and_disease.
https://www.majiq.biociphers.org