PT - JOURNAL ARTICLE
AU - Gardner, Eugene J.
AU - Sifrim, Alejandro
AU - Lindsay, Sarah J.
AU - Prigmore, Elena
AU - Rajan, Diana
AU - Danecek, Petr
AU - Gallone, Giuseppe
AU - Eberhardt, Ruth Y.
AU - Martin, Hilary C.
AU - Wright, Caroline F.
AU - FitzPatrick, David R.
AU - Firth, Helen V.
AU - Hurles, Matthew E.
TI - Detecting cryptic clinically-relevant structural variation in exome sequencing data increases diagnostic yield for developmental disorders
AID - 10.1101/2020.10.02.20194241
DP - 2021 Jan 01
TA - medRxiv
PG - 2020.10.02.20194241
4099 - http://medrxiv.org/content/early/2021/06/04/2020.10.02.20194241.short
4100 - http://medrxiv.org/content/early/2021/06/04/2020.10.02.20194241.full
AB - Structural Variation (SV) describes a broad class of genetic variation greater than 50 bp in size. SVs can cause a wide range of genetic diseases and are prevalent in rare developmental disorders (DD). Patients presenting with DD are often referred for diagnostic testing with chromosomal microarrays (CMA) to identify large copy-number variants (CNVs) and/or with single gene, gene-panel, or exome sequencing (ES) to identify single nucleotide variants, small insertions/deletions, and CNVs. However, patients with pathogenic SVs undetectable by conventional analysis often remain undiagnosed. Consequently, we have developed the novel tool ‘InDelible’, which interrogates short-read sequencing data for split-read clusters characteristic of SV breakpoints. We applied InDelible to 13,438 probands with severe DD recruited as part of the Deciphering Developmental Disorders (DDD) study and discovered 64 rare, damaging variants in genes previously associated with DD missed by standard SNV, InDel or CNV discovery approaches. Clinical review of these 64 variants determined that about half (30/64) were plausibly pathogenic.
InDelible was particularly effective at ascertaining variants between 21-500 bp in size, and increased the total number of potentially pathogenic variants identified by DDD in this size range by 42.3%. Of particular interest were seven confirmed de novo variants in MECP2 which represent 35.0% of all de novo protein truncating variants in MECP2 among DDD patients. InDelible provides a framework for the discovery of pathogenic SVs that are likely missed by standard analytical workflows and has the potential to improve the diagnostic yield of ES across a broad range of genetic diseases.
Competing Interest Statement: M.E.H. is a co-founder of, consultant to, and holds shares in, Congenica Ltd, a genetics diagnostic company.
Funding Statement: The DDD study presents independent research commissioned by the Health Innovation Challenge Fund [grant number HICF-1009-003], a parallel funding partnership between Wellcome and the Department of Health, and the Wellcome Sanger Institute [grant number WT098051]. The views expressed in this publication are those of the author(s) and not necessarily those of Wellcome or the Department of Health.
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes.
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The study has UK Research Ethics Committee approval (10/H0305/83, granted by the Cambridge South REC, and GEN/284/12 granted by the Republic of Ireland REC).
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes.
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes.
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes.
Sequencing, phenotype data, and variant calls for all data in this paper are accessible via the European Genome-phenome Archive (EGA) under study number EGAS00001000775 (https://www.ebi.ac.uk/ega/studies/EGAS00001000775). Within this study, WES files of all DDD families are provided as part of the dataset EGAD00001004390. Gene lists and input data for analysis are available as part of the InDelible software (https://github.com/HurlesGroupSanger/indelible).
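The abstract describes InDelible as searching short-read data for split-read clusters characteristic of SV breakpoints. The sketch below illustrates that general idea only and is not InDelible's implementation: it assumes reads are given as hypothetical `(chrom, pos, cigar)` tuples with a 1-based leftmost mapped position and a SAM-style CIGAR string, treats a soft clip of at least `min_clip` bases as evidence of a breakpoint, and reports positions supported by at least `min_reads` reads. The function names and both thresholds are illustrative assumptions.

```python
import re
from collections import Counter

# SAM CIGAR operations: a length followed by a single operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume reference bases.
REF_CONSUMING = set("MDN=X")


def clip_breakpoints(pos, cigar, min_clip=5):
    """Return reference positions where a read is soft-clipped.

    Convention (an assumption of this sketch): a leading clip is reported
    at the first aligned base (pos); a trailing clip is reported at the
    first reference base after the alignment (pos + reference length).
    """
    # Hard clips only occur at the read ends and carry no sequence, so drop them.
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar) if op != "H"]
    if len(ops) < 2:
        return []
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    points = []
    first_len, first_op = ops[0]
    last_len, last_op = ops[-1]
    if first_op == "S" and first_len >= min_clip:
        points.append(pos)
    if last_op == "S" and last_len >= min_clip:
        points.append(pos + ref_len)
    return points


def split_read_clusters(reads, min_reads=2, min_clip=5):
    """Count clipped reads per position and keep positions with enough support."""
    counts = Counter()
    for chrom, pos, cigar in reads:
        for bp in clip_breakpoints(pos, cigar, min_clip):
            counts[(chrom, bp)] += 1
    return {site: n for site, n in counts.items() if n >= min_reads}


# Toy input: three reads are clipped at the same position, one is clipped
# elsewhere, one is unclipped, and one has a clip shorter than min_clip.
reads = [
    ("chrX", 100, "50M25S"),
    ("chrX", 120, "30M45S"),
    ("chrX", 110, "20M5D15M40S"),
    ("chrX", 200, "20S55M"),
    ("chrX", 90, "75M"),
    ("chrX", 130, "20M2D53M3S"),
]
clusters = split_read_clusters(reads)
```

In this toy input, only the position shared by the three clipped reads passes the `min_reads` filter. A real caller would also need to handle mapping quality, strand, the clipped sequence itself, and population frequency, none of which are modelled here.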