Severus: accurate detection and characterization of somatic structural variation in tumor genomes using long reads

Ayse Keskus, Asher Bryant, Tanveer Ahmad, Byunggil Yoo, Sergey Aganezov, Anton Goretsky, Ataberk Donmez, Lisa A. Lansdon, Isabel Rodriguez, Jimin Park, Yuelin Liu, Xiwen Cui, Joshua Gardner, Brandy McNulty, Samuel Sacco, Jyoti Shetty, Yongmei Zhao, Bao Tran, Giuseppe Narzisi, Adrienne Helland, Daniel E. Cook, Pi-Chuan Chang, Alexey Kolesnikov, Andrew Carroll, Erin K. Molloy, Irina Pushel, Erin Guest, Tomi Pastinen, Kishwar Shafin, Karen H. Miga, Salem Malikic, Chi-Ping Day, Nicolas Robine, Cenk Sahinalp, Michael Dean, Midhat S. Farooqi, Benedict Paten, Mikhail Kolmogorov
doi: https://doi.org/10.1101/2024.03.22.24304756
Ayse Keskus
1Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
Asher Bryant
1Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
Tanveer Ahmad
1Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
Byunggil Yoo
2Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
Sergey Aganezov
3Oxford Nanopore Technologies, NY, USA
Anton Goretsky
1Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
4Department of Computer Science, University of Maryland, College Park, MD, USA
Ataberk Donmez
1Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
4Department of Computer Science, University of Maryland, College Park, MD, USA
Lisa A. Lansdon
2Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
Isabel Rodriguez
5Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Rockville, MD, USA
Jimin Park
6UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
Yuelin Liu
1Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
4Department of Computer Science, University of Maryland, College Park, MD, USA
Xiwen Cui
1Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
Joshua Gardner
6UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
Brandy McNulty
6UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
Samuel Sacco
6UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
Jyoti Shetty
7Sequencing Facility, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
Yongmei Zhao
8Sequencing Facility Bioinformatics Group, Biomedical Informatics and Data Science Directorate, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
Bao Tran
7Sequencing Facility, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
Giuseppe Narzisi
9New York Genome Center, NY, USA
Adrienne Helland
9New York Genome Center, NY, USA
Daniel E. Cook
10Google Inc, Mountain View, CA, USA
Pi-Chuan Chang
10Google Inc, Mountain View, CA, USA
Alexey Kolesnikov
10Google Inc, Mountain View, CA, USA
Andrew Carroll
10Google Inc, Mountain View, CA, USA
Erin K. Molloy
4Department of Computer Science, University of Maryland, College Park, MD, USA
Irina Pushel
2Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
Erin Guest
2Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
Tomi Pastinen
2Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
Kishwar Shafin
5Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Rockville, MD, USA
Karen H. Miga
6UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
Salem Malikic
1Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
Chi-Ping Day
1Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
Nicolas Robine
9New York Genome Center, NY, USA
Cenk Sahinalp
1Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
Michael Dean
5Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Rockville, MD, USA
Midhat S. Farooqi
2Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
Benedict Paten
6UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
Mikhail Kolmogorov
1Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
  • For correspondence: mikhail.kolmogorov{at}nih.gov

Abstract

Most current studies rely on short-read sequencing to detect somatic structural variation (SV) in cancer genomes. Long-read sequencing offers the advantage of better mappability and long-range phasing, which results in substantial improvements in germline SV detection. However, current long-read SV detection methods do not generalize well to the analysis of somatic SVs in tumor genomes with complex rearrangements, heterogeneity, and aneuploidy. Here, we present Severus: a method for the accurate detection of different types of somatic SVs using a phased breakpoint graph approach. To benchmark various short- and long-read SV detection methods, we sequenced five tumor/normal cell line pairs with Illumina, Nanopore, and PacBio sequencing platforms; on this benchmark Severus showed the highest F1 scores (harmonic mean of the precision and recall) as compared to long-read and short-read methods. We then applied Severus to three clinical cases of pediatric cancer, demonstrating concordance with known genetic findings as well as revealing clinically relevant cryptic rearrangements missed by standard genomic panels.
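The F1 score used as the benchmark metric above can be illustrated with a minimal Python sketch. The function name and the call counts below are hypothetical and chosen only for illustration; they are not results from the study, and this is not the authors' evaluation code.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 score: the harmonic mean of precision and recall."""
    # With no true positives, precision and recall are zero or undefined;
    # report 0.0 by convention instead of dividing by zero.
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for one SV caller on one tumor/normal pair (illustrative only):
# 90 calls match the truth set, 10 calls do not, and 30 truth-set SVs are missed.
score = f1_score(true_positives=90, false_positives=10, false_negatives=30)
print(f"F1 = {score:.3f}")
```

Because F1 is a harmonic mean, a caller cannot reach a high score through high precision alone or high recall alone; both must be high.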

Competing Interest Statement

S.A. is an employee and stockholder of Oxford Nanopore Technologies. A.K., P.C., K.S., D.C., A.C. are employees of Google LLC and own Alphabet stock as part of the standard compensation package. E.G. served on advisory boards for Jazz Pharmaceuticals and Syndax Pharmaceuticals. M.S.F. is part of the speakers bureau for Bayer and PacBio. The remaining authors declare no competing interests.

Funding Statement

The work was supported in part by the Intramural Research Program of the NIH. This work utilized the computational resources of the NIH HPC Biowulf cluster (http://hpc.nih.gov). ONT sequencing of the HCC1395 cell line was supported by the National Cancer Institute of the National Institutes of Health under Award Number U01CA253405. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The authors would like to thank the patients and families who donated their samples for this research. M.S.F. and E.G. would like to thank Braden's Hope for Childhood Cancer, Elizabeth and Monte McDowell, the Black & Veatch Foundation, and Big Slick for their generous support. M.S.F., E.G., and L.L. would also like to thank Children's Mercy Oncology Biorepository study personnel: Judy Vun, Amie Hatfield, and Robin Ryan; as well as Jason Seymour and Keiondra Sanders in the Children's Mercy Research Institute (CMRI) Biorepository, for their assistance with sample collection and processing; and Maggie Gibson, Adam Walter, and Laura Puckett in the CMRI Genomics Core for their assistance with sequencing. Y.L. is funded by the NCI-UMD Partnership Program. E.K.M. was supported by the State of Maryland. B.P. was supported by the National Human Genome Research Institute (NHGRI) under award numbers R01HG010485, U01HG013748, U24HG011853, U24HG010262, and U41HG010972, and by NIH award OT2OD033761.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

For the cell line analysis, the Institutional Review Board of the National Institutes of Health considers patient-derived cell lines to be non-human subjects, and no approval was required. There are, however, ethical considerations, as the cell lines were derived prior to establishing the research use consent mechanism, and no such consent was received. Commercially available cell lines used in this study are anonymized, and the risks of identifying original patients or their immediate family members are low. On the other hand, openly releasing these data will significantly benefit research into developing new methods for detecting somatic variants, a critical task in current and future precision cancer therapies. We concluded that the benefits outweigh the risks and followed the practices established by the NCI and NHGRI in the TCGA tumor cell line data release (https://www.cancer.gov/ccg/research/genome-sequencing/tcga/history/ethics-policies). For the three leukemia/lymphoma cases, patients were enrolled by Children's Mercy Hospital (CMH) into its institutional Tumor Bank research study, which was approved by the CMH Institutional Review Board and included patient consent for the collection, processing, storage, and sequencing of patient samples.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted March 26, 2024.

Subject Area

  • Genetic and Genomic Medicine