Abstract
HER2 protein overexpression is one of the most significant biomarkers for breast cancer diagnosis, prediction, and prognosis. The availability of HER2 inhibitors in routine clinical practice translates directly into a diagnostic need for precise and robust identification of this marker.
On the brink of the genomic era, multigene next-generation sequencing (NGS) methodologies are gradually replacing single-biomarker molecular and cytogenetic tests. However, copy number alterations, such as amplification of the HER2-coding ERBB2 gene, are harder to validate as NGS biomarkers than simple single-nucleotide variants (SNVs). They are shaped by several compounding genomic factors, including structural heterogeneity, dependence on chromosome count, and the genomic context of ploidy. In this study, we tested the use of whole genome sequencing, instead of NGS panels, to determine HER2 status robustly and accurately in a clinical setting. Using a large dataset of 877 breast cancer patients’ genomes with curated clinical data, and a machine learning approach to optimize an unbiased diagnostic classifier, we provide a reliable algorithm for HER2 status assessment.
Competing Interest Statement
Alicja Woźna and Paweł Zawadzki are shareholders in MNM Bioscience Inc., 16192 Coastal Highway, Lewes, DE 19958. The other authors declare no conflicts of interest.
Funding Statement
No external funding was received to support this work.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
All data were collected from controlled-access repositories on the basis of written approval. HMF data were acquired under data request number DR-169. ICGC data were acquired under data request number DACO-6030. The Cancer Genome Atlas data were acquired via the dbGaP platform (project phs000178.v11.p8) under data request number #86794-3. The collection of all aforementioned data was supervised and approved by local Medical Ethical Committees; the approvals were granted during the design of the original studies, and the written consent of each participant is held by the respective genomic consortia. The primary data were collected in accordance with the standards set by the Declaration of Helsinki and the highest data security standards of ISO 27001.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
The data that support the findings of this study are openly available in the following repositories: the Hartwig Medical Foundation (data request number DR-169), the International Cancer Genome Consortium (data request number DACO-6030), and The Cancer Genome Atlas, accessed via the dbGaP platform (project phs000178.v11.p8, data request number #86794-3). Secondary data generated by the authors that support the findings of this study are available in the supplementary material of this article.
Abbreviations
- WGS: whole genome sequencing
- HER2: human epidermal growth factor receptor 2
- BC: breast cancer
- CN: copy number
- FISH: fluorescence in situ hybridization
- IHC: immunohistochemistry
- ASCO: American Society of Clinical Oncology
- CAP: College of American Pathologists
- NGS: next-generation sequencing
- CEP: centromere enumeration probe
- AI: artificial intelligence
- TCGA: The Cancer Genome Atlas
- ICGC: International Cancer Genome Consortium
- HMF: Hartwig Medical Foundation
- FFPE: formalin-fixed, paraffin-embedded
- TNBC: triple-negative breast cancer