%0 Journal Article %A Wojtaszewska Marzena %A Stępień Rafał %A Woźna Alicja %A Piernik Maciej %A Dąbrowski Maciej %A Gniot Michał %A Szymański Sławomir %A Socha Maciej %A Kasprzak Piotr %A Matkowski Rafał %A Zawadzki Paweł %T Validation of HER2 status in whole genome sequencing data of breast cancers with AI-driven, ploidy-corrected approach %D 2021 %R 10.1101/2021.08.30.21258379 %J medRxiv %P 2021.08.30.21258379 %X HER2 protein overexpression is one of the most significant biomarkers for breast cancer diagnostics, prediction, and prognostics. The availability of HER2 inhibitors in routine clinical practice directly translates into a diagnostic need for precise and robust marker identification. On the brink of the genomic era, multigene next-generation sequencing (NGS) methodologies are slowly taking over the field of single-biomarker molecular and cytogenetic tests. However, copy number alterations, such as amplification of the HER2-coding ERBB2 gene, are certainly harder to validate as NGS biomarkers than simple single-nucleotide variants (SNVs). They are characterized by several compound genomic factors, including structural heterogeneity, dependence on chromosome count, and the genomic context of ploidy. In our study, we tested the approach of using whole genome sequencing instead of NGS panels to robustly and accurately determine HER2 status in a clinical setting. Based on a large dataset of 877 breast cancer patients’ genomes with curated clinical data and a machine learning approach for optimization of an unbiased diagnostic classifier, we provide a reliable algorithm for HER2 status assessment. Competing Interest Statement: Alicja Woźna and Paweł Zawadzki are shareholders in the company MNM Bioscience Inc., 16192 Coastal Highway, Lewes, DE 19958. 
Other authors declare no conflicts of interest. Funding Statement: No external funding was received to support this work. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: All data were collected from controlled-access repositories on the basis of written approval. HMF data were acquired under data request number DR-169. ICGC data were acquired under data request number DACO-6030. The Cancer Genome Atlas data were acquired via the dbGaP platform (project phs000178.v11.p8) under data request number #86794-3. The collection of all aforementioned data was supervised and approved by local Medical Ethical Committees; the approvals were granted during the design of the original studies, and the written consent of each participant is in the possession of the Genomic Consortia. The primary data were collected in accordance with the standards set by the Declaration of Helsinki and the highest data security standards of ISO 27001. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. 
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. The data that support the findings of this study are openly available in the following repositories: Hartwig Medical Foundation (acquired under data request number DR-169), International Cancer Genome Consortium (acquired under data request number DACO-6030), and The Cancer Genome Atlas (acquired via the dbGaP platform, project phs000178.v11.p8, under data request number #86794-3). Secondary data generated by the authors that support the findings of this study are available in the supplementary material of this article. https://www.hartwigmedicalfoundation.nl/ https://portal.gdc.cancer.gov/legacy-archive/search/f Abbreviations: WGS, whole genome sequencing; HER2, human epidermal growth factor receptor 2; BC, breast cancer; CN, copy number; FISH, fluorescence in situ hybridization; IHC, immunohistochemistry; ASCO, American Society of Clinical Oncology; CAP, College of American Pathologists; NGS, next-generation sequencing; CEP, centromere enumeration probe; AI, artificial intelligence; TCGA, The Cancer Genome Atlas; ICGC, International Cancer Genome Consortium; HMF, Hartwig Medical Foundation; FFPE, formalin-fixed, paraffin-embedded; TNBC, triple-negative breast cancer %U https://www.medrxiv.org/content/medrxiv/early/2021/09/05/2021.08.30.21258379.full.pdf