PT - JOURNAL ARTICLE
AU - Vandaele, Robin
AU - Mukherjee, Pritam
AU - Selby, Heather Marie
AU - Shah, Rajesh Pravin
AU - Gevaert, Olivier
TI - Topological Data Analysis of Thoracic Radiographic Images shows Improved Radiomics-based Lung Tumor Histology Prediction
AID - 10.1101/2022.05.22.22275410
DP - 2022 Jan 01
TA - medRxiv
PG - 2022.05.22.22275410
4099 - http://medrxiv.org/content/early/2022/05/23/2022.05.22.22275410.short
4100 - http://medrxiv.org/content/early/2022/05/23/2022.05.22.22275410.full
AB - Topological data analysis (TDA) provides unparalleled tools to capture local to global structural shape information in data. In particular, its main method, persistent homology, has found many recent successful applications in both supervised and unsupervised machine learning. Despite its recent gain in popularity, much of its potential for medical image analysis remains undiscovered. In this paper we explore the prominent learning problems on thoracic radiographic images of lung tumors for which persistent homology improves on state-of-the-art radiomics-based learning. It turns out that the novel topological features capture complementary information that is important for both ‘benign vs. malignant’ and ‘adenocarcinoma vs. squamous cell carcinoma’ tumor prediction, while contributing less consistently to ‘small cell vs. non-small cell’ prediction, an interesting result in its own right. Furthermore, while radiomic features may be better at predicting malignancy scores assigned by expert radiologists based on visual inspection, topological features may be better at predicting the more accurate tumor histology assessed through long-term radiology review, biopsy, surgical resection, progression or response. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: The research leading to these results has received funding from the FWO (project no.
V407520N, G091017N, G0F9816N, 3G042220), the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013) / ERC Grant Agreement no. 615517, and from the Flemish Government under the “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen” programme. Research reported in this publication was also supported by the National Institute of Biomedical Imaging and Bioengineering (NIBIB) of the National Institutes of Health (NIH), R01 EB020527 and R56 EB020527, both to OG. This material is the result of work supported with resources and the use of facilities at the VA Palo Alto Health Care System, Palo Alto, CA. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: This retrospective study was approved by the Institutional Review Board overseeing research at both the VA Palo Alto Health Care System and Stanford University. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. Data to replicate the results summarized in this paper is available from https://github.com/robinvndaele/TDA_LungLesion. This includes persistence diagrams, features, metadata, and outcomes for both the SF/PA and LIDC data. Original scans and masks for the SF/PA cohort are excluded from this repository and are not permitted to be shared. Original LIDC scans and masks are publicly available from https://wiki.cancerimagingarchive.net/display/Public/LIDC-IDRI.