RT Journal Article
SR Electronic
T1 AI based pre-screening of large bowel cancer via weakly supervised learning of colorectal biopsy histology images
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2022.02.28.22271565
DO 10.1101/2022.02.28.22271565
A1 Bilal, Mohsin
A1 Tsang, Yee Wah
A1 Ali, Mahmoud
A1 Graham, Simon
A1 Hero, Emily
A1 Wahab, Noorul
A1 Dodd, Katherine
A1 Sahota, Harvir
A1 Lu, Wenqi
A1 Jahanifar, Mostafa
A1 Robinson, Andrew
A1 Azam, Ayesha
A1 Benes, Ksenija
A1 Nimir, Mohammed
A1 Bhalerao, Abhir
A1 Eldaly, Hesham
A1 Ahmed Raza, Shan E
A1 Gopalakrishnan, Kishore
A1 Minhas, Fayyaz
A1 Snead, David
A1 Rajpoot, Nasir
YR 2022
UL http://medrxiv.org/content/early/2022/02/28/2022.02.28.22271565.abstract
AB Histopathological examination is a pivotal step in the diagnosis and treatment planning of many major diseases. To facilitate diagnostic decision-making and reduce the workload of pathologists, we present an AI-based pre-screening tool capable of identifying normal and neoplastic colon biopsies. To learn the differential histological patterns from whole-slide images (WSIs) stained with hematoxylin and eosin (H&E), our proposed weakly supervised deep learning method requires only slide-level labels and no detailed cell or region-level annotations. The proposed method was developed and validated on an internal cohort of biopsy slides (n=4 292) from two hospitals labeled with corresponding diagnostic categories assigned by pathologists after reviewing case reports. Performance of the proposed colon cancer pre-screening tool was evaluated in a cross-validation setting using the internal cohort (n=4 292) and also by an external validation on The Cancer Genome Atlas (TCGA) cohort (n=731). With overall cross-validated classification accuracy (AUROC = 0.9895) and external validation accuracy (AUROC = 0.9746), the proposed tool promises high accuracy to assist with the pre-screening of colorectal biopsies in clinical practice.
Analysis of saliency maps confirms the representation of disease heterogeneity in model predictions and their association with relevant pathological features. The proposed AI tool correctly reported some slides as neoplastic while clinical reports suggested they were normal. Additionally, we analyzed genetic mutations and performed gene enrichment analysis of AI-generated neoplastic scores to gain further insight into the model predictions and to explore the association between neoplastic histology and genetic heterogeneity through representative genes and signaling pathways.
Competing Interest Statement: DS reports personal fees from Royal Philips, outside the submitted work. NR and FM report research funding from GlaxoSmithKline. SG, DS and NR are co-founders of Histofy Ltd. All other authors declare no competing interests.
Funding Statement: This study is supported by the PathLAKE Centre of Excellence for digital pathology and artificial intelligence, which is funded from the Data to Early Diagnosis and Precision Medicine strand of the HM Government's Industrial Strategy Challenge Fund, managed and delivered by Innovate UK on behalf of UK Research and Innovation (UKRI, Grant ref: File Ref 104689/application number 18181).
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: This study was conducted under Health Research Authority National Research Ethics approval 15/NW/0843; IRAS 189095 and the Pathology image data Lake for Analytics, Knowledge and Education (PathLAKE) research ethics committee approval (REC reference 19/SC/0363, IRAS project ID 257932, South Central - Oxford C Research Ethics Committee).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes
The annotations and the COBI cohort will be made available upon completion of the PathLAKE project. All images and the clinical, demographic, and mutation status information for the TCGA COAD and READ cohort used in this study are publicly available at https://portal.gdc.cancer.gov/ and cBioPortal (https://www.cbioportal.org/).