medRxiv

Fast Organ-of-Origin Classification for Digital Pathology Quality Control

Witali Aswolinskiy, John K.L. Wong, Myroslav Zapukhlyak, Yulia Kindruk, Martin Paulikat, Christian Aichmüller
doi: https://doi.org/10.64898/2026.02.03.26345443
  • Witali Aswolinskiy: PAICON GmbH, Heidelberg, Germany
  • John K.L. Wong: PAICON GmbH, Heidelberg, Germany
  • Myroslav Zapukhlyak: PAICON GmbH, Heidelberg, Germany
  • Yulia Kindruk: PAICON GmbH, Heidelberg, Germany
  • Martin Paulikat: Department of Applied Tumor Biology, Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
  • Christian Aichmüller: PAICON GmbH, Heidelberg, Germany
  • For correspondence: c.aichmueller{at}paicon.com

Abstract

Digitizing large histopathology archives requires processing millions of scanned whole slide images that must be validated rapidly. Automated organ-of-origin classification can accelerate quality control and enable early detection of mislabeled specimens. We developed a deep learning model that classifies the organ of origin from H&E-stained slides using a single low-resolution thumbnail per slide in under one second. For training, we used thumbnails from 16,624 slides from the TCGA and CPTAC archives, which contain mostly primary tumor resections. The images were categorized into 14 classes based on the most common primary sites in TCGA: Bladder, Brain, Breast, Colorectal, Kidney, Liver, Lung, Pancreas, Prostate, Skin, Stomach, Thyroid gland, Uterus, and Other (encompassing the remaining tissue types). We evaluated our approach on two independent external cohorts: a 5-class cohort with 2,857 slides (Colorectal, Kidney, Liver, Pancreas, Prostate) and a comprehensive 14-class cohort (12,348 slides). The model achieved 90% balanced accuracy for the 5-class cohort and 62% for the full 14-class cohort. Notably, when considering only the predictions with high confidence, 53% of the large cohort could be classified with 74% balanced accuracy. Manual review of high-confidence misclassifications suggested that some may reflect errors in the ground truth rather than model error. Mean model inference time was 0.2s per slide on an NVIDIA L4 GPU. Our deep learning approach demonstrates high classification performance with very low inference time, indicating its potential for real-time and cost-effective quality control in digital pathology.
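The two metrics reported in the abstract, balanced accuracy and the share of slides retained after keeping only high-confidence predictions, can be illustrated with a minimal sketch. It assumes that balanced accuracy means the unweighted mean of per-class recall and that confidence is a single score per slide compared against a cutoff. The function names, the toy labels, and the 0.8 threshold are hypothetical and are not taken from the preprint, which does not state its threshold here.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    if not total:
        raise ValueError("balanced accuracy is undefined for an empty sample")
    return sum(correct[c] / total[c] for c in total) / len(total)


def high_confidence_subset(y_true, y_pred, confidence, threshold):
    """Keep predictions with confidence >= threshold.

    Returns (coverage, kept_true, kept_pred), where coverage is the
    fraction of all slides that survive the filter.
    """
    kept = [
        (truth, pred)
        for truth, pred, conf in zip(y_true, y_pred, confidence)
        if conf >= threshold
    ]
    coverage = len(kept) / len(y_true)
    kept_true = [truth for truth, _ in kept]
    kept_pred = [pred for _, pred in kept]
    return coverage, kept_true, kept_pred


# Toy data (hypothetical, for illustration only).
y_true = ["Liver", "Liver", "Kidney", "Kidney", "Prostate", "Prostate"]
y_pred = ["Liver", "Kidney", "Kidney", "Kidney", "Prostate", "Liver"]
conf = [0.95, 0.40, 0.90, 0.85, 0.99, 0.55]

overall = balanced_accuracy(y_true, y_pred)
coverage, kept_true, kept_pred = high_confidence_subset(
    y_true, y_pred, conf, threshold=0.8  # hypothetical cutoff
)
print(f"balanced accuracy, all slides: {overall:.2f}")
print(f"coverage at threshold 0.8: {coverage:.2f}")
print(f"balanced accuracy, kept slides: {balanced_accuracy(kept_true, kept_pred):.2f}")
```

Because each class contributes equally regardless of how many slides it has, balanced accuracy is less flattering than plain accuracy on imbalanced cohorts. Raising the threshold trades coverage for accuracy, which is the trade-off the abstract reports for the 14-class cohort.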

Competing Interest Statement

All authors except M.P. are affiliated with PAICON GmbH. M.P. declares no competing interests.

Funding Statement

No external funding was received. The work was conducted as part of regular employment at PAICON GmbH.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

This study is a retrospective analysis of de-identified whole-slide images and associated metadata from publicly available datasets. No new patient data were collected and no patient contact occurred. Ethics approval and informed consent were not required for this study. The TCGA slides are available at https://portal.gdc.cancer.gov. The CPTAC slides are available at https://www.cancerimagingarchive.net. The PAIP slides are available at https://www.wisepaip.org. The VML slides are available at https://wirtualnymikroskop.mostwiedzy.pl.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

The TCGA slides are available at https://portal.gdc.cancer.gov. The CPTAC slides are available at https://www.cancerimagingarchive.net. The PAIP slides are available at https://www.wisepaip.org. The VML slides are available at https://wirtualnymikroskop.mostwiedzy.pl.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted February 04, 2026.

Subject Area

  • Pathology