Transformer-Based Tissue Classification from Colorectal Cancer Pathology Slides

Authors

Keywords:

Colorectal cancer, transformer, tissue classification, deep learning, histopathology, MSI

Abstract

Colorectal cancer (CRC) remains one of the leading causes of cancer-related mortality worldwide. Biomarkers such as microsatellite instability (MSI) play a pivotal role in guiding treatment decisions, particularly for immunotherapy. However, conventional MSI testing methods, including PCR and sequencing, are often time-consuming and costly. To address this, we propose a transformer-based deep learning model for histopathological image analysis that classifies tissue patches into nine categories as a step toward MSI biomarker prediction. The model was trained on a publicly available CRC dataset from Zenodo whose annotated tissue patches fall into nine classes: ADI, BACK, DEB, LYM, MUC, MUS, NORM, STR, and TUM. Patch features of shape 7×7×1024 were extracted using pretrained embeddings. A transformer encoder followed by fully connected layers was implemented in PyTorch, trained with cross-entropy loss, and optimized with Adam. Performance was evaluated using accuracy and confusion matrices. The model achieved an overall classification accuracy of 96% on the test set, with high precision and recall for key classes such as TUM (tumor) and LYM (lymphocytes). Most misclassifications occurred between STR and DEB, which are morphologically similar. Compared to conventional CNN-based approaches, the transformer model demonstrated superior generalization and interpretability, benefiting from its ability to model global dependencies through self-attention. This study highlights the potential of transformer architectures for accurate and scalable tissue classification in digital pathology, and the results support their use as a foundation for future biomarker prediction tasks, including MSI detection. Future work will extend this framework to direct biomarker inference and validate its performance on larger, multi-institutional datasets.
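The pipeline described in the abstract (7×7×1024 patch features, a transformer encoder, fully connected layers, cross-entropy loss, Adam) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the class name `PatchTransformerClassifier`, the helper `train_step`, the projection width, number of heads and layers, the learned positional embedding, the mean pooling, and the learning rate are all assumptions chosen for illustration, since the abstract does not report them.

```python
import torch
import torch.nn as nn

# The nine tissue classes named in the abstract.
CLASSES = ["ADI", "BACK", "DEB", "LYM", "MUC", "MUS", "NORM", "STR", "TUM"]


class PatchTransformerClassifier(nn.Module):
    """Hypothetical sketch: transformer encoder over a 7x7 grid of 1024-d features."""

    def __init__(self, feat_dim=1024, n_tokens=49, d_model=256,
                 n_heads=4, n_layers=2, n_classes=len(CLASSES)):
        super().__init__()
        # Assumed: project the 1024-d features down to a smaller model width.
        self.proj = nn.Linear(feat_dim, d_model)
        # Assumed: learned positional embedding, one vector per grid cell.
        self.pos = nn.Parameter(torch.zeros(1, n_tokens, d_model))
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=512,
            dropout=0.1, batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # Fully connected head producing one logit per tissue class.
        self.head = nn.Sequential(
            nn.Linear(d_model, 128), nn.ReLU(), nn.Linear(128, n_classes)
        )

    def forward(self, feats):
        # feats: (batch, 7, 7, 1024) -> tokens: (batch, 49, 1024)
        tokens = feats.reshape(feats.shape[0], -1, feats.shape[-1])
        tokens = self.proj(tokens) + self.pos
        encoded = self.encoder(tokens)          # self-attention across all 49 cells
        return self.head(encoded.mean(dim=1))   # assumed: mean-pool, then classify


def train_step(model, optimizer, feats, labels):
    """Run one optimization step with cross-entropy loss; return the loss value."""
    model.train()
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(feats), labels)
    loss.backward()
    optimizer.step()
    return loss.item()


# Smoke test on random tensors standing in for real patch features.
model = PatchTransformerClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # assumed learning rate
feats = torch.randn(4, 7, 7, 1024)
labels = torch.randint(0, len(CLASSES), (4,))
loss_value = train_step(model, optimizer, feats, labels)
logits = model(feats)  # shape: (4, 9)
```

Treating each of the 49 grid cells as a token is one plausible way to let self-attention relate distant regions of a patch, which is the global-dependency property the abstract credits for the model's behavior.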

Published

09/09/2025

Section

9. ISSC Proceedings Book