Exploring Multi-Scale Local and Global Features in Whole Slide Images Using State Space Models
Jiang, C.; Zhao, Z.; Liang, P.; Shi, M.; Han, J.; Tzeng, N.-F.; Xiao, G.; Chen, D. Z.; Zheng, H.
Abstract: Whole slide image (WSI) classification is crucial in computational pathology, yet the gigapixel scale of WSIs makes it challenging to extract discriminative and compact WSI-level features for disease diagnosis. In this paper, we propose MambaWSI, a novel method that leverages the state space model (SSM) for WSI classification by exploring multi-scale local and global features. Unlike existing approaches that sequentially traverse WSI tiles and rely on vanilla SSMs for long-range dependency modeling, we exploit a traversal strategy in a higher-dimensional discrete space that preserves spatial proximity, enabling a first-local-then-global feature extraction process. Furthermore, to align with the clinical workflow of pathologists when examining WSIs at multiple scales, we propose a two-stage hierarchical fusion strategy: inter-scale feature alignment and aggregation, followed by attention-based fusion across magnifications, integrating complementary information from multiple magnifications. Experiments on two datasets demonstrate that MambaWSI outperforms state-of-the-art methods in classification performance.
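To illustrate what a proximity-preserving traversal of WSI tiles can look like, the following is a minimal sketch that orders tile grid coordinates along a Hilbert space-filling curve. The abstract does not name the curve MambaWSI uses, so the Hilbert curve here is an assumption chosen only as a well-known example of a locality-preserving ordering; the function names `hilbert_index` and `order_tiles` and the power-of-two grid size are likewise hypothetical.

```python
def hilbert_index(n, x, y):
    """Map grid cell (x, y) to its position along a Hilbert curve.

    n is the grid side length and must be a power of two.
    Assumption: the Hilbert curve stands in for the paper's traversal,
    which the abstract does not specify.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def order_tiles(coords, n):
    """Sort tile coordinates so that consecutive tiles are spatial neighbors."""
    return sorted(coords, key=lambda c: hilbert_index(n, c[0], c[1]))


# A 4x4 grid of tiles listed in row-major (raster) order.
grid = [(x, y) for y in range(4) for x in range(4)]
ordered = order_tiles(grid, 4)
print(ordered)
```

In a raster scan, the last tile of one row and the first tile of the next are far apart in the slide yet adjacent in the sequence. Along a curve like this one, every pair of consecutive tiles shares an edge, so a sequence model sees local neighborhoods first and broader context as the sequence grows, which is one way to read the "first-local-then-global" idea.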