Cross-Modal Training Using Xenium Spatial Transcriptomics Enables DINO-DETR Based Detection of Vascular Niches in H&E Whole-Slide Images
S, P.; Alugam, R.; Gupta, S.; Shah, N.; Uppin, M. S.
Abstract

Background: Tumor vasculature is a key driver of glioma progression, yet routine quantification depends on subjective histopathologic assessment or resource-intensive ancillary immunohistochemistry. A scalable, objective method for vascular phenotyping from routine histology remains an unmet need.

Methods: We used 10x Genomics Xenium spatial transcriptomics data from a glioblastoma (GBM) specimen to generate molecularly resolved annotations of GBM-associated endothelial cells and pericytes across 809,041 cells. These annotations were transferred to matched H&E-stained sections to train a DINO-DETR-based object detection model using a binary classification scheme (vascular vs. other). The model was validated on four independent Xenium patient slides and applied to a retrospective cohort of 119 diffuse gliomas spanning WHO grades 2-4 (oligodendroglioma, astrocytoma, and glioblastoma) with linked survival data.

Results: Binary vascular cell detection achieved a precision of 0.78, a recall of 0.63, and an F1 score of 0.70, with an overall accuracy of 98.6%. Orthogonal spatial validation confirmed that predicted vascular cells were preferentially localized within annotated blood vessel regions. In subtype-stratified survival analysis, a high AI-derived vascular cell proportion was significantly associated with worse overall survival in astrocytoma patients (log-rank p < 0.019).

Conclusion: Cross-modal AI training using spatial transcriptomics enables scalable, molecularly informed vascular quantification directly from routine H&E slides. Within the astrocytoma subtype, where tumor grade is most heterogeneous and vascular phenotype most variable, objective vascular quantification provides independent prognostic information, demonstrating the potential of spatially supervised deep learning to extract clinically meaningful microenvironmental signals from universally available histologic material.
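As a quick consistency check on the reported detection metrics, the F1 score can be recomputed from the stated precision and recall as their harmonic mean. The sketch below is illustrative only: the function name `f1_from_precision_recall` is hypothetical and is not taken from the authors' code, and the inputs are simply the two values quoted in the Results.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        # Avoid division by zero when the detector finds no true positives.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract for binary vascular cell detection.
reported_precision = 0.78
reported_recall = 0.63

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}")  # prints "F1 = 0.70"
```

The recomputed value agrees with the reported F1 of 0.70. Overall accuracy (98.6%) is much higher than F1 because accuracy also counts correctly classified non-vascular cells; this pattern is typical when the positive class is a small minority, although the class proportions are not stated in the abstract.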