A multimodal framework for immune cell annotation across blood and tissue single-cell data
Tomar, S. S.; Haas, J. T.; Staels, B.; Dombrowicz, D.; Sondergaard, J. N.
Abstract
Accurate immune cell annotation remains a central challenge in single-cell analysis, particularly when datasets contain rare populations, closely related subtypes, transitional states, and tissue-adapted immune phenotypes. We previously developed scODIN as an expert-guided framework for immune cell annotation in single-cell RNA-sequencing data. Here, we present pyODIN, a major extension of this framework for the Python ecosystem. Compared to scODIN, pyODIN supports multimodal annotation using RNA, antibody-derived tag (ADT), or combined RNA-ADT information. It also substantially expands the underlying annotation database from a CD4 T-cell-centered framework to a broader immune reference resource spanning major and minor peripheral blood mononuclear cell populations as well as tissue-associated immune subsets. To evaluate performance, we benchmarked pyODIN against CellTypist and HiCAT using two independently curated immune reference datasets. pyODIN achieved the highest overall classification accuracy on the Allen dataset and matched the top-performing method on Azimuth. At the top-lineage level, pyODIN reduced broad cross-lineage errors relative to the benchmark methods. At finer resolution, pyODIN preserved substantially greater CD4 T-cell subtype structure than CellTypist and HiCAT, resolving regulatory, follicular, helper, memory, and cytotoxic states that were largely collapsed into broader categories by the comparator methods. In a controlled marker-ablation experiment, addition of ADT information restored NK and CD8 T-cell annotation when key lineage-defining RNA markers were removed from the expression feature set, demonstrating the value of multimodal annotation when transcript-level evidence is incomplete. Finally, application to liver fine-needle aspirate data showed that the expanded framework can support both broad and fine-grained annotation beyond PBMC datasets.
Together, these results show that pyODIN provides an expert-guided and adaptable immune annotation framework for Python, designed for multimodal immune phenotyping across blood and tissue single-cell datasets.