Optimized Detection and Inference of Names in scRNA-seq data
Tulyeu, J.; Priest, D.; Wing, J. B.; Sondergaard, J. N.
Abstract
Accurate identification of immune cell subsets in single-cell (sc)RNA-seq data is critical for understanding immune responses in autoimmune diseases, infections, and cancer. One limitation of scRNA-seq is that rare immune cell subsets often cannot be assigned correctly because of gene dropout events. To address this limitation, we developed Optimized Detection and Inference of Names in scRNA-seq data (scODIN). scODIN uses an informed, holistic two-step approach that combines expert knowledge with machine learning to rapidly assign cell type identities in large scRNA-seq datasets. First, scODIN uses key lineage-defining markers to identify a set of core cell types. Second, scODIN compensates for dropout events by integrating a k-nearest neighbors (kNN) algorithm. We additionally programmed scODIN to detect dual and transitional phenotypes, which are usually overlooked in conventional analyses. Consequently, scODIN may enhance our understanding of immune cell heterogeneity and provide comprehensive insights into immune regulation, with broad implications for immunology and personalized medicine.
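The two-step idea described above can be illustrated with a minimal sketch. This is not the scODIN implementation: the function names, the marker panel, the toy expression matrix, the "all markers detected" rule, the Euclidean distance, and the value of k are all assumptions chosen only to show how marker-based labelling followed by a kNN vote could recover a cell that lost a marker to dropout.

```python
import numpy as np
from collections import Counter

def assign_by_markers(expr, genes, markers):
    """Step 1 (hypothetical rule): label a cell when all markers of a type are detected."""
    gene_idx = {g: i for i, g in enumerate(genes)}
    labels = []
    for cell in expr:
        hits = [t for t, ms in markers.items()
                if all(cell[gene_idx[m]] > 0 for m in ms)]
        if len(hits) == 1:
            labels.append(hits[0])
        elif len(hits) > 1:
            # Cell satisfies more than one panel: flag as a dual-phenotype candidate.
            labels.append("+".join(sorted(hits)))
        else:
            # No complete panel detected, possibly because of dropout.
            labels.append(None)
    return labels

def fill_by_knn(expr, labels, k=3):
    """Step 2: give each unlabelled cell the majority label of its k nearest labelled cells."""
    labelled = [i for i, lab in enumerate(labels) if lab is not None]
    out = list(labels)
    if not labelled:
        return out
    ref = expr[labelled]
    for i, lab in enumerate(labels):
        if lab is None:
            dist = np.linalg.norm(ref - expr[i], axis=1)  # Euclidean distance (assumption)
            nearest = np.argsort(dist)[:k]
            votes = Counter(labels[labelled[j]] for j in nearest)
            out[i] = votes.most_common(1)[0][0]
    return out

# Toy data (invented for illustration): rows are cells, columns are genes.
genes = ["CD3E", "CD4", "CD19", "MS4A1", "IL7R"]
markers = {"CD4 T": ["CD3E", "CD4"], "B": ["CD19", "MS4A1"]}
expr = np.array([
    [5, 3, 0, 0, 4],   # CD4 T cell
    [4, 2, 0, 0, 5],   # CD4 T cell
    [0, 0, 6, 4, 0],   # B cell
    [0, 0, 5, 5, 0],   # B cell
    [5, 0, 0, 0, 4],   # T-like cell with CD4 dropout
], dtype=float)

step1 = assign_by_markers(expr, genes, markers)
step2 = fill_by_knn(expr, step1, k=3)
print(step1)
print(step2)
```

In this toy example the last cell fails the marker rule because CD4 is not detected, and the kNN vote over its nearest labelled neighbours supplies a label instead. A real pipeline would typically compute neighbours in a normalized, dimension-reduced space rather than on raw counts.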