scMusketeers: Addressing imbalanced cell type annotation and batch effect reduction with a modular autoencoder

This paper is a preprint and has not been certified by peer review.

Authors

Collin, A.; Pelletier, S. J.; Fierville, M.; Droit, A.; Precioso, F.; Becavin, C.; Barbry, P.

Abstract

The increasing number of single-cell gene expression atlases represents a potential revolution in understanding physio-pathological processes. To fully leverage this single-cell revolution, we need to enhance data integration and cell annotation strategies, with particular emphasis on the challenges posed by imbalanced cell type proportions and substantial batch effects. scMusketeers, a deep learning model, optimizes the latent data representation and addresses these challenges simultaneously. scMusketeers features three neural modules: (1) an autoencoder for noise and dimensionality reduction; (2) a focal loss classifier to enhance rare cell type predictions; and (3) an adversarial domain adaptation (DANN) module for batch effect correction. Benchmarking against state-of-the-art tools, including the UCE foundation model, showed that scMusketeers performs on par or better, particularly in identifying rare cell types. It also enables the transfer of cell labels from single-cell RNA sequencing to spatial transcriptomics. With its modular and adaptable design, scMusketeers offers a versatile framework that can be generalized to other large-scale biological projects requiring deep learning approaches, establishing itself as a valuable tool for single-cell data integration and analysis.
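
To illustrate why a focal loss classifier can help with rare cell types, the following is a minimal NumPy sketch of the standard multi-class focal loss, which scales each cell's cross-entropy term by (1 - p)^gamma so that confidently classified cells contribute less. It is not taken from the scMusketeers code base: the function names, the value gamma = 2.0, and the toy logits are assumptions chosen for illustration, and the abstract does not state which focal loss variant or hyperparameters the authors use.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def focal_loss(logits, labels, gamma=2.0, eps=1e-12):
    """Mean multi-class focal loss (illustrative, hypothetical helper).

    logits: array of shape (n_cells, n_cell_types)
    labels: integer cell type index per cell, shape (n_cells,)
    gamma:  focusing parameter (assumed value); gamma = 0 gives cross-entropy
    """
    probs = softmax(logits)
    # Probability assigned to the true cell type of each cell.
    p_true = probs[np.arange(len(labels)), labels]
    # Well-classified cells (p_true near 1) are down-weighted by (1 - p_true)^gamma.
    per_cell = -((1.0 - p_true) ** gamma) * np.log(p_true + eps)
    return float(per_cell.mean())


# Toy example: one confidently classified cell and one ambiguous cell.
toy_logits = np.array([[4.0, 0.0, 0.0],
                       [0.2, 0.1, 0.0]])
toy_labels = np.array([0, 2])

cross_entropy_value = focal_loss(toy_logits, toy_labels, gamma=0.0)
focal_value = focal_loss(toy_logits, toy_labels, gamma=2.0)
```

With gamma set to 0 the function reduces to ordinary cross-entropy, which makes it easy to compare the two losses on the same batch. In the toy batch the ambiguous cell dominates the focal loss, which is the behavior that is generally expected to benefit under-represented classes.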
