Science Cast

Random Matrix Theory-guided sparse PCA for single-cell RNA-seq data

librarian, February 28, 2026 9:01pm


This paper is a preprint and has not been certified by peer review.

bioRxiv, February 28, 2026 12:00am

Authors

Chardes, V.

Abstract

Single-cell RNA-seq provides detailed molecular snapshots of individual cells but is notoriously noisy. Variability stems from biological differences and technical factors, such as amplification bias and limited RNA capture efficiency, making it challenging to adapt computational pipelines to heterogeneous datasets or evolving technologies. As a result, most studies still rely on principal component analysis (PCA) for dimensionality reduction, valued for its interpretability and robustness, in spite of its known bias in high dimensions. Here, we improve upon PCA with a Random Matrix Theory (RMT)-based approach that guides the inference of sparse principal components using existing sparse PCA algorithms. We first introduce a novel biwhitening algorithm which self-consistently estimates the magnitude of transcriptomic noise affecting each gene in individual cells, without assuming a specific noise distribution. This enables the use of an RMT-based criterion to automatically select the sparsity level, rendering sparse PCA nearly parameter-free. Our mathematically grounded approach retains the interpretability of PCA while enabling robust, hands-off inference of sparse principal components. Across seven single-cell RNA-seq technologies and four sparse PCA algorithms, we show that this method systematically improves the reconstruction of the principal subspace and consistently outperforms PCA-, autoencoder-, and diffusion-based methods in cell-type classification tasks.
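
For readers unfamiliar with how Random Matrix Theory can separate signal from noise in PCA, the following is a minimal sketch of one classical criterion: count the sample covariance eigenvalues that exceed the Marchenko-Pastur upper edge. It is not the biwhitening or sparsity-selection procedure from the paper, whose details the abstract does not give. The function names (`mp_upper_edge`, `count_signal_components`) and the assumption that the noise has already been rescaled to a known variance `sigma2` are hypothetical, chosen only for illustration.

```python
import numpy as np


def mp_upper_edge(n_cells, n_genes, sigma2=1.0):
    """Marchenko-Pastur upper edge for i.i.d. noise of variance sigma2.

    For an n_cells x n_genes noise matrix, the eigenvalues of the sample
    covariance concentrate below sigma2 * (1 + sqrt(n_genes / n_cells))**2
    as both dimensions grow.
    """
    gamma = n_genes / n_cells
    return sigma2 * (1.0 + np.sqrt(gamma)) ** 2


def count_signal_components(X, sigma2=1.0):
    """Count covariance eigenvalues above the Marchenko-Pastur edge.

    X is a (cells x genes) matrix whose noise is ASSUMED to have been
    rescaled to variance sigma2 (an assumption of this sketch, not a
    description of the paper's biwhitening step).
    """
    n_cells, n_genes = X.shape
    Xc = X - X.mean(axis=0, keepdims=True)      # center each gene
    s = np.linalg.svd(Xc, compute_uv=False)     # singular values of centered data
    eigvals = s ** 2 / n_cells                  # eigenvalues of the sample covariance
    return int(np.sum(eigvals > mp_upper_edge(n_cells, n_genes, sigma2)))


if __name__ == "__main__":
    # Synthetic data: unit-variance noise plus three strong planted components.
    rng = np.random.default_rng(0)
    n_cells, n_genes, rank = 2000, 200, 3
    loadings, _ = np.linalg.qr(rng.standard_normal((n_genes, rank)))
    scores = rng.standard_normal((n_cells, rank)) * np.sqrt(10.0)
    X = scores @ loadings.T + rng.standard_normal((n_cells, n_genes))
    print("edge:", mp_upper_edge(n_cells, n_genes))
    print("components above edge:", count_signal_components(X))
```

Because the largest noise eigenvalue fluctuates around the edge at finite sample size, a hard threshold like this can occasionally count one spurious component. Practical pipelines often add a small margin or a significance test on top of it.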
