Uncertainty-aware synthetic lethality prediction with pretrained foundation models
Hua, K.; Haber, E.; Ma, J.
Abstract
Synthetic lethality (SL) offers a promising paradigm for targeted cancer therapy, yet experimental identification of SL gene pairs remains costly, context-dependent, and biased toward well-studied genes. Existing computational approaches often rely on curated protein-protein interaction (PPI) networks and Gene Ontology (GO) annotations, which limit their ability to generalize to novel genes. Here we introduce CILANTRO-SL, a two-stage, graph-free framework that leverages pretrained biological foundation models to predict SL pairs with calibrated uncertainty. In Stage 1, we apply a pretrained single-cell foundation model to bulk RNA-seq profiles of cancer cell lines to obtain context-aware embeddings and perform in silico gene knockouts to generate delta embeddings. These perturbation signals are further conditioned on a data-driven gene prior and supervised with CRISPR viability readouts to learn knockout-aware viability embeddings. In Stage 2, we derive pairwise features from these embeddings and train a lightweight classifier to distinguish SL from non-SL pairs. To enable reliable experimental prioritization, CILANTRO-SL incorporates conformal prediction, producing calibrated and interpretable prediction sets that highlight high-confidence SL candidates. Across two evaluation settings, including zero-shot generalization to unseen gene pairs and to unseen genes, ablation analyses show that viability pretraining and the gene prior substantially improve performance while avoiding reliance on PPI and GO features. CILANTRO-SL therefore transforms pretrained biological representations into practical, uncertainty-aware hypotheses that support robust and scalable discovery of therapeutic targets.
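To make the conformal prediction step concrete, the following is a minimal sketch of split conformal prediction for a binary SL / non-SL classifier. The abstract does not say which conformal variant or nonconformity score CILANTRO-SL uses, so everything here is an assumption for illustration: the score (one minus the predicted probability of the candidate label), the miscoverage level `alpha`, the toy calibration data, and the hypothetical helper names `conformal_threshold` and `prediction_set`.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split-conformal threshold from a held-out calibration set.

    cal_probs[i]  : classifier's predicted P(SL) for calibration pair i
    cal_labels[i] : true label, 1 for SL and 0 for non-SL
    alpha         : target miscoverage rate (assumed value, not from the paper)
    """
    # Nonconformity score = 1 - probability assigned to the true label.
    scores = sorted(
        (1.0 - p) if y == 1 else p
        for p, y in zip(cal_probs, cal_labels)
    )
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        # Too few calibration points for this alpha: include every label.
        return float("inf")
    return scores[k - 1]


def prediction_set(p_sl, qhat):
    """Return every label whose nonconformity score is within the threshold."""
    labels = []
    if p_sl <= qhat:          # score for "non-SL" is 1 - P(non-SL) = P(SL)
        labels.append("non-SL")
    if 1.0 - p_sl <= qhat:    # score for "SL" is 1 - P(SL)
        labels.append("SL")
    return labels


# Toy calibration data (hypothetical values, for illustration only).
example_probs = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1, 0.6, 0.4, 0.95]
example_labels = [1, 1, 1, 0, 0, 0, 1, 0, 1]
example_qhat = conformal_threshold(example_probs, example_labels, alpha=0.2)

for p in (0.9, 0.5, 0.05):
    print(p, prediction_set(p, example_qhat))
```

Under this scheme, a gene pair whose prediction set is exactly `["SL"]` would be a natural high-confidence candidate for experimental follow-up, while a set containing both labels (or neither) signals that the classifier is uncertain about that pair at the chosen `alpha`.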