PepCABO: Latent-space Bayesian optimization for peptide-MHC binding using contrastive alignment

This paper is a preprint and has not been certified by peer review.

Authors

Ghane, M.; Korpela, D.; Dumitrescu, A.; Lähdesmäki, H.

Abstract

Motivation: Optimizing peptide sequences for binding to specific MHC class I alleles is a central challenge in immunotherapy and vaccine design. The combinatorial size of peptide space, the nonlinear nature of peptide-MHC interactions, and limited experimental budgets make efficient optimization difficult. Latent-space Bayesian optimization (LSBO) addresses this by embedding discrete sequences into a continuous space where Bayesian optimization can be applied. However, existing LSBO methods do not effectively leverage binding data from related alleles and often rely on inefficient random initialization.

Results: We propose PepCABO, an LSBO framework for peptide-MHC binding based on contrastive alignment. It uses a dual variational autoencoder that jointly learns peptide-allele alignment and a Gaussian process surrogate before Bayesian optimization begins. This joint training induces a latent geometry that reflects the binding landscape and enables structured knowledge transfer across alleles. In the pretrained latent space, peptides with high objective values for a specific MHC allele are geometrically organized, while the jointly trained Gaussian process defines an informative prior over the objective, enabling principled and efficient exploration of promising regions during subsequent optimization. Across 12 target alleles without prior binding data, and under both low- and high-budget settings, PepCABO consistently outperforms the compared baselines. We observe faster convergence, improved area under the optimization curve, and stronger best-found binding affinities, suggesting improved sample efficiency in experimentally constrained scenarios.
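
For readers unfamiliar with LSBO, the generic loop the abstract builds on can be sketched as follows. This is a minimal illustration under stated assumptions, not the PepCABO implementation: the latent points stand in for peptides already encoded by a trained autoencoder, `toy_binding_score` is a hypothetical replacement for an experimental assay or binding predictor, the Gaussian process is zero-mean with a fixed RBF kernel (no pretrained prior, no contrastive alignment), and the acquisition function is a plain upper confidence bound. All function names are invented for this sketch.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior(z_train, y_train, z_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP surrogate."""
    k_tt = rbf_kernel(z_train, z_train) + noise * np.eye(len(z_train))
    k_tq = rbf_kernel(z_train, z_query)
    solved = np.linalg.solve(k_tt, k_tq)        # K_tt^{-1} K_tq
    mean = solved.T @ y_train
    var = 1.0 - np.sum(k_tq * solved, axis=0)   # RBF prior variance is 1
    return mean, np.sqrt(np.clip(var, 1e-12, None))


def toy_binding_score(z):
    """Hypothetical objective (assumption): a single peak at latent point (1, 1)."""
    return np.exp(-((z - 1.0) ** 2).sum(axis=-1))


def latent_bo(candidates, n_init=3, budget=15, beta=2.0, seed=0):
    """Run UCB-based Bayesian optimization over a fixed pool of latent points.

    Returns the best-so-far objective value after initialization and after
    each of the `budget` queries.
    """
    rng = np.random.default_rng(seed)
    # Random initialization, the baseline strategy the abstract calls inefficient.
    chosen = [int(i) for i in rng.choice(len(candidates), size=n_init, replace=False)]
    scores = [float(toy_binding_score(candidates[i])) for i in chosen]
    best_curve = [max(scores)]
    for _ in range(budget):
        mean, std = gp_posterior(candidates[chosen], np.array(scores), candidates)
        ucb = mean + beta * std        # trade off exploitation and exploration
        ucb[chosen] = -np.inf          # never re-query an evaluated point
        nxt = int(np.argmax(ucb))
        chosen.append(nxt)
        scores.append(float(toy_binding_score(candidates[nxt])))
        best_curve.append(max(scores))
    return np.array(best_curve)


if __name__ == "__main__":
    # A 9 x 9 grid of 2-D latent points standing in for encoded peptides.
    axis = np.linspace(-2.0, 2.0, 9)
    pool = np.array([[x, y] for x in axis for y in axis])
    curve = latent_bo(pool)
    print("best-so-far curve:", np.round(curve, 3))
    print("mean of curve (a simple normalized area proxy):", round(float(curve.mean()), 3))
```

The best-so-far curve returned by `latent_bo` is the kind of optimization curve whose area and final value the abstract reports. In this sketch the surrogate starts uninformed and the initial points are random, which are the two aspects the abstract says PepCABO replaces with a jointly pretrained Gaussian process and a structured latent space.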
