A Conditional Variational Autoencoder with QSAR-Guided Surrogate-Weighted Fine-Tuning and Cross-Entropy Optimization for Targeted Antimicrobial Peptide Generation

This paper is a preprint and has not been certified by peer review.

Authors

Castanon, I.; Wan, F.; de la Fuente, C.; Pini, A.; Falciani, C.

Abstract

Machine learning frameworks have emerged as a promising tool for antimicrobial peptide design; however, generative models remain limited by two persistent problems: the scarcity of experimentally validated peptides and the circular dependency of the models. In this work, we present a conditional variational autoencoder pipeline that addresses both limitations through a modular architecture that combines binary and quantitative experimental data and uses a multimodal approach to guide generation externally. A transformer-based encoder produced a discriminative 64-dimensional latent space (test AUROC 0.968, F1 0.919) that separates antimicrobial from non-antimicrobial sequences. This latent representation conditions a species-specific, LoRA fine-tuned ProtGPT2 decoder through a scalar gating function. The decoder generates balanced antimicrobial peptides in two modes, prior and perturb, which differ in their generation starting points. We introduced a Surrogate-Weighted Fine-Tuning (SWF) ensemble to eliminate the circular dependency and a Cross-Entropy Method to explore and exploit the latent space, leading to successful antimicrobial peptide generation. The best candidates exhibited competitive physicochemical characteristics, a mean helical fraction of 0.874 (mean pLDDT 83.7), and externally predicted efficacy evaluated by APEX.
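
For readers unfamiliar with the Cross-Entropy Method mentioned in the abstract, the following is a minimal, generic sketch of how it can search a 64-dimensional latent space. It is not the authors' implementation. The diagonal Gaussian sampling distribution, the hyperparameters, and the names `cross_entropy_search` and `toy_score` are assumptions made for illustration. In the pipeline described above, the score would presumably come from the surrogate ensemble applied to decoded peptides; here a toy quadratic objective stands in for it.

```python
import random
import statistics


def cross_entropy_search(score_fn, dim=64, n_samples=200, elite_frac=0.1,
                         n_iters=30, seed=0):
    """Generic Cross-Entropy Method: maximize score_fn over a dim-sized vector.

    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim   # start from a standard normal, like a VAE prior
    std = [1.0] * dim
    n_elite = max(2, int(n_samples * elite_frac))

    for _ in range(n_iters):
        # Explore: sample candidate latent vectors from the current Gaussian.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(n_samples)]
        # Exploit: keep the highest-scoring candidates (the elite set).
        samples.sort(key=score_fn, reverse=True)
        elite = samples[:n_elite]
        # Refit the Gaussian to the elite set, one dimension at a time.
        columns = list(zip(*elite))
        mean = [statistics.fmean(c) for c in columns]
        # A small floor keeps the standard deviation strictly positive.
        std = [statistics.pstdev(c) + 1e-3 for c in columns]

    return mean, std


def toy_score(z):
    """Hypothetical stand-in for a surrogate model: peaks at z_i = 0.5."""
    return -sum((v - 0.5) ** 2 for v in z)


if __name__ == "__main__":
    best_mean, best_std = cross_entropy_search(toy_score)
    print("score at prior mean:", toy_score([0.0] * 64))
    print("score at CEM mean:  ", toy_score(best_mean))
```

Each iteration samples around the current mean (exploration) and then narrows the distribution toward the best-scoring samples (exploitation), which is the trade-off the abstract refers to.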
