A Simple Generative Model for the Prediction of T-Cell Receptor - Peptide Binding in T-Cell Therapy for Cancer
Papanikolaou, A.; Sivtsov, V.; Zereik, E.; Ruggiero, E.; Bonini, C.; Bonsignorio, F.
Abstract
Objective: To develop a deep learning model capable of predicting epitope peptides recognized by specific CDR3 (Complementarity-Determining Region 3) sequences of T-cell receptors (TCRs) in the context of Major Histocompatibility Complex (MHC) molecules, addressing the challenges of incomplete datasets and the need for novel sequence generation in adoptive T-cell therapy for cancer.
Methods: We implemented a sequence-to-sequence generative model named "GRIP" (Generative Reconstruction of antIgen Peptides) using a Long Short-Term Memory (LSTM) network with attention mechanisms. The model was trained and validated on publicly available datasets, employing data balancing, label smoothing, and dynamic learning rate scheduling to enhance performance and generalization. Accuracy was assessed at the amino acid level.
Results: The model achieved a training accuracy of 97% and a test accuracy of 85% for predicting epitope sequences at the amino acid level. Probabilistic sequence generation allowed GRIP to produce biologically plausible epitope sequences, even for unseen CDR3 inputs. Attention-based interpretability provided insights into the model's focus on critical sequence elements. The model outperformed existing approaches in handling data imbalance and generalization to novel epitopes.
Conclusion: GRIP offers a novel solution to the TCR-epitope binding problem by generating potential epitope sequences instead of matching to known data, addressing a fundamental gap in existing models. This approach has significant implications for personalized immunotherapy, facilitating the design of targeted T-cell therapies for cancer.
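To make two of the quantities named in the Methods concrete, the following is a minimal Python sketch of an amino acid-level accuracy and of a label-smoothed cross-entropy term. It is not the GRIP implementation. The function names, the smoothing value `epsilon=0.1`, the 20-letter vocabulary without special tokens, and the convention that positions missing from the shorter sequence count as errors are all assumptions made for illustration; the abstract does not specify them. The peptide strings in the usage note are illustrative inputs, not results from the paper.

```python
import math

# Assumption: vocabulary is the 20 standard amino acids, no padding or end tokens.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_accuracy(predicted, reference):
    """Fraction of positions at which predicted and reference residues match.

    Assumption: positions missing from the shorter sequence count as errors,
    so the denominator is the length of the longer sequence.
    """
    length = max(len(predicted), len(reference))
    if length == 0:
        return 1.0
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / length


def smoothed_target(residue, epsilon=0.1):
    """Label-smoothed target distribution for one position.

    The true residue gets (1 - epsilon) + epsilon / K and every other
    residue gets epsilon / K, where K is the vocabulary size.
    """
    k = len(AMINO_ACIDS)
    return {
        aa: (1.0 - epsilon) * (aa == residue) + epsilon / k
        for aa in AMINO_ACIDS
    }


def smoothed_cross_entropy(probs, residue, epsilon=0.1):
    """Cross-entropy between the smoothed target and predicted probabilities.

    `probs` maps each amino acid to a strictly positive predicted probability.
    """
    target = smoothed_target(residue, epsilon)
    return -sum(target[aa] * math.log(probs[aa]) for aa in AMINO_ACIDS)
```

For example, `residue_accuracy("GILGFVFTA", "GILGFVFTL")` scores eight matching positions out of nine. With a uniform prediction over the 20 residues, `smoothed_cross_entropy` equals `math.log(20)` regardless of `epsilon`, which gives a convenient sanity check for the loss.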