Improved Graph-based Antibody-aware Epitope Prediction with Protein Language Model-based Embeddings
Ahmed, M.; Ali, S.; Jan, A.; Khan, I. U.; Patterson, M.
Abstract
A major challenge in computational antibody design is the accurate identification of the antigen binding site, i.e., the epitope. Current approaches to epitope prediction struggle because of the variable nature of epitopes and the scarcity of experimental datasets. However, deep learning-based approaches have shown great promise on the epitope prediction task in recent years. Moreover, epitope-prediction research now has great potential because of the newly released, largest-of-its-kind benchmark dataset, Antibody-specific Epitope Prediction (AsEP), which models antibody-antigen complexes as graph pairs. In this paper, we employ a graph convolutional network (GCN) coupled with protein language model (PLM)-based residue embeddings for epitope prediction on the AsEP dataset. We explore different PLM embedding methods on the epitope prediction task and show that pairing antibody-specific PLMs such as AntiBERTy with general PLMs such as ProtBERT and ESM2 for antigens improves epitope prediction performance, with an area under the ROC curve of 0.65, precision of 0.28, and recall of 0.46. The source code is available at: https://github.com/mansoor181/walle-pp.git.
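To make the idea of a GCN operating on residue embeddings concrete, the following is a minimal sketch of a single graph-convolution step in plain Python. It is not taken from the released code: the function name `gcn_layer`, the three-residue toy graph, the two-dimensional feature vectors (stand-ins for PLM embeddings, which are far larger in practice), and the weight matrix are all hypothetical, and the symmetric-normalization form shown here is the commonly used GCN propagation rule, which may differ from the layer the authors use.

```python
import math

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adj)
    # Add self-loops so each residue keeps its own embedding.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Symmetric degree normalization of the adjacency matrix.
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    a_norm = [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j]
               for j in range(n)] for i in range(n)]
    # Aggregate neighbor features for every residue.
    dim_in = len(feats[0])
    agg = [[sum(a_norm[i][k] * feats[k][d] for k in range(n))
            for d in range(dim_in)] for i in range(n)]
    # Apply the linear map followed by ReLU.
    dim_out = len(weight[0])
    return [[max(0.0, sum(agg[i][d] * weight[d][o] for d in range(dim_in)))
             for o in range(dim_out)] for i in range(n)]

# Hypothetical toy antigen graph: three residues in a chain (0-1, 1-2).
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
# Hypothetical 2-dimensional stand-ins for PLM residue embeddings.
feats = [[1.0, 0.0],
         [0.0, 1.0],
         [1.0, 1.0]]
# Hypothetical weight matrix mapping 2 input features to 1 output feature.
weight = [[1.0],
          [1.0]]

hidden = gcn_layer(adj, feats, weight)
```

In a full model, several such layers would typically be stacked and followed by a per-residue classifier that scores each antigen residue as epitope or non-epitope; those details are assumptions here rather than a description of the authors' architecture.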