Allo-PED: Leveraging protein language models and structure features for allosteric site prediction

This paper is a preprint and has not been certified by peer review.

Authors

Chen, X.; Zheng, J.; Huang, Z.; Xu, Z.; Huang, J.; Wei, Y.; Zhang, H.

Abstract

Allosteric regulation plays a pivotal role in modulating protein function, and allosteric sites represent a promising target for drug discovery. However, identifying allosteric sites remains challenging due to their structural and evolutionary diversity. Here, we present AlloPED, a novel framework that combines protein language models and machine learning to predict allosteric sites with high accuracy. AlloPED consists of two modules: AlloPED-pocket, an ensemble model leveraging physicochemical features to predict allosteric pockets; and AlloPED-site, a dilated convolutional neural network (DCNN) augmented with a comprehensive attention mechanism for residue-level prediction. AlloPED-pocket achieves state-of-the-art performance on benchmark datasets, yielding an MCC of 0.544 and an AUC of 0.920, outperforming existing methods such as AllositePro and PARS. AlloPED-site further refines predictions using high-dimensional sequence embeddings from the ProtT5 protein language model, achieving a precision of 0.601, a recall of 0.422, and a specificity of 0.661. These results highlight the effectiveness of integrating ensemble learning and deep learning for allosteric site prediction. AlloPED also identifies critical determinants of allosteric sites, including residue clustering coefficients, van der Waals volume, and hydrophobic microenvironments. In summary, this framework provides a robust tool for advancing our understanding of allosteric regulation and facilitating structure-based drug design.
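
For readers less familiar with the metrics quoted in the abstract, the sketch below shows how MCC, precision, recall, and specificity follow from binary confusion counts. It is a minimal illustration, not the authors' evaluation code: the function names and the toy labels are assumptions made for this example, and AUC is left out because it needs continuous scores rather than hard labels.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = allosteric, 0 = not)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def site_metrics(y_true, y_pred):
    """Hypothetical helper: the four label-based metrics named in the abstract."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    # Each ratio falls back to 0.0 when its denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "mcc": mcc,
    }


# Toy labels for eight residues (invented for illustration, not from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = site_metrics(y_true, y_pred)
print(metrics)
```

Because MCC uses all four confusion counts, it stays informative when non-allosteric residues or pockets far outnumber allosteric ones, which is one reason it is commonly reported alongside precision and recall.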
