Discovery and Characterization of Terpene Synthases Powered by Machine Learning

This paper is a preprint and has not been certified by peer review.

Authors

Samusevich, R.; Hebra, T.; Bushuiev, R.; Bushuiev, A.; Chatpatanasiri, R.; Kulhanek, J.; Calounova, T.; Perkovic, M.; Engst, M.; Tajovska, A.; Sivic, J.; Pluskal, T.

Abstract

Terpene synthases (TPSs) generate the scaffolds of the largest class of natural products, including several first-line medicines. The number of available protein sequences is increasing exponentially, but computational characterization of their function remains an unsolved challenge. We assembled a curated dataset of one thousand characterized TPS reactions and developed a method to devise highly accurate machine-learning models for functional annotation in a low-data regime. Our models significantly outperform existing methods for TPS detection and substrate prediction. By applying the models to large protein sequence databases, we discovered seven TPS enzymes previously undetected by state-of-the-art computational tools and experimentally confirmed their activity. Furthermore, we discovered a new TPS structural domain and distinct subtypes of previously known domains. This work demonstrates the potential of machine learning to speed up the discovery and characterization of novel TPSs.
