Predicting RNA Sequence-Structure Likelihood via Structure-Aware Deep Learning

This paper is a preprint and has not been certified by peer review.

Authors

Zhou, Y.; Pedrielli, G.; Zhang, F.; Wu, T.

Abstract

Motivation: The functions of RNA are recognized to depend heavily on its structure and sequence. A model that can accurately evaluate a design, given as an RNA sequence-structure pair, would therefore be a valuable tool for many researchers. Machine learning methods have been explored to develop such tools, with promising results. However, two key issues remain. First, the performance of machine learning models is affected by the features used to characterize RNA, and there is currently no consensus on which features are most effective for characterizing RNA sequence-structure pairs. Second, most existing machine learning methods extract features describing the entire RNA molecule. We argue that it is essential to define additional features that characterize individual nucleotides and specific sections of the RNA structure to enhance the overall efficacy of the RNA design process.

Results: We develop two deep learning models for evaluating RNA sequence-structure pairs. The first model, NU-ResNet, uses a convolutional neural network architecture that addresses the aforementioned problems by explicitly encoding RNA sequence-structure information into a 3D matrix. Building upon NU-ResNet, our second deep learning model, NUMO-ResNet, incorporates additional information derived from the characterization of RNA, specifically the 2D folding motifs. In this work, we introduce an automated method to extract these motifs from fundamental secondary structure descriptions. To assess the robustness of our models, we conduct 10-fold cross-validation. Furthermore, we evaluate the performance of both models on two independent testing datasets. Our proposed models demonstrate excellent performance across both datasets and outperform the ENTRNA approach.

Availability and Implementation: The source code and data for this research are available at https://github.com/yzhou617/NU-ResNet_and_NUMO-ResNet.
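
To make the idea of encoding a sequence-structure pair into a 3D matrix more concrete, the following is a minimal sketch. It is not the encoding used by NU-ResNet, which the abstract does not specify. It assumes the secondary structure is given in dot-bracket notation without pseudoknots, and the function names `parse_dot_bracket` and `encode_pair`, as well as the 8-channel layout, are hypothetical choices made for illustration only.

```python
NUCLEOTIDES = "ACGU"  # assumed alphabet; other symbols raise ValueError


def parse_dot_bracket(structure):
    """Return the sorted list of (i, j) base pairs in a dot-bracket string."""
    stack, pairs = [], []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.append((stack.pop(), i))
        elif symbol != ".":
            raise ValueError(f"unsupported symbol: {symbol!r}")
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return sorted(pairs)


def encode_pair(sequence, structure):
    """Encode a sequence-structure pair as an L x L x 8 nested list.

    Hypothetical layout (an assumption, not the paper's scheme):
      channels 0-3: one-hot nucleotide at [i][i] if position i is unpaired
      channels 4-7: one-hot nucleotide at [i][j] if position i pairs with j
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    n = len(sequence)
    tensor = [[[0.0] * 8 for _ in range(n)] for _ in range(n)]

    # Map each paired position to its partner.
    partner = {}
    for i, j in parse_dot_bracket(structure):
        partner[i] = j
        partner[j] = i

    for i, nucleotide in enumerate(sequence):
        k = NUCLEOTIDES.index(nucleotide)
        if i in partner:
            tensor[i][partner[i]][4 + k] = 1.0
        else:
            tensor[i][i][k] = 1.0
    return tensor


if __name__ == "__main__":
    encoded = encode_pair("GGGAAACCC", "(((...)))")
    print(len(encoded), len(encoded[0]), len(encoded[0][0]))
```

In this sketch each nucleotide contributes exactly one nonzero entry, so per-nucleotide and per-pair information stays spatially localized, which is the kind of input a convolutional network can exploit. A real model would likely use richer channels than this illustration.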
