Improved BioCyc Operon Prediction: Revisiting the Operon Prediction Problem

This paper is a preprint and has not been certified by peer review.

Authors

Midford, P. E.; Cadigan, J.; Karp, P. D.

Abstract

Introduction: Operon prediction is a valuable component of microbial-genome annotation because operon organization can yield inferences about gene function and because knowledge of operon structure can aid the interpretation of gene expression data. Methods: We present several improvements to the existing Pathway Tools operon predictor, based mostly on 7 new features that we hypothesized would increase its performance. The new features include shared Gene Ontology biological process terms, similarity of codon usage and GC content, correlated gene expression, and membership in a shared protein complex. Results: We evaluated the 7 proposed features and found that adding 6 of them improved the accuracy of the operon predictor from 79.55% to 83.49%, a 19.3% decrease in error rate. When gene expression data were not included, the accuracy was 82.547%, still a 14.7% decrease in error rate relative to the original predictor. One of the proposed features, as well as one previously used feature, had no effect, and both were removed. Discussion: Although some of the new features had strong predictive value individually, they did not have a large impact on predictive accuracy when combined with the other features, suggesting that they were not independent of the other features.
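
The error-rate percentages in the abstract can be reproduced from the reported accuracies. The short sketch below assumes that "error rate" means 100% minus accuracy and that the decrease is measured relative to the original predictor's error rate; the function name `relative_error_reduction` is illustrative and does not come from the paper or from Pathway Tools.

```python
def relative_error_reduction(baseline_acc_pct: float, improved_acc_pct: float) -> float:
    """Percent decrease in error rate when accuracy rises from baseline to improved.

    Assumption: error rate = 100 - accuracy (both in percent).
    """
    baseline_error = 100.0 - baseline_acc_pct
    improved_error = 100.0 - improved_acc_pct
    return 100.0 * (baseline_error - improved_error) / baseline_error


# Accuracies reported in the abstract.
with_expression = relative_error_reduction(79.55, 83.49)
without_expression = relative_error_reduction(79.55, 82.547)

print(round(with_expression, 1))     # 19.3
print(round(without_expression, 1))  # 14.7
```

Under these assumptions, a 3.94-point gain in accuracy corresponds to removing roughly one fifth of the original predictor's errors, which is why the relative figure is much larger than the absolute one.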
