The Naive Bayes Classifier++ for Metagenomic Taxonomic Classification -- Query Evaluation

This paper is a preprint and has not been certified by peer review.

Authors

Duan, H. N.; Hearne, G.; Polikar, R.; Rosen, G. L.

Abstract

This study examines the query performance of the NBC++ (Incremental Naive Bayes Classifier) program across variations in canonicality, $k$-mer size, databases, and input sample data size. NBC++ can successfully assess a wide range of superkingdoms using a small training database. We demonstrate that both NBC++ and Kraken2 are affected by database depth, with macro measures increasing with depth, but that the full diversity of life, especially viruses, is still a challenge for these classifiers. NBC++ spends less time training, but at the cost of long query times. The major enhancements are canonical $k$-mer storage (with major storage savings); adaptable, optimized memory allocation that speeds up query analysis and allows the classifier to be run on almost any system; and output of the log-likelihood values against each training genome, which provides users with valuable confidence information.
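To make two of the ideas in the abstract concrete, the following is a minimal Python sketch of canonical $k$-mer counting and of per-genome naive Bayes log-likelihood scoring. It is an illustration only and not the NBC++ implementation: the function names, the add-one pseudocount, and the toy sequences are all assumptions introduced here. A canonical $k$-mer is taken to be the lexicographically smaller of a $k$-mer and its reverse complement, which is why storing only canonical $k$-mers roughly halves the number of distinct entries.

```python
from collections import Counter
from math import log

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = reverse_complement(kmer)
    return kmer if kmer <= rc else rc


def canonical_vocab_size(k):
    """Number of distinct canonical k-mers (self-complementary ones exist only for even k)."""
    palindromes = 4 ** (k // 2) if k % 2 == 0 else 0
    return (4 ** k + palindromes) // 2


def count_kmers(seq, k):
    """Count canonical k-mers, skipping windows with non-ACGT characters."""
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer)] += 1
    return counts


def log_likelihood(read_counts, genome_counts, vocab_size, pseudocount=1.0):
    """Multinomial naive Bayes log-likelihood of a read given one genome.

    The add-one pseudocount is an assumption of this sketch, used so that
    k-mers unseen in training do not produce log(0).
    """
    total = sum(genome_counts.values()) + pseudocount * vocab_size
    return sum(
        n * log((genome_counts.get(kmer, 0) + pseudocount) / total)
        for kmer, n in read_counts.items()
    )


def classify(read, training_genomes, k):
    """Score a read against every training genome and return (best_label, scores)."""
    vocab = canonical_vocab_size(k)
    read_counts = count_kmers(read, k)
    scores = {
        label: log_likelihood(read_counts, count_kmers(genome, k), vocab)
        for label, genome in training_genomes.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy "genomes", for illustration only.
training = {"gA": "AAAAAAAAAA", "gB": "ACGTACGTACGT"}
best, scores = classify("AAAAA", training, k=3)
```

Returning the full `scores` dictionary rather than only the top label mirrors the idea of reporting a log-likelihood against each training genome: the gap between the best and second-best score gives a rough sense of how confident the assignment is.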
