Science Cast

A Bioinformatic Pipeline for Consensus Taxonomic Classification of Long-Read Amplicons

Larry Halverson, May 1, 2026 3:26am

This paper is a preprint and has not been certified by peer review.

bioRxiv, April 30, 2026 12:00am

Authors

Paulsen, A. A.; LaSarre, B.; Delp, D.; Beattie, G. A.; Halverson, L. J.

Abstract

Characterizing community composition is fundamental to understanding microbial community function. Recent advances in Oxford Nanopore Technologies (ONT) long-read sequencing now allow community profiling using full-length gene amplicons, affording better taxonomic resolution than standard short-amplicon Illumina sequencing. However, robust ONT-compatible profiling workflows are lacking. To address this, we have created the Amplicon Consensus Taxonomy (ACT) pipeline for classifying long-read amplicons. ACT combines output from three existing pipelines (Emu, Sintax, and LACA) to leverage the strengths of each while offsetting their individual limitations. We also developed the ACT database (ACT-DB), a sequence-similarity-aware reference database that clusters highly similar sequences into multi-taxa groups to reduce overclassification. We benchmarked ACT performance against Emu and Sintax using a defined simple mock community, simulated datasets, and a complex rhizosphere community supplemented with novel species. ACT exhibited generally comparable or superior performance across datasets and demonstrated a marked advantage over Emu and Sintax in identifying novel and low-abundance taxa in both simple and complex communities, resulting in significantly higher species-richness estimates that better reflected those observed in prior Illumina amplicon studies. Furthermore, by clustering ambiguous reference sequences, ACT-DB allowed ACT to resolve reads to meaningful multi-species groups, improving resolution without coercing artificial precision. Together, ACT and ACT-DB form a robust long-read amplicon profiling workflow that confidently identifies known species while reducing overclassification and preserving low-abundance and unknown taxa.
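
The abstract does not say how ACT reconciles the calls from its three component pipelines, nor how ACT-DB names its clusters. The Python sketch below is only an illustration of the two general ideas, under loud assumptions: a simple majority vote stands in for the consensus step, and a cluster is named by joining the species it contains. The function names, the min_agree parameter, the "unclassified" label, and the example species are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def consensus_label(calls, min_agree=2):
    """Majority-vote consensus for one read (an assumed rule, not ACT's).

    `calls` maps a classifier name to its species call, or None when
    that classifier made no call for the read.
    """
    votes = Counter(label for label in calls.values() if label is not None)
    if not votes:
        return "unclassified"
    label, count = votes.most_common(1)[0]
    # Accept the top label only if enough classifiers agree on it.
    return label if count >= min_agree else "unclassified"


def group_label(species_in_cluster):
    """Name a cluster of near-identical references by its member species."""
    return "/".join(sorted(set(species_in_cluster)))


# Hypothetical per-read calls from three classifiers.
agreed = consensus_label(
    {"emu": "Pseudomonas putida", "sintax": "Pseudomonas putida", "laca": None}
)
split = consensus_label(
    {"emu": "Pseudomonas putida", "sintax": "Pseudomonas fluorescens", "laca": None}
)
# Hypothetical reference cluster containing two species.
group = group_label(["Escherichia coli", "Shigella flexneri", "Escherichia coli"])
```

Under these assumptions, a read is assigned a species only when at least two classifiers agree, and a read that matches a cluster of indistinguishable references is reported at the multi-species group level rather than forced to a single species.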
