A Multi-Omics Processing Pipeline (MOPP) for Extracting Taxonomic and Functional Insights from Metaribosome Profiling (metaRibo-Seq) data
Weng, Y.; Moyne, O.; Walker, C.; Haddad, E.; Lieng, C.; Chin, L.; Rahman, G.; McDonald, D.; Knight, R.; Zengler, K.
Abstract
Metaribosome profiling (metaRibo-Seq) enables genome-wide measurement of translation across complex microbial communities by sequencing ribosome-protected mRNA fragments, but the short length of these footprints creates substantial nonspecific mapping against large reference genome collections, leading to spurious taxonomic and functional assignments. Here we present MOPP (Multi-Omics Processing Pipeline), a modular reference-based workflow that denoises metaRibo-Seq data by leveraging matched metagenomic coverage breadth to identify genomes likely to be truly present in a sample before aligning metatranslatomic and optional metatranscriptomic reads. MOPP generates taxon-by-gene count tables across genomic, transcriptional and translational layers, enabling integrated downstream analyses of microbial function. We evaluated MOPP using a defined 79-member synthetic human gut community profiled by metagenomics and metaRibo-Seq. Coverage breadth filtering markedly improved detection accuracy relative to a standard baseline workflow, with performance remaining robust across a broad intermediate threshold range and peaking at 92-95% coverage breadth. At a 92% threshold, MOPP reduced the number of distinct detected operational genomic units by 99.4% while retaining 87.8% of aligned metaRibo-Seq reads on average, and increased the F1 score from 0.02 to 0.61. Residual false positives were predominantly attributable to genomes with extremely high nucleotide similarity to true community members, whereas false negatives were enriched among low-abundance taxa, indicating that remaining errors are driven primarily by biological similarity and detection limits rather than widespread nonspecific mapping.
Together, these results establish MOPP as a high-throughput workflow for robust processing of metaRibo-Seq in the context of matched metagenomics and position it as a scalable framework for integrated taxonomic and functional analysis of microbial communities across genomic, transcriptional and translational layers.
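The coverage breadth filter and the F1 evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not MOPP's implementation, which the abstract does not detail. It assumes that metagenomic alignments are already available as half-open intervals per genome, and the names `coverage_breadth`, `filter_genomes` and `detection_f1`, as well as the toy genomes, are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Assumption: an alignment is a half-open, 0-based interval [start, end) on a genome.
Interval = Tuple[int, int]


def coverage_breadth(intervals: Iterable[Interval], genome_length: int) -> float:
    """Fraction of genome positions covered by at least one aligned read."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        # Clip to genome bounds and skip empty intervals.
        start, end = max(start, 0), min(end, genome_length)
        if start >= end:
            continue
        if cur_end is None or start > cur_end:
            # Disjoint from the current merged block: close it and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_length


def filter_genomes(
    alignments: Dict[str, List[Interval]],
    genome_lengths: Dict[str, int],
    threshold: float,
) -> Set[str]:
    """Keep genomes whose metagenomic coverage breadth meets the threshold."""
    return {
        genome
        for genome, intervals in alignments.items()
        if coverage_breadth(intervals, genome_lengths[genome]) >= threshold
    }


def detection_f1(detected: Set[str], truth: Set[str]) -> float:
    """F1 score of detected genomes against a known community composition."""
    tp = len(detected & truth)
    if tp == 0:
        return 0.0
    precision = tp / len(detected)
    recall = tp / len(truth)
    return 2 * precision * recall / (precision + recall)


# Toy example (fabricated inputs, for illustration only).
alignments = {
    "genome_A": [(0, 50), (40, 95)],   # 95 of 100 positions covered
    "genome_B": [(0, 10), (5, 20)],    # 20 of 100 positions covered
}
genome_lengths = {"genome_A": 100, "genome_B": 100}
kept = filter_genomes(alignments, genome_lengths, threshold=0.92)
score = detection_f1(kept, truth={"genome_A", "genome_C"})
```

In this toy case only `genome_A` passes the 92% threshold. In a workflow like the one described, metaRibo-Seq reads would then be aligned only against the retained genomes, which is what limits nonspecific mapping of short footprints.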