RNAquarium: an archive-scale atlas of zebrafish gene expression coupled with pan-taxonomic profiling reveals diverse viral drivers of transcriptomic states
Aniseia, Y.; Waltari, E.; Huang, H.; Lima, L.; Rahman, G.; Frank, M.; Zhou, A.; Kim, Y.-J.; Paras, J.; Baker, S.; Senbabaoglu, Y.; Peng, D.; Balla, K.
Abstract
Zebrafish RNA-seq studies span diverse developmental, physiological, and disease contexts, yet most analyses remain confined to individual experiments and disregard the non-zebrafish component of the data. We present RNAquarium, a scalable framework for joint transcriptomic and metatranscriptomic analysis of RNA-seq data, and apply it to all publicly available zebrafish RNA-seq datasets in the Sequence Read Archive. This resource captures transcriptomic structure across development and tissues, reveals diverse microbial and viral associations, and identifies previously undescribed zebrafish viruses, including a close relative of human influenza B virus, linked to distinct host transcriptional states. We further demonstrate that archive-scale transcriptomes can support foundation-model training and prediction of infection-associated transcriptomic signatures. RNAquarium provides an open framework and interactive portal for exploring the breadth of zebrafish gene expression patterns and associated taxa profiled across a large research community, and it establishes a generalizable strategy for integrating transcriptomic and metatranscriptomic analyses across the diversity of life represented in public sequencing archives.