GenVarLoader: An accelerated dataloader for applying deep learning to personalized genomics
This paper is a preprint and has not been certified by peer review.
Laub, D.; Ho, A.; Jaureguy, J.; Klie, A.; Salem, R. M.; McVicker, G.; Carter, H.
Abstract
Deep learning sequence models trained on personalized genomics can improve variant effect prediction; however, applications of these models are limited by the computational requirements for storing and reading large datasets. We address this with GenVarLoader, which stores personalized genomic data in new memory-mapped formats with optimal data locality to achieve ~1,000x faster throughput and ~2,000x better compression compared to existing alternatives.
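
To make the idea of memory-mapped storage with good data locality concrete, the following is a minimal sketch using NumPy's `np.memmap`. It is not GenVarLoader's file format or API, which the abstract does not describe; the helper names `write_memmap` and `read_window`, the file name, and the toy data are all hypothetical. The sketch assumes one byte per nucleotide and a row-major layout, so that one sample's window over a region is contiguous on disk and can be read without loading the rest of the file.

```python
import os
import tempfile

import numpy as np


def write_memmap(path, data):
    """Write a 2-D uint8 array to disk as a raw, row-major file (hypothetical helper)."""
    mm = np.memmap(path, dtype=np.uint8, mode="w+", shape=data.shape)
    mm[:] = data
    mm.flush()
    del mm  # release the mapping


def read_window(path, shape, sample, start, end):
    """Read one sample's [start, end) window without loading the whole file."""
    mm = np.memmap(path, dtype=np.uint8, mode="r", shape=shape)
    # Row-major layout keeps this slice contiguous on disk; np.array copies
    # only the requested bytes into memory.
    return np.array(mm[sample, start:end])


# Hypothetical toy data: 4 samples x 1,000 positions, one byte per nucleotide.
rng = np.random.default_rng(0)
alphabet = np.frombuffer(b"ACGT", dtype=np.uint8)
toy = rng.choice(alphabet, size=(4, 1000))

tmp_dir = tempfile.mkdtemp()
toy_path = os.path.join(tmp_dir, "toy_sequences.bin")
write_memmap(toy_path, toy)

window = read_window(toy_path, toy.shape, sample=2, start=100, end=110)
print(window.tobytes().decode())  # a 10-character string over A, C, G, T
```

Storing every sample's full sequence this way would not compress anything; it only illustrates why a layout that keeps each requested window contiguous lets a dataloader touch just the bytes it needs.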