GnnDebugger: GNN based error correction in De Bruijn Graphs

This paper is a preprint and has not been certified by peer review.

Authors

Simunovic, M.; Sikic, M.; Bankevich, A.

Abstract

Modern sequencing technologies have enabled the reconstruction of complete mammalian genomes from telomere to telomere. However, scaling this achievement to thousands of species and to population-level studies remains a challenge. Key bottlenecks include the low quality of draft assemblies and high coverage requirements. Reconstructing complete and accurate sequences of both haplotypes in diploid genomes is especially difficult, because the sequencing depth is not always sufficient to properly reconstruct diverged regions. Inspired by the success of neural networks in extracting patterns from data at massive scale, we introduce a method for correcting errors in De Bruijn graphs using graph neural networks (GNNs). Our model reliably classifies edges as correct or erroneous, especially for diploid genomes with a coverage depth of 35 or lower. We demonstrate that these predictions can guide downstream read error correction and genome assembly, ultimately yielding more accurate assemblies.
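
To make the edge-classification setting concrete, the following is a minimal sketch of a De Bruijn graph built from reads, with one small feature tuple per edge. It is not the authors' implementation: the function names `build_de_bruijn` and `edge_features`, the choice of features (coverage, out-degree of the source node, in-degree of the target node), and the toy reads are all assumptions made for illustration. It also ignores reverse complements, which a real assembler would handle.

```python
from collections import Counter


def build_de_bruijn(reads, k):
    """Count k-mer coverage across reads.

    Each k-mer is an edge from its (k-1)-mer prefix to its (k-1)-mer suffix.
    Hypothetical helper; strand and reverse complements are ignored.
    """
    coverage = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            coverage[read[i:i + k]] += 1
    return coverage


def edge_features(coverage):
    """Return {edge: (coverage, out-degree of source, in-degree of target)}.

    These features are an illustrative assumption, not the input
    representation used in the paper.
    """
    out_deg = Counter()
    in_deg = Counter()
    for kmer in coverage:
        out_deg[kmer[:-1]] += 1  # source node is the (k-1)-mer prefix
        in_deg[kmer[1:]] += 1    # target node is the (k-1)-mer suffix
    return {
        kmer: (cov, out_deg[kmer[:-1]], in_deg[kmer[1:]])
        for kmer, cov in coverage.items()
    }


# Toy data: three identical reads plus one read with a single substitution.
reads = ["ACGTAC", "ACGTAC", "ACGTAC", "ACGAAC"]
cov = build_de_bruijn(reads, k=4)
feats = edge_features(cov)

for kmer in sorted(feats):
    print(kmer, feats[kmer])
```

In this toy input the substituted read creates a low-coverage branch leaving node `ACG`. A simple coverage threshold would flag that branch, but at low sequencing depth a true heterozygous variant in a diploid genome can look the same, which is the ambiguity a learned edge classifier is meant to resolve.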
