Sequence Design and Phylogenetic Inference with Generative Flow Networks

This paper is a preprint and has not been certified by peer review.

Authors

Huang, Q.; Mourra-Diaz, C. M.; Wen, X.; Payette, D.

Abstract

Phylogenetic inference remains computationally challenging due to the exponentially growing tree topology search space, and current methods rely heavily on multiple sequence alignments (MSAs), which are expensive and error-prone. We propose AncestorGFN, a proof-of-concept approach leveraging Generative Flow Networks (GFlowNets) for simultaneous sequence generation and phylogenetic exploration without requiring explicit MSAs. Our method learns to generate sequences matching a target distribution while the flow trajectories implicitly encode structural relationships among sequences. We demonstrate that greedy traceback on maximum-flow trajectories recovers shared intermediate states suggestive of common ancestry, and evaluate on the let-7 microRNA family where the learned flow structure qualitatively captures phylogenetic branching patterns. Furthermore, beam search at inference time discovers novel sequences clustering near known targets, suggesting applications in de novo sequence design. This work establishes an initial foundation for alignment-free phylogenetic exploration using generative models.
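
The greedy traceback idea mentioned in the abstract can be illustrated with a toy sketch. The abstract does not specify the state space, the action set, or how flows are stored, so everything here is an assumption made for illustration: states are partial sequences built by appending or prepending one nucleotide, edge flows live in a plain dictionary, and the flow values are invented. It is not the AncestorGFN implementation.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (parent state, child state)


def parents(state: str) -> List[str]:
    """Candidate parents under the assumed append/prepend construction:
    the state with its last or its first character removed."""
    if not state:
        return []
    return sorted({state[:-1], state[1:]})


def greedy_traceback(target: str, flow: Dict[Edge, float]) -> List[str]:
    """Walk from a finished sequence back to the empty root, always
    stepping to the parent whose edge into the current state has the
    largest flow (missing edges count as zero flow)."""
    path = [target]
    state = target
    while state:
        state = max(parents(state), key=lambda p: flow.get((p, state), 0.0))
        path.append(state)
    return path


def shared_states(path_a: List[str], path_b: List[str]) -> List[str]:
    """States visited by both tracebacks, ordered from deepest to root."""
    in_b = set(path_b)
    return [s for s in path_a if s in in_b]


# Hypothetical edge flows for two toy targets (values are made up).
toy_flow: Dict[Edge, float] = {
    ("", "U"): 1.0,
    ("U", "UG"): 0.7, ("G", "UG"): 0.3,
    ("UG", "UGA"): 0.8, ("GA", "UGA"): 0.2,
    ("UGA", "UGAG"): 0.9, ("GAG", "UGAG"): 0.1,
    ("UGA", "UGAU"): 0.6, ("GAU", "UGAU"): 0.4,
}

path_a = greedy_traceback("UGAG", toy_flow)
path_b = greedy_traceback("UGAU", toy_flow)
common = shared_states(path_a, path_b)
print(path_a)
print(path_b)
print(common)
```

In this toy setting both tracebacks pass through the same intermediate state before reaching the root, which is the kind of shared state the abstract describes as suggestive of common ancestry. Whether such a state corresponds to a real ancestor is a separate question that the sketch does not address.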
