Multiple Protein Structure Alignment at Scale with FoldMason
Gilchrist, C. L. M.; Mirdita, M.; Steinegger, M.
Abstract
Protein structure is conserved beyond sequence, making multiple structural alignment (MSTA) essential for analyzing distantly related proteins. Computational prediction methods have vastly expanded the repository of available protein structures, creating a need for fast and accurate MSTA methods. Here, we introduce FoldMason, a progressive MSTA method that leverages the structural alphabet from Foldseek, a pairwise structural aligner, for multiple alignment of hundreds of thousands of protein structures. FoldMason computes confidence scores, offers interactive visualizations, and provides the speed and accuracy needed for large-scale protein structure analysis in the era of accurate structure prediction. Using Flaviviridae glycoproteins, we demonstrate how FoldMason's MSTAs support phylogenetic analysis below the twilight zone. FoldMason is free open-source software available at foldmason.foldseek.com, with a webserver at search.foldseek.com/foldmason.