Ultra-fast variant effect prediction using biophysical transcription factor binding models

This paper is a preprint and has not been certified by peer review.

Authors

Hosseini, R.; Balci, A. T.; Kostka, D.; Chikina, M.

Abstract

Sequence variation within TF binding sites can significantly affect TF-DNA interactions, influencing gene expression and contributing to disease susceptibility or phenotypic traits. Despite recent progress in deep sequence-to-function models that predict functional output from sequence data, these methods perform inadequately on some variant effect prediction tasks, especially with common genetic variants. This limitation underscores the importance of leveraging biophysical models of TF binding to enhance interpretability of variant effect scores and facilitate mechanistic insights. We introduce MotifDiff, a novel computational tool designed to quantify variant effects using mono- and dinucleotide position weight matrices. MotifDiff offers several key advantages, including scalability to score millions of variants within minutes, implementation of various normalization strategies for optimal performance, and support for both dinucleotide and mononucleotide models. We demonstrate MotifDiff's efficacy by evaluating it across diverse ground truth datasets that quantify the effects of common variants in vivo, thereby establishing robust benchmarks for the predictive value of variant effect calculations. MotifDiff is available as a standalone Python application at https://github.com/rezwanhosseini/MotifDiff.
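To make the idea of a position weight matrix (PWM) variant effect score concrete, the following is a minimal sketch and not MotifDiff's actual implementation or API. It assumes a mononucleotide log-odds PWM, scans the forward strand only, summarizes each sequence by its best-scoring window, and defines the variant effect as the alternate score minus the reference score. The names `TOY_PWM`, `window_scores`, `best_score`, and `variant_effect` are hypothetical, and the matrix values are invented for illustration.

```python
from typing import Dict, List

# Hypothetical toy log-odds PWM of width 3 (one dict per motif position).
# The values are made up for illustration; they favor the motif "ACG".
TOY_PWM: List[Dict[str, float]] = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]


def window_scores(seq: str, pwm: List[Dict[str, float]]) -> List[float]:
    """Sum the per-position PWM scores for every window of motif width."""
    width = len(pwm)
    return [
        sum(pwm[j][seq[i + j]] for j in range(width))
        for i in range(len(seq) - width + 1)
    ]


def best_score(seq: str, pwm: List[Dict[str, float]]) -> float:
    """Summarize a sequence by its best window (an assumed aggregation)."""
    return max(window_scores(seq, pwm))


def variant_effect(ref_seq: str, pos: int, alt: str,
                   pwm: List[Dict[str, float]]) -> float:
    """Return alt score minus ref score for a single-nucleotide substitution.

    `pos` is a 0-based index into `ref_seq`; negative values mean the
    variant weakens the best predicted binding site.
    """
    alt_seq = ref_seq[:pos] + alt + ref_seq[pos + 1:]
    return best_score(alt_seq, pwm) - best_score(ref_seq, pwm)


if __name__ == "__main__":
    # Substituting the C of the "ACG" site with T lowers the best score.
    print(variant_effect("TTACGTT", 3, "T", TOY_PWM))  # -2.0
```

A dinucleotide model would index the matrix by adjacent base pairs instead of single bases, and a practical tool would also scan the reverse strand and normalize scores; both are omitted here to keep the sketch short.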
