Enhancing Detection of Polygenic Adaptation: A Comparative Study of Machine Learning and Statistical Approaches Using Simulated Evolve-and-Resequence Data
Caliendo, C.; Gerber, S.; Pfenninger, M.
Abstract
Detecting signals of polygenic adaptation remains a significant challenge in evolutionary biology, as traditional methods often struggle to identify the subtle, multi-locus allele-frequency shifts associated with it. Here, we introduced and tested several novel approaches that combine machine learning techniques with traditional statistical tests to detect patterns of polygenic adaptation. We implemented a Naive Bayesian Classifier (NBC) and One-Class Support Vector Machines (OCSVM), and compared their performance against Fisher's Exact Test (FET). Furthermore, we combined the machine learning and statistical models (OCSVM-FET and NBC-FET), resulting in five competing approaches. Using a simulated data set based on empirical evolve-and-resequence genomic data from Chironomus riparius, we evaluated the methods across evolutionary scenarios that varied in the number of generations and the number of loci under selection. Our results demonstrate that the combined OCSVM-FET approach consistently outperformed the competing methods, achieving the lowest false positive rate and the highest area under the curve while maintaining high accuracy. Its performance peaked during the late dynamic phase of adaptation, highlighting the method's sensitivity to ongoing selective processes and thus its suitability for experimental approaches. Furthermore, we emphasize the critical role of parameter tuning, which requires balancing biological assumptions with methodological rigor. Our approach thus offers a powerful tool for detecting polygenic adaptation in pool sequencing data, particularly from evolve-and-resequence experiments.
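To make the FET baseline concrete, the following is a minimal sketch of a two-sided Fisher's Exact Test applied to allele counts at a single locus, comparing an ancestral pool with an evolved pool. It is an illustration only, not the authors' pipeline: the function name `fisher_exact_two_sided`, the 2x2 table layout, and the read counts are assumptions made here, and the abstract does not specify how the test is coupled to the OCSVM or NBC models.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical, for illustration):
        row 1 = ancestral pool (reference reads, alternative reads)
        row 2 = evolved pool   (reference reads, alternative reads)
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables that are no more likely than the
    # observed one; the small tolerance guards against floating-point ties.
    tol = 1e-7
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + tol)
    )
    return min(1.0, p_value)


# Hypothetical read counts at one locus (not taken from the study).
p_shift = fisher_exact_two_sided(50, 50, 80, 20)     # frequency 0.5 -> 0.8
p_no_shift = fisher_exact_two_sided(50, 50, 50, 50)  # no frequency change
```

In a genome-wide scan, a test of this kind would be run once per locus, so the resulting p-values would normally need a multiple-testing correction before being interpreted.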