PETScan: Score-Based Genome-Wide Association Analysis of RNA-Seq and ATAC-Seq Data

This paper is a preprint and has not been certified by peer review.

Authors

Hao, Y.; Kafri, T.; Zou, F.

Abstract

High-dimensional sequencing data, such as RNA-Seq for gene expression and ATAC-Seq for chromatin accessibility, are widely used in systems biology. Accessible chromatin allows transcription factors and regulatory elements to bind to DNA, thereby regulating transcription through the activation or repression of target genes. Association analysis of RNA-Seq and ATAC-Seq data therefore provides insight into gene regulatory mechanisms. Most existing analytic tools focus exclusively on cis-associations, even though regulatory elements can physically interact with distant target genes. Furthermore, conventional approaches often rely on Pearson or Spearman correlations, which ignore the count-based nature of RNA-Seq data. To address these limitations, we introduce PETScan, a computationally efficient genome-wide PEak-Transcript Score-based association analysis that uses negative binomial models to better accommodate RNA-Seq data. We leverage score tests and matrix calculations for improved computational efficiency, and we combine an empirical permutation method with genomic control to ensure valid p-value calculations in studies with limited sample sizes. In real-world datasets, PETScan ran three orders of magnitude faster than Wald tests while identifying similar significant gene-peak pairs. The PETScan R package is available on GitHub at https://github.com/yajing-hao/PETScan.
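To make the idea of "score tests and matrix calculations" concrete, the following is a minimal Python sketch, not the PETScan implementation (which is an R package whose internals are not described here). It assumes a negative binomial model with a log link, no covariates other than the peak, and a per-gene dispersion that is treated as known. Under those assumptions the null fit is just the gene's mean count, so the score statistic for every peak-gene pair has a closed form and the numerators for all pairs come from a single matrix product. The function names (`nb_score_stats`, `genomic_control`, `chi2_1df_pvalue`) are hypothetical, and the genomic control step shown is the conventional median-based rescaling, which may differ from the authors' procedure.

```python
import math
import numpy as np

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df


def nb_score_stats(peaks, counts, dispersion):
    """Score statistics for all peak-gene pairs (illustrative assumptions).

    peaks:      (n_samples, n_peaks) accessibility values
    counts:     (n_samples, n_genes) RNA-Seq counts
    dispersion: (n_genes,) NB dispersions, assumed known
    Returns an (n_peaks, n_genes) array of 1-df chi-square statistics.
    Assumes no constant peaks and no all-zero genes (filter them first).
    """
    peaks = np.asarray(peaks, dtype=float)
    counts = np.asarray(counts, dtype=float)
    dispersion = np.asarray(dispersion, dtype=float)

    centered = peaks - peaks.mean(axis=0)       # center each peak
    sxx = (centered ** 2).sum(axis=0)           # per-peak sum of squares
    mu = counts.mean(axis=0)                    # null MLE of each gene's mean
    null_var = mu * (1.0 + dispersion * mu)     # NB variance under the null

    # One matrix product yields the score numerators for every pair.
    numerator = (centered.T @ counts) ** 2
    return numerator / np.outer(sxx, null_var)


def genomic_control(stats):
    """Rescale statistics by the inflation factor lambda (never below 1)."""
    stats = np.asarray(stats, dtype=float)
    lam = max(float(np.median(stats)) / CHI2_1DF_MEDIAN, 1.0)
    return stats / lam, lam


def chi2_1df_pvalue(stats):
    """Upper-tail p-value of a 1-df chi-square: erfc(sqrt(x / 2))."""
    erfc = np.vectorize(math.erfc, otypes=[float])
    return erfc(np.sqrt(np.asarray(stats, dtype=float) / 2.0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)              # simulated data, for illustration only
    peaks = rng.normal(size=(20, 3))
    counts = rng.negative_binomial(n=5, p=0.3, size=(20, 4))
    stats = nb_score_stats(peaks, counts, dispersion=np.full(4, 0.2))
    adjusted, lam = genomic_control(stats)
    print("lambda:", lam, "smallest p-value:", chi2_1df_pvalue(adjusted).min())
```

Because the null model is fitted once per gene rather than once per pair, the cost is dominated by the matrix product, which is the kind of saving a score test offers over fitting a separate model for each pair as a Wald test requires. The sketch omits the permutation step mentioned in the abstract.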
