Systematically developing a registry of splice-site creating variants utilizing massive publicly available transcriptome sequence data

This paper is a preprint and has not been certified by peer review

Authors

Iida, N.; Okada, A.; Kobayashi, Y.; Chiba, K.; Yatabe, Y.; Shiraishi, Y.

Abstract

Genomic variants that cause abnormal splicing play an important role in genetic disorders and cancer development. Among them, variants that create novel splice sites (splice-site creating variants, SSCVs) are particularly difficult to identify and are often overlooked in genomic studies. Moreover, SSCVs, especially those in deep intronic regions, are frequently considered promising candidates for treatment with splice-switching antisense oligonucleotides (ASOs), offering therapeutic potential for patients with rare diseases. To leverage massive transcriptome sequence data, such as those available from the Sequence Read Archive, we developed a novel framework that screens for SSCVs using transcriptome data alone. We applied it to 322,072 publicly available transcriptomes and identified 30,130 SSCVs. Using this extensive collection, we characterized Alu exonization via SSCVs, in particular the hotspots of SSCVs within Alu sequences and their evolutionary relationships. Many of the SSCVs affecting disease-causing variants were predicted to generate premature termination codons and to trigger degradation by nonsense-mediated decay. On the other hand, several genes, such as CREBBP and TP53, showed characteristic SSCV profiles indicative of heterogeneous mutational functions beyond simple loss-of-function. Finally, we discovered novel gain-of-function SSCVs in the deep intronic region of the NOTCH1 gene and demonstrated that their activation can be suppressed using splice-switching ASOs. Collectively, we provide a systematic approach for automatically acquiring a registry of SSCVs, which can be used to elucidate novel biological mechanisms of splicing and genetic variation and can serve as a valuable resource for pinpointing critical targets in drug discovery. Catalogs of SSCVs identified in this study are accessible on SSCV DB (https://sscvdb.io/).
