TEDEdb: a large-scale resource and multi-cohort analysis of transposable element differential expression in cancer
Calendo, G.; Chaunzwa, M.; Dehzangi, I.; Madzo, J.; Issa, J.-P. J.
Abstract
The human genome consists of nearly 50% repetitive DNA, referred to for decades as "junk DNA". These repetitive sequences, usually under the strict control of epigenetic silencing, have been observed to be aberrantly expressed in cancer. Some of these expressed sequences, e.g., transposable elements (TEs), can induce innate immune responses when de-repressed following treatment with epigenetic therapies. As a result, epigenetic therapy has been suggested as a way to augment cancer therapies. TEs are traditionally ignored in most RNA-seq studies, and their expression is often excluded from publicly available data sources. Thus, the vast amount of publicly available RNA-seq data is an untapped resource for exploring the role of TE expression in cancer and cancer treatment. Here, we present a uniform re-analysis of over 7,000 RNA-seq samples, encompassing more than 2,000 differential expression experiments across 220 cancer cell lines and 700 drug treatments. We observed that TE expression is more prone to batch effects than gene expression, necessitating the use of meta-analysis techniques to probe the dataset for global trends. We confirm that DNMT and HDAC inhibitors are powerful inducers of TEs. We also show that non-epigenetic compounds such as CDK and topoisomerase inhibitors can induce robust up-regulation of TEs, and we confirm that this TE induction is consistent with a viral mimicry response. We make all of the reprocessed data, the web application, and the database publicly available at: https://dataexplorer.coriell.org/TEDEdb/