Enhancing Hi-C contact matrices for loop detection with Capricorn, a multi-view diffusion model
Fang, T.; Liu, Y.; Woicik, A.; Lu, M.; Jha, A.; Wang, X.; Li, G.; Hristov, B.; Liu, Z.; Xu, H.; Noble, W. S.; Wang, S.
Abstract
High-resolution Hi-C contact matrices measure the detailed three-dimensional architecture of the genome, but high-resolution experimental Hi-C data are expensive to generate and relatively rare. Computational methods to enhance low-resolution contact matrices exist but are largely based on resolution enhancement methods for natural images and hence often employ models that do not distinguish between biologically meaningful contacts and background contacts. We present Capricorn, a tool for Hi-C resolution enhancement that incorporates high-order chromatin features as additional views of the input Hi-C contact matrix and leverages a diffusion probability model backbone to generate a high-resolution matrix. We show that Capricorn outperforms the state-of-the-art in a cross-cell-line setting, improving existing methods by 17.8% in mean-squared error and 22.9% in F1 score for loops called from the generated high-resolution data. We also show that Capricorn performs well in the cross-chromosome setting, again improving the downstream loop F1 score by 15.7% relative to existing methods. Capricorn's implementation and source code are freely available at https://github.com/CHNFTQ/Capricorn.
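The two evaluation quantities named in the abstract, mean-squared error between matrices and F1 score over called loops, can be illustrated with a minimal sketch. This is not Capricorn's evaluation code: the function names `mean_squared_error` and `loop_f1`, the representation of a loop as a pair of bin indices, and the `tolerance` matching rule are all assumptions made for illustration, and the paper's actual loop caller and matching criteria may differ.

```python
import numpy as np


def mean_squared_error(enhanced, target):
    """Mean of squared element-wise differences between two contact matrices."""
    enhanced = np.asarray(enhanced, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((enhanced - target) ** 2))


def loop_f1(predicted, reference, tolerance=0):
    """F1 score between two loop sets.

    Each loop is a (bin_i, bin_j) pair. Assumption for this sketch: two loops
    match when both anchors are within `tolerance` bins of each other.
    """
    predicted = set(predicted)
    reference = set(reference)
    if not predicted or not reference:
        return 0.0

    def has_match(loop, others):
        return any(
            max(abs(loop[0] - o[0]), abs(loop[1] - o[1])) <= tolerance
            for o in others
        )

    # Precision: fraction of predicted loops that have a reference match.
    precision = sum(has_match(p, reference) for p in predicted) / len(predicted)
    # Recall: fraction of reference loops recovered by some predicted loop.
    recall = sum(has_match(r, predicted) for r in reference) / len(reference)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data, invented purely to exercise the functions.
toy_enhanced = [[1.0, 2.0], [3.0, 4.0]]
toy_target = [[1.0, 2.0], [3.0, 6.0]]
toy_predicted_loops = [(10, 20), (30, 40), (50, 60)]
toy_reference_loops = [(10, 20), (30, 41)]

toy_mse = mean_squared_error(toy_enhanced, toy_target)
toy_f1 = loop_f1(toy_predicted_loops, toy_reference_loops, tolerance=1)
```

In this sketch a lower MSE means the enhanced matrix is closer to the high-resolution target, while a higher loop F1 means the loops called from the enhanced matrix agree better with those called from the target. A nonzero tolerance is a common way to avoid penalizing loops that are shifted by a single bin.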