MBTFNet: Multi-Band Temporal-Frequency Neural Network For Singing Voice Enhancement


Authors

Weiming Xu, Zhouxuan Chen, Zhili Tan, Shubo Lv, Runduo Han, Wenjiang Zhou, Weifeng Zhao, Lei Xie

Abstract

A typical neural speech enhancement (SE) approach mainly handles mixtures of speech and noise, which is not optimal for singing voice enhancement scenarios. Music source separation (MSS) models treat vocals and the various accompaniment components equally, which may reduce performance compared to a model that focuses solely on vocal enhancement. In this paper, we propose a novel multi-band temporal-frequency neural network (MBTFNet) for singing voice enhancement, which specifically removes background music, noise, and even backing vocals from singing recordings. MBTFNet combines inter- and intra-band modeling to better process full-band signals, and dual-path modeling is introduced to expand the receptive field of the model. We also propose an implicit personalized enhancement (IPE) stage based on signal-to-noise ratio (SNR) estimation, which further improves the performance of MBTFNet. Experiments show that our proposed model significantly outperforms several state-of-the-art SE and MSS models.
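
The abstract describes inter- and intra-band modeling combined with dual-path modeling over the time-frequency representation. The sketch below shows how a generic dual-path temporal-frequency block of this kind is commonly structured; the specific layer choices (bidirectional LSTM along frequency, unidirectional LSTM along time, hidden size, residual connections) are illustrative assumptions and are not taken from the paper.

```python
# Minimal sketch of a dual-path temporal-frequency block, assuming an
# LSTM-based design. The exact band-splitting scheme, layer types, and
# sizes used in MBTFNet are not given in the abstract.
import torch
import torch.nn as nn


class DualPathTFBlock(nn.Module):
    """Alternate modeling along the frequency (intra) and time (inter) axes."""

    def __init__(self, channels: int, hidden: int = 64):
        super().__init__()
        # Intra path: a bidirectional LSTM scans across frequency bins per frame.
        self.freq_rnn = nn.LSTM(channels, hidden, batch_first=True, bidirectional=True)
        self.freq_proj = nn.Linear(2 * hidden, channels)
        self.freq_norm = nn.LayerNorm(channels)
        # Inter path: a unidirectional LSTM scans across time per frequency bin
        # (unidirectional keeps the block usable in a causal setting).
        self.time_rnn = nn.LSTM(channels, hidden, batch_first=True)
        self.time_proj = nn.Linear(hidden, channels)
        self.time_norm = nn.LayerNorm(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time, freq)
        b, c, t, f = x.shape

        # --- intra (frequency) path ---
        y = x.permute(0, 2, 3, 1).reshape(b * t, f, c)       # (B*T, F, C)
        y, _ = self.freq_rnn(y)
        y = self.freq_proj(y).reshape(b, t, f, c)
        x = x + self.freq_norm(y).permute(0, 3, 1, 2)        # residual connection

        # --- inter (time) path ---
        z = x.permute(0, 3, 2, 1).reshape(b * f, t, c)       # (B*F, T, C)
        z, _ = self.time_rnn(z)
        z = self.time_proj(z).reshape(b, f, t, c)
        x = x + self.time_norm(z).permute(0, 3, 2, 1)        # residual connection
        return x


if __name__ == "__main__":
    block = DualPathTFBlock(channels=32)
    spec_features = torch.randn(2, 32, 100, 48)  # (batch, channels, frames, bands)
    print(block(spec_features).shape)            # torch.Size([2, 32, 100, 48])
```

Stacking several such blocks lets band-level (intra) context and frame-level (inter) context be refined alternately, which is the usual motivation for dual-path designs that aim to expand the receptive field.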
