A denoising-aided multi-task learning method for blind estimation of reverberation time

  • Yulong Zhang
  • Jinqiu Sang
  • Chengshi Zheng*
  • Xiaodong Li

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Noise in reverberant speech severely limits the accuracy with which current deep learning (DL) methods estimate the reverberation time T60. To address this issue, this paper proposes a denoising-aided multi-task learning (DAMTL) method for blind T60 estimation. Specifically, speech denoising serves as an auxiliary task that is trained jointly with T60 estimation to improve prediction accuracy. The two tasks are integrated into a single DL network by sharing the same encoder, in which the complex-valued spectrum is used to extract comprehensive high-dimensional features from noisy reverberant speech. Complex-valued versions of 2-D convolution (Conv2d), batch normalization, and long short-term memory (LSTM) are formulated for this purpose. The noise robustness and applicability of DAMTL are then evaluated against state-of-the-art DL-based methods on simulated data and real-world recorded data. The results demonstrate the effectiveness and superiority of the proposed DAMTL, especially in low signal-to-noise ratio (SNR) scenarios and practical applications.
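
The abstract does not give the exact formulation, so the following is only a minimal sketch of two ideas it mentions. It assumes the common construction in which a complex-valued convolution is assembled from four real-valued ones, and it assumes a simple weighted sum of a T60 error and a denoising error as the joint objective. The function names, the `weight` parameter, and the toy data are all hypothetical and are not taken from the paper.

```python
# Sketch only: names, toy data, and the loss weighting are illustrative
# assumptions, not the authors' implementation.

def conv2d_real(x, w):
    """'Valid' 2-D cross-correlation (the deep-learning "convolution")
    of a real matrix x with a real kernel w, both given as nested lists."""
    kh, kw = len(w), len(w[0])
    out_h = len(x) - kh + 1
    out_w = len(x[0]) - kw + 1
    return [
        [
            sum(x[i + a][j + b] * w[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def conv2d_complex(x_re, x_im, w_re, w_im):
    """Complex convolution built from four real ones:

    (W_re + jW_im) * (X_re + jX_im)
        = (W_re*X_re - W_im*X_im) + j(W_re*X_im + W_im*X_re)
    """
    rr = conv2d_real(x_re, w_re)
    ii = conv2d_real(x_im, w_im)
    ri = conv2d_real(x_im, w_re)
    ir = conv2d_real(x_re, w_im)
    out_re = [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(rr, ii)]
    out_im = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(ri, ir)]
    return out_re, out_im


def multitask_loss(t60_pred, t60_true, denoised, clean, weight=0.5):
    """Hypothetical joint objective: squared T60 error plus a weighted
    mean squared error between denoised and clean samples."""
    t60_term = (t60_pred - t60_true) ** 2
    denoise_term = sum((d - c) ** 2 for d, c in zip(denoised, clean)) / len(clean)
    return t60_term + weight * denoise_term


# Toy 3x3 "spectrum" (real and imaginary parts) and a 2x2 complex kernel.
x_re = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [2.0, 1.0, 1.0]]
x_im = [[0.0, 1.0, 1.0], [1.0, 0.0, 2.0], [1.0, 1.0, 0.0]]
w_re = [[1.0, 0.0], [0.0, -1.0]]
w_im = [[0.0, 1.0], [1.0, 0.0]]

out_re, out_im = conv2d_complex(x_re, x_im, w_re, w_im)
loss = multitask_loss(0.62, 0.60, [0.1, 0.2], [0.0, 0.2])
```

In a shared-encoder design of the kind described, both task heads would read the same encoder features, so the gradient of the auxiliary denoising term also shapes the features used for T60 estimation. How the two terms are actually balanced in DAMTL is not stated in the abstract.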

Original language: English
Article number: 114568
Journal: Measurement: Journal of the International Measurement Confederation
Volume: 231
State: Published - 31 May 2024

Keywords

  • Blind reverberation time estimation
  • Low signal-to-noise-ratio scenario
  • Multi-task learning
  • Noisy environment
  • Speech denoising
