Adaptive speech separation based on beamforming and frequency domain-independent component analysis

Ke Zhang, Yangjie Wei, Dan Wu, Yi Wang

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Voice signals acquired by a microphone array often include considerable noise and mutual interference, which seriously degrade the accuracy and speed of speech separation. Traditional beamforming is simple to implement, but its suppression of interfering sources is inadequate. In contrast, independent component analysis (ICA) can improve separation, but it requires an iterative, time-consuming process to calculate the separation matrix. As a supporting method, principal component analysis (PCA) helps reduce the dimensionality, speed up computation, and discard false sound sources. Considering the sparsity of frequency components in a mixed signal, we propose an adaptive fast speech separation algorithm that uses multiple sound source localization as preprocessing to select between beamforming and frequency-domain ICA according to the mixing conditions in each frequency bin. First, a fast positioning algorithm calculates the maximum number of components per frequency bin of a mixed speech signal to prevent the occurrence of false sound sources. Then, PCA reduces the dimension to adaptively adjust the weights of beamforming and ICA for speech separation. Subsequently, the ICA separation matrix is initialized based on the sound source localization to notably reduce the iteration time and mitigate permutation ambiguity. Simulation and experimental results verify the effectiveness and speedup of the proposed algorithm.
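
The per-bin selection and the localization-based ICA initialization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a far-field uniform linear array, a sound speed of 343 m/s, a simple rule (one active source in a bin selects beamforming, several select ICA), and a two-microphone, two-source case for the initial separation matrix. All function names and parameters are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the paper


def steering_vector(freq_hz, doa_deg, n_mics, spacing_m):
    """Far-field steering vector for a uniform linear array (assumed geometry)."""
    delay = spacing_m * math.sin(math.radians(doa_deg)) / SPEED_OF_SOUND
    return [cmath.exp(-2j * math.pi * freq_hz * m * delay) for m in range(n_mics)]


def choose_method(n_active_sources):
    """Assumed per-bin rule: no source -> skip, one -> beamforming, several -> ICA."""
    if n_active_sources <= 0:
        return "skip"
    if n_active_sources == 1:
        return "beamforming"
    return "ica"


def delay_and_sum_weights(freq_hz, doa_deg, n_mics, spacing_m):
    """Delay-and-sum weights w = a / M, so that w^H a = 1 in the look direction."""
    a = steering_vector(freq_hz, doa_deg, n_mics, spacing_m)
    return [x / n_mics for x in a]


def ica_init_2x2(freq_hz, doas_deg, spacing_m):
    """Initial 2x2 separation matrix: inverse of the steering matrix of two DOAs."""
    a1 = steering_vector(freq_hz, doas_deg[0], 2, spacing_m)
    a2 = steering_vector(freq_hz, doas_deg[1], 2, spacing_m)
    # Steering (mixing) matrix A = [[a1[0], a2[0]], [a1[1], a2[1]]]
    det = a1[0] * a2[1] - a2[0] * a1[1]
    if abs(det) < 1e-9:
        raise ValueError("DOAs are indistinguishable at this frequency")
    return [[a2[1] / det, -a2[0] / det],
            [-a1[1] / det, a1[0] / det]]


if __name__ == "__main__":
    # Hypothetical bin at 1 kHz with sources localized at -30 and 40 degrees.
    print(choose_method(2))
    for row in ica_init_2x2(1000.0, (-30.0, 40.0), 0.05):
        print(row)
```

Starting the ICA iteration from the inverse of the steering matrix ties each output channel to one localized direction. That is one way to read the abstract's claim that the initialization shortens the iteration and mitigates permutation ambiguity.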

Original language: English
Article number: 2593
Journal: Applied Sciences (Switzerland)
Volume: 10
Issue number: 7
DOIs
State: Published - 1 Apr 2020
Externally published: Yes

Keywords

  • Beamforming
  • Independent component analysis
  • Principal component analysis
  • Speech separation
