Comprehensive source-target speaker voice conversion analysis

  • He Pan
  • Yangjie Wei
  • Nan Guan
  • Yi Wang
  • Northeastern University China
  • Uppsala University

Research output: Contribution to journal, article, peer-reviewed

Abstract

A voice conversion system modifies a source speaker's voice so that it is perceived as having been uttered by a target speaker, and it is now widely used in real applications. However, most research focuses on only one aspect of voice conversion performance; theoretical analysis and experimental comparison of the whole source-target speaker voice conversion process are rare. Therefore, this paper presents a comprehensive analysis of source-target speaker voice conversion based on three key steps: acoustic feature selection and extraction, voice conversion model construction, and target speech synthesis. On this basis, a complete and optimal source-target speaker voice conversion process is proposed. First, a simple and direct serial feature fusion consisting of prosodic features, spectrum parameters, and spectral envelope characteristics is proposed. Then, to avoid discontinuity and spectral distortion in the converted speech, D_GMM (Dynamic Gaussian Mixture Model), which considers dynamic information between frames, is presented. Subsequently, for speech synthesis, the STRAIGHT algorithm synthesizer is modified to work with the feature combination. Finally, objective comparison experiments show that the new source-target voice conversion process achieves better performance than conventional methods. In addition, both objective evaluation (using a speaker recognition system) and subjective evaluation are used to assess the quality of the converted speech, and the experimental results show that the converted speech has higher target speaker individuality and speech quality.
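The abstract names two ideas that a short sketch can make concrete: serial fusion of per-frame features, and the use of dynamic information between frames. The Python below is only an illustration under stated assumptions, not the authors' implementation. It assumes that "serial fusion" means concatenating feature vectors frame by frame, and that "dynamic information" can be approximated by first-order delta features computed as a central difference. The function names `serial_fusion` and `add_deltas` and all array shapes are hypothetical.

```python
import numpy as np


def serial_fusion(f0, spectrum, envelope):
    """Concatenate per-frame features along the feature axis.

    Assumption: "serial feature fusion" is plain concatenation.
    f0 is 1-D (one value per frame); spectrum and envelope are
    2-D arrays of shape (frames, dims).
    """
    f0 = np.asarray(f0, dtype=float).reshape(len(f0), -1)
    return np.hstack([f0, np.asarray(spectrum, dtype=float),
                      np.asarray(envelope, dtype=float)])


def add_deltas(features):
    """Append first-order dynamic (delta) features to each frame.

    Assumption: a central difference over neighbouring frames,
    with edge frames repeated as padding.
    """
    padded = np.pad(features, ((1, 1), (0, 0)), mode="edge")
    deltas = 0.5 * (padded[2:] - padded[:-2])
    return np.hstack([features, deltas])


# Hypothetical toy input: 5 frames, 1 prosodic value,
# 3 spectrum parameters and 2 envelope parameters per frame.
frames = 5
f0 = np.linspace(100.0, 120.0, frames)
spectrum = np.zeros((frames, 3))
envelope = np.ones((frames, 2))

fused = serial_fusion(f0, spectrum, envelope)
joint = add_deltas(fused)
print(fused.shape, joint.shape)
```

Vectors like `joint`, built for time-aligned source and target frames, are the kind of input a GMM-based mapping would be trained on. How D_GMM actually incorporates the dynamic terms is not specified in the abstract.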

Original language: English
Pages (from-to): 40-48
Number of pages: 9
Journal: International Journal of Simulation: Systems, Science and Technology
Volume: 15
Issue number: 6
DOI
Publication status: Published - Dec 2014
Externally published: Yes