
CVTE-Poly: A New Benchmark for Chinese Polyphone Disambiguation

  • Guangzhou Shiyuan Electronic Technology Company Limited
  • East China Normal University

Research output: Contribution to journal, Conference article, Peer-reviewed

Abstract

Conversion from graphemes to phonemes is an essential component of Text-To-Speech systems. In Chinese, one main challenge is polyphone disambiguation: determining the pronunciation of characters that have multiple pronunciations. For this task, the benchmark dataset Chinese Polyphone disambiguation with Pinyin (CPP) suffers from two main limitations. First, it contains some labels that are wrong with respect to the newest official dictionary. Second, it is imbalanced, so models learned from it show a learning bias towards frequently used pronunciations and polyphones. In this paper, we refine CPP and release a new dataset named CVTE-poly, which contains 845,254 samples, is nearly ten times the size of CPP, and is more balanced. In addition, we propose a comprehensive measurement for the polyphone disambiguation task that addresses the data imbalance problem. Experiments show that our simple but flexible baseline trained on CVTE-poly outperforms existing models, which demonstrates the benefit of our dataset.

Original language: English
Pages (from-to): 5526-5530
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2023-August
Publication status: Published - 2023
Event: 24th Annual Conference of the International Speech Communication Association, Interspeech 2023 - Dublin, Ireland
Duration: 20 Aug 2023 to 24 Aug 2023
