
Brand-new speech animation technology based on first order motion model and MelGAN-VC

  • Shaomin Chen
  • Xinyi Gao
  • Jiangtao Wang*
  • Yu Xiao
  • Yueling Zhang
  • Gang Xu
  • *Corresponding author for this work

Research output: Contribution to journal, conference article, peer-reviewed

Abstract

Speech animation has significant application potential in instant messaging and entertainment media, including videophones, virtual meetings, and audio and video chat. Traditional voice-driven speech animation is adapted to only a single language, while performance-driven speech animation depends on costly capture equipment and is difficult to produce at scale. To address these problems, we propose a new method of speech animation generation: given a static portrait of a person and a driving face video, the method generates a face animation video of the person in the portrait. The conversion system consists of two parts, face conversion and voice conversion. We observed that the generated face animation video suffered from low definition, choppy playback, and a metallic-sounding voice. To address this, the article proposes two improvements: adding an animation enhancement step and replacing the encoder. Comparative experiments show that these measures are effective.

Original language: English
Article number: 012029
Journal: Journal of Physics: Conference Series
Volume: 1828
Issue number: 1
DOI
Publication status: Published - 4 Mar 2021
Event: 2020 International Symposium on Automation, Information and Computing, ISAIC 2020 - Beijing, Virtual, China
Duration: 2 Dec 2020 to 4 Dec 2020
