
Emphasizing Semantic Consistency of Salient Posture for Speech-Driven Gesture Generation

  • Fengqi Liu
  • Hexiang Wang
  • Jingyu Gong
  • Ran Yi
  • Qianyu Zhou
  • Xuequan Lu
  • Jiangbo Lu
  • Lizhuang Ma*
  • *Corresponding author for this work
  • Shanghai Jiao Tong University
  • East China Normal University
  • La Trobe University
  • SmartMore

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Speech-driven gesture generation aims at synthesizing a gesture sequence synchronized with the input speech signal. Previous methods leverage neural networks to directly map a compact audio representation to the gesture sequence, ignoring the semantic association of different modalities and failing to deal with salient gestures. In this paper, we propose a novel speech-driven gesture generation method by emphasizing the semantic consistency of salient posture. Specifically, we first learn a joint manifold space for the individual representation of audio and body pose to exploit the inherent semantic association between two modalities, and propose to enforce semantic consistency via a consistency loss. Furthermore, we emphasize the semantic consistency of salient postures by introducing a weakly-supervised detector to identify salient postures, and reweighting the consistency loss to focus more on learning the correspondence between salient postures and the high-level semantics of speech content. In addition, we propose to extract audio features dedicated to facial expression and body gesture separately, and design separate branches for face and body gesture synthesis. Extensive experimental results demonstrate the superiority of our method over the state-of-the-art approaches.
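
The abstract describes reweighting a consistency loss so that salient postures count more, but it does not give the formula. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes that audio and pose have already been embedded per frame into a shared space, that consistency is measured by cosine distance, and that a detector supplies a saliency score in [0, 1] for each frame. The function names, the `alpha` parameter, and the weighting `1 + alpha * s` are all hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: a zero vector is treated as maximally inconsistent.
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def weighted_consistency_loss(audio_emb, pose_emb, saliency, alpha=1.0):
    """Saliency-reweighted consistency loss over a sequence of frames.

    audio_emb, pose_emb: lists of per-frame embedding vectors in the
        (assumed) joint space.
    saliency: per-frame scores in [0, 1], e.g. from a posture detector.
    alpha: hypothetical knob; alpha = 0 gives the unweighted mean.
    """
    if not (len(audio_emb) == len(pose_emb) == len(saliency)):
        raise ValueError("all inputs must have one entry per frame")
    if not audio_emb:
        return 0.0
    # Every frame keeps weight 1; salient frames get up to 1 + alpha.
    weights = [1.0 + alpha * s for s in saliency]
    total = sum(
        w * cosine_distance(a, p)
        for w, a, p in zip(weights, audio_emb, pose_emb)
    )
    return total / sum(weights)


# Two frames: the first is consistent, the second (salient) is not.
audio = [[1.0, 0.0], [0.0, 1.0]]
pose = [[1.0, 0.0], [1.0, 0.0]]
unweighted = weighted_consistency_loss(audio, pose, [0.0, 1.0], alpha=0.0)
reweighted = weighted_consistency_loss(audio, pose, [0.0, 1.0], alpha=1.0)
```

In this toy input the mismatch falls on the salient frame, so the reweighted loss is larger than the unweighted one. That is the intended effect of focusing the loss on salient postures.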

Original language: English
Title of host publication: MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia
Publisher: Association for Computing Machinery, Inc
Pages: 7027-7035
Number of pages: 9
ISBN (electronic): 9798400706868
DOI
Publication status: Published - 28 Oct 2024
Externally published
Event: 32nd ACM International Conference on Multimedia, MM 2024 - Melbourne, Australia
Duration: 28 Oct 2024 to 1 Nov 2024

Publication series

Name: MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia

Conference

Conference: 32nd ACM International Conference on Multimedia, MM 2024
Country/Territory: Australia
City: Melbourne
Period: 28/10/24 to 1/11/24
