TY - GEN
T1 - Generating Emotional Social Chatbot Responses with a Consistent Speaking Style
AU - Zhang, Jun
AU - Yang, Yan
AU - Chen, Chengcai
AU - He, Liang
AU - Yu, Zhou
N1 - Publisher Copyright:
© 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
N2 - Emotional conversation plays a vital role in creating more human-like conversations. Although previous work on emotional conversation generation has achieved promising results, the issue of speaking style inconsistency still exists. In this paper, we propose a Style-Aware Emotional Dialogue System (SEDS) that enhances speaking style consistency by detecting the user’s emotions and modeling speaking styles during emotional response generation. Specifically, SEDS uses an emotion encoder to perceive the user’s emotion from multimodal inputs, and tracks speaking styles by jointly optimizing a generator augmented with a personalized lexicon that captures explicit word-level speaking style features. Additionally, we propose an auxiliary speaking style classification task to guide SEDS in learning the implicit form of speaking style during training. We construct a multimodal dialogue dataset, and align and annotate it to verify the effectiveness of the model. Experimental results show that SEDS achieves significant improvements over strong baseline models in terms of perplexity, emotion accuracy, and style consistency.
AB - Emotional conversation plays a vital role in creating more human-like conversations. Although previous work on emotional conversation generation has achieved promising results, the issue of speaking style inconsistency still exists. In this paper, we propose a Style-Aware Emotional Dialogue System (SEDS) that enhances speaking style consistency by detecting the user’s emotions and modeling speaking styles during emotional response generation. Specifically, SEDS uses an emotion encoder to perceive the user’s emotion from multimodal inputs, and tracks speaking styles by jointly optimizing a generator augmented with a personalized lexicon that captures explicit word-level speaking style features. Additionally, we propose an auxiliary speaking style classification task to guide SEDS in learning the implicit form of speaking style during training. We construct a multimodal dialogue dataset, and align and annotate it to verify the effectiveness of the model. Experimental results show that SEDS achieves significant improvements over strong baseline models in terms of perplexity, emotion accuracy, and style consistency.
KW - Emotional conversation
KW - Multimodal
KW - Speaking style
UR - https://www.scopus.com/pages/publications/85093096877
U2 - 10.1007/978-3-030-60457-8_5
DO - 10.1007/978-3-030-60457-8_5
M3 - Conference contribution
AN - SCOPUS:85093096877
SN - 9783030604561
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 57
EP - 68
BT - Natural Language Processing and Chinese Computing - 9th CCF International Conference, NLPCC 2020, Proceedings
A2 - Zhu, Xiaodan
A2 - Zhang, Min
A2 - Hong, Yu
A2 - He, Ruifang
PB - Springer Science and Business Media Deutschland GmbH
T2 - 9th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2020
Y2 - 14 October 2020 through 18 October 2020
ER -