
CO3: Low-resource Contrastive Co-training for Generative Conversational Query Rewrite

  • Yifei Yuan
  • Chen Shi
  • Runze Wang
  • Liyi Chen
  • Renjun Hu
  • Zengming Zhang
  • Feijun Jiang
  • Wai Lam

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Generative query rewrite produces reconstructed queries from the conversation history, but it relies heavily on gold rewrite pairs that are expensive to obtain. Recently, few-shot learning has been gaining popularity for this task, but these methods are sensitive to the inherent noise caused by limited data size. Moreover, both approaches suffer performance degradation when there is a language style shift between training and testing cases. To this end, we study low-resource generative conversational query rewrite that is robust to both noise and language style shift. The core idea is to utilize massive unlabeled data to make further improvements via a contrastive co-training paradigm. Specifically, we co-train two dual models (namely Rewriter and Simplifier) such that each of them provides extra guidance through pseudo-labeling for enhancing the other in an iterative manner. We also leverage contrastive learning with data augmentation, which enables our model to pay more attention to the truly valuable information than to the noise. Extensive experiments demonstrate the superiority of our model under both few-shot and zero-shot scenarios. We also verify the better generalization ability of our model when encountering language style shift.
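
The co-training idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `LookupModel` class is a toy stand-in for a generative model, the `co_train` function and its parameters are hypothetical names, and the data flow (the Simplifier pseudo-labels self-contained queries for the Rewriter, and the Rewriter pseudo-labels context-dependent queries for the Simplifier) is one plausible reading of "each of them provides extra guidance through pseudo-labeling". The contrastive objective is not modeled.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (source text, target text)


class LookupModel:
    """Toy stand-in for a seq2seq model: memorises source -> target pairs."""

    def __init__(self) -> None:
        self.table: Dict[str, str] = {}

    def fit(self, pairs: List[Pair]) -> None:
        for source, target in pairs:
            self.table[source] = target

    def predict(self, source: str) -> str:
        # Unknown inputs are echoed back unchanged.
        return self.table.get(source, source)


def co_train(
    gold: List[Pair],            # few (short query, full rewrite) pairs
    unlabeled_short: List[str],  # context-dependent queries without rewrites
    unlabeled_full: List[str],   # self-contained queries without short forms
    rounds: int = 2,
) -> Tuple[LookupModel, LookupModel]:
    rewriter, simplifier = LookupModel(), LookupModel()
    # Seed both directions from the small gold set (assumed setup).
    rewriter.fit(gold)
    simplifier.fit([(full, short) for short, full in gold])
    for _ in range(rounds):
        # Simplifier pseudo-labels full queries, giving the Rewriter
        # extra (short, full) training pairs.
        rewriter.fit([(simplifier.predict(f), f) for f in unlabeled_full])
        # Rewriter pseudo-labels short queries, giving the Simplifier
        # extra (full, short) training pairs.
        simplifier.fit([(rewriter.predict(s), s) for s in unlabeled_short])
    return rewriter, simplifier
```

A real system would replace `LookupModel` with trained generative models; the sketch only shows how the two directions alternate in supplying pseudo-labeled pairs to each other.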

Original language: English
Title of host publication: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Publisher: European Language Resources Association (ELRA)
Pages: 3394-3406
Number of pages: 13
ISBN (Electronic): 9782493814104
Publication status: Published - 2024
Externally published
Event: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 - Hybrid, Torino, Italy
Duration: 20 May 2024 to 25 May 2024

Publication series

Name: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings

Conference

Conference: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024
Country/Territory: Italy
Hybrid, Torino
Period: 20/05/24 to 25/05/24
