
Filter-GLAT: Filter Glanced Decoder Output for Non-autoregressive Transformer

  • Zichun Wang
  • Huanran Zheng
  • Xiaoling Wang*
  • *Corresponding author for this work
  • East China Normal University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Non-autoregressive machine translation models achieve significantly faster inference than autoregressive translation models, but at the cost of degraded translation quality. Although numerous advanced methods have been proposed to improve the translation quality of non-autoregressive models, achieving the desired trade-off between quality and efficiency remains difficult. In this paper, a Filter Glanced Transformer, named Filter-GLAT, is proposed to tackle this problem. It first refines the glance sampling learning strategy and then adopts the Filter learning strategy during training, substantially enhancing translation quality. As for inference speed, Filter-GLAT generates predictions with only a single decoding pass, maintaining high speed. Moreover, by modifying the training process, the Filter learning strategy helps the model narrow the gap between the training and inference procedures. Extensive experiments on translation benchmarks (WMT’14 EN-DE and WMT’16 EN-RO) demonstrate that Filter-GLAT comes close to the best balance between translation quality and speed.

Original language: English
Title of host publication: Web and Big Data - 8th International Joint Conference, APWeb-WAIM 2024, Proceedings
Editors: Wenjie Zhang, Zhengyi Yang, Xiaoyang Wang, Anthony Tung, Zhonglong Zheng, Hongjie Guo
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 59-73
Number of pages: 15
ISBN (Print): 9789819772315
DOI
Publication status: Published - 2024
Event: 8th Asia-Pacific Web and Web-Age Information Management Joint International Conference on Web and Big Data, APWeb-WAIM 2024 - Jinhua, China
Duration: 30 Aug 2024 to 1 Sep 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14961 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th Asia-Pacific Web and Web-Age Information Management Joint International Conference on Web and Big Data, APWeb-WAIM 2024
Country/Territory: China
City: Jinhua
Period: 30/08/24 to 1/09/24
