Chimera Model of Candidate Soups for Non-Autoregressive Translation

  • Huanran Zheng
  • Wei Zhu
  • Xiaoling Wang*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Non-Autoregressive Translation (NAT) models have drawn much attention because of their fast decoding. However, NAT models suffer a significant drop in translation quality compared to Autoregressive Translation (AT) models. Candidate Soups (CandiSoups) is an effective method that makes full use of the different candidate translations, significantly improving translation quality for NAT models. However, to achieve its best performance it relies on an additional AT model for re-scoring, which slows down inference and consumes more computing resources. In this paper, we propose a Chimera Model framework of CandiSoups (CMCS), which significantly accelerates inference while maintaining the superior performance of CandiSoups. Specifically, by modifying the decoder, we fuse the AT and NAT models into a single Chimera Model that can re-score its own candidates. Moreover, we propose a novel adaptive training method to improve the training of Chimera Models. Experimental results on two major benchmarks demonstrate the effectiveness of the proposed approach, which significantly improves translation quality while maintaining fast inference.
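The re-scoring step mentioned in the abstract can be illustrated with a minimal sketch. This is not the CandiSoups or CMCS algorithm itself, only generic candidate re-scoring: each candidate translation is scored by the average per-token log-probability assigned by a scorer, and the highest-scoring candidate is kept. The names `rescore` and `toy_scorer` are hypothetical, and `toy_scorer` is a stand-in for a real autoregressive model (it merely penalises immediately repeated tokens, a typical NAT error).

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer type: (source tokens, candidate tokens) -> per-token log-probs.
TokenScorer = Callable[[Sequence[str], Sequence[str]], Sequence[float]]


def rescore(
    source: Sequence[str],
    candidates: List[List[str]],
    scorer: TokenScorer,
) -> Tuple[List[str], float]:
    """Pick the candidate with the highest length-normalised log-probability."""
    if not candidates:
        raise ValueError("need at least one candidate")
    best, best_score = candidates[0], -math.inf
    for cand in candidates:
        logps = scorer(source, cand)
        # Normalise by length so longer candidates are not unfairly penalised.
        score = sum(logps) / max(len(cand), 1)
        if score > best_score:
            best, best_score = cand, score
    return best, best_score


def toy_scorer(source: Sequence[str], candidate: Sequence[str]) -> List[float]:
    """Assumed stand-in for an AT model: low probability for repeated tokens."""
    logps, prev = [], None
    for tok in candidate:
        logps.append(math.log(0.1) if tok == prev else math.log(0.9))
        prev = tok
    return logps


if __name__ == "__main__":
    src = ["die", "Katze", "saß"]
    cands = [["the", "the", "cat"], ["the", "cat", "sat"]]
    best, score = rescore(src, cands, toy_scorer)
    print(best, round(score, 4))
```

In a setup that uses a separate AT model, `scorer` would call that second model; the abstract's point is that a fused model can play both roles, which removes the extra model from the inference path.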

Original language: English
Title of host publication: Database Systems for Advanced Applications - 29th International Conference, DASFAA 2024, Proceedings
Editors: Makoto Onizuka, Jae-Gil Lee, Yongxin Tong, Chuan Xiao, Yoshiharu Ishikawa, Kejing Lu, Sihem Amer-Yahia, H.V. Jagadish
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 416-425
Number of pages: 10
ISBN (Print): 9789819757787
State: Published - 2025
Event: 29th International Conference on Database Systems for Advanced Applications, DASFAA 2024 - Gifu, Japan
Duration: 2 Jul 2024 to 5 Jul 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14851 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 29th International Conference on Database Systems for Advanced Applications, DASFAA 2024
Country/Territory: Japan
City: Gifu
Period: 2/07/24 to 5/07/24

Keywords

  • Efficient Inference
  • Language Processing
  • Machine Translation
  • Non-autoregressive Generation
