TY - GEN
T1 - NAT4AT
T2 - 33rd ACM Web Conference, WWW 2024
AU - Zheng, Huanran
AU - Zhu, Wei
AU - Wang, Xiaoling
N1 - Publisher Copyright:
© 2024 ACM.
PY - 2024/5/13
Y1 - 2024/5/13
N2 - With the increasing number of web documents, the demand for translation has grown dramatically. Non-autoregressive translation (NAT) models can significantly reduce decoding latency to meet this growing demand, but they sacrifice translation quality, and an irreparable performance gap remains between NAT models and strong autoregressive translation (AT) models at the corpus level. However, fine-grained comparative experiments on AT and NAT are currently lacking. Therefore, in this paper, we first conduct analysis experiments at the sentence level and find both complementarity and high similarity between the translations generated by AT and NAT. Based on this observation, we propose a general and effective method called NAT4AT, which not only uses NAT to significantly speed up AT inference but also improves its final translation quality. Specifically, NAT4AT first uses a NAT model to generate an initial translation in parallel and then uses an AT model as a correction model to revise errors in that translation. In this way, the AT model no longer needs to predict the entire translation, only the small number of erroneous parts in the NAT output. Extensive experimental results on major WMT benchmarks verify the generality and effectiveness of our method, whose translation quality surpasses that of a strong AT model while achieving a 5.0x speedup.
AB - With the increasing number of web documents, the demand for translation has grown dramatically. Non-autoregressive translation (NAT) models can significantly reduce decoding latency to meet this growing demand, but they sacrifice translation quality, and an irreparable performance gap remains between NAT models and strong autoregressive translation (AT) models at the corpus level. However, fine-grained comparative experiments on AT and NAT are currently lacking. Therefore, in this paper, we first conduct analysis experiments at the sentence level and find both complementarity and high similarity between the translations generated by AT and NAT. Based on this observation, we propose a general and effective method called NAT4AT, which not only uses NAT to significantly speed up AT inference but also improves its final translation quality. Specifically, NAT4AT first uses a NAT model to generate an initial translation in parallel and then uses an AT model as a correction model to revise errors in that translation. In this way, the AT model no longer needs to predict the entire translation, only the small number of erroneous parts in the NAT output. Extensive experimental results on major WMT benchmarks verify the generality and effectiveness of our method, whose translation quality surpasses that of a strong AT model while achieving a 5.0x speedup.
KW - efficient inference
KW - neural machine translation
KW - non-autoregressive generation
UR - https://www.scopus.com/pages/publications/85194039829
U2 - 10.1145/3589334.3645527
DO - 10.1145/3589334.3645527
M3 - Conference contribution
AN - SCOPUS:85194039829
T3 - WWW 2024 - Proceedings of the ACM Web Conference
SP - 4181
EP - 4192
BT - WWW 2024 - Proceedings of the ACM Web Conference
PB - Association for Computing Machinery, Inc
Y2 - 13 May 2024 through 17 May 2024
ER -