Variational multimodal machine translation with underlying semantic alignment

  • Xiao Liu
  • Jing Zhao
  • Shiliang Sun*
  • Huawen Liu
  • Hao Yang
  • *Corresponding author for this work
  • East China Normal University
  • Zhejiang Normal University
  • Huawei Technologies Co., Ltd.

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Capturing the underlying semantic relationships of sentences is helpful for machine translation. Variational neural machine translation approaches provide an effective way to model the uncertain underlying semantics in languages by introducing latent variables. Multitask learning is applied in multimodal machine translation to integrate multimodal data. However, these approaches usually lack a strong interpretation in utilizing out-of-text information in machine translation tasks. In this paper, we propose a novel architecture-free multimodal translation model, called variational multimodal machine translation (VMMT), under the variational framework which can model the uncertainty in languages caused by ambiguity through utilizing visual and textual information. In addition, the proposed model can eliminate the discrepancy between training and prediction in the existing variational translation models by constructing encoders only relying on source data. More importantly, the proposed multimodal translation model is designed as multitask learning in which the shared semantic representation for different modes is learned and the gap among semantic representation from various modes is reduced by incorporating additional constraints. Moreover, the information bottleneck theory is adopted in our variational encoder–decoder model, which helps the encoder to filter redundancy and the decoder to concentrate on useful information. Experiments on multimodal machine translation demonstrate that the proposed model is competitive.

Original language: English
Pages (from-to): 73-80
Number of pages: 8
Journal: Information Fusion
Volume: 69
State: Published - May 2021

Keywords

  • Machine translation
  • Multimodal learning
  • Variational neural machine translation
