FashionKLIP: Enhancing E-Commerce Image-Text Retrieval with Fashion Multi-Modal Conceptual Knowledge Graph

  • Xiaodan Wang
  • Chengyu Wang
  • Lei Li
  • Zhixu Li
  • Ben Chen
  • Linbo Jin
  • Jun Huang
  • Yanghua Xiao*
  • Ming Gao
  • *Corresponding author for this work
  • Fudan University
  • Alibaba Group Holding Ltd.
  • East China Normal University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Image-text retrieval is a core task in the multimodal domain that has attracted considerable attention from both the research and industry communities. Recently, the boom in vision-language pre-trained (VLP) models has greatly enhanced the performance of cross-modal retrieval. However, the fine-grained interactions between objects from different modalities are far from well-established. This issue becomes more severe in the e-commerce domain, which lacks sufficient training data and fine-grained cross-modal knowledge. To alleviate the problem, this paper proposes a novel e-commerce knowledge-enhanced VLP model, FashionKLIP. We first automatically establish a multi-modal conceptual knowledge graph from large-scale e-commerce image-text data, and then inject the prior knowledge into the VLP model to align across modalities at the conceptual level. Experiments conducted on a public benchmark dataset demonstrate that FashionKLIP improves e-commerce image-text retrieval performance over state-of-the-art VLP models by a large margin. The application of the method in real industrial scenarios also demonstrates the feasibility and efficiency of FashionKLIP.

Original language: English
Title of host publication: Industry Track
Publisher: Association for Computational Linguistics (ACL)
Pages: 149-158
Number of pages: 10
ISBN (electronic): 9781959429685
DOI
Publication status: Published - 2023
Event: 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023 - Toronto, Canada
Duration: 9 Jul 2023 to 14 Jul 2023

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Volume: 5
ISSN (print): 0736-587X

Conference

Conference: 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023
Country/Territory: Canada
City: Toronto
Period: 9/07/23 to 14/07/23
