
Multi-Prototype Space Learning for Commonsense-Based Scene Graph Generation

  • East China Normal University

Research output: Contribution to journal, Conference article, peer-reviewed

Abstract

In the domain of scene graph generation, modeling commonsense as a single-prototype representation has been typically employed to facilitate the recognition of infrequent predicates. However, a fundamental challenge lies in the large intra-class variations of the visual appearance of predicates, resulting in subclasses within a predicate class. Such a challenge typically leads to the problem of misclassifying diverse predicates due to the rough predicate space clustering. In this paper, inspired by cognitive science, we maintain multi-prototype representations for each predicate class, which can accurately find the multiple class centers of the predicate space. Technically, we propose a novel multi-prototype learning framework consisting of three main steps: prototype-predicate matching, prototype updating, and prototype space optimization. We first design a triple-level optimal transport to match each predicate feature within the same class to a specific prototype. In addition, the prototypes are updated using momentum updating to find the class centers according to the matching results. Finally, we enhance the inter-class separability of the prototype space through iterations of the inter-class separability loss and intra-class compactness loss. Extensive evaluations demonstrate that our approach significantly outperforms state-of-the-art methods on the Visual Genome dataset.
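The three steps named in the abstract (matching, momentum updating, and prototype space optimization) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the triple-level optimal transport, so plain entropic (Sinkhorn) optimal transport with uniform marginals is swapped in as a stand-in, and the two losses are simple illustrative forms. All function names and the parameters `eps`, `n_iters`, `momentum`, and `margin` are assumptions made for this example.

```python
import numpy as np


def sinkhorn_assign(features, prototypes, eps=0.05, n_iters=50):
    """Soft-match features (N, D) to prototypes (K, D) of one class.

    Stand-in for the paper's triple-level optimal transport: plain
    entropic OT with uniform marginals (an assumption of this sketch).
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    cost = 1.0 - f @ p.T                      # cosine distance, shape (N, K)
    kernel = np.exp(-cost / eps)
    n, k = cost.shape
    r = np.full(n, 1.0 / n)                   # mass carried by each feature
    c = np.full(k, 1.0 / k)                   # mass received by each prototype
    v = np.ones(k)
    for _ in range(n_iters):                  # Sinkhorn scaling iterations
        u = r / (kernel @ v)
        v = c / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]   # transport plan, shape (N, K)


def momentum_update(prototypes, features, plan, momentum=0.9):
    """Move each prototype toward the mean of the features matched to it."""
    assign = plan.argmax(axis=1)
    updated = prototypes.copy()
    for k in range(prototypes.shape[0]):
        members = features[assign == k]
        if len(members) > 0:
            updated[k] = momentum * prototypes[k] + (1.0 - momentum) * members.mean(axis=0)
    return updated


def compactness_loss(features, prototypes, assign):
    """Mean squared distance from each feature to its matched prototype."""
    return float(np.mean(np.sum((features - prototypes[assign]) ** 2, axis=1)))


def separability_loss(prototypes, class_ids, margin=1.0):
    """Hinge penalty on prototypes of different classes closer than `margin`."""
    diffs = prototypes[:, None, :] - prototypes[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    different = class_ids[:, None] != class_ids[None, :]
    if not different.any():
        return 0.0
    return float(np.mean(np.maximum(0.0, margin - dists[different])))
```

In a training loop, one would presumably run the matching and momentum update per predicate class on each batch and add the two losses to the classification objective; how the paper weights or alternates them is not stated in the abstract.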

Original language: English
Pages (from-to): 1129-1137
Number of pages: 9
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Volume: 38
Issue number: 2
DOI
Publication status: Published - 25 Mar 2024
Event: 38th AAAI Conference on Artificial Intelligence, AAAI 2024 - Vancouver, Canada
Duration: 20 Feb 2024 to 27 Feb 2024
