TY - GEN
T1 - Cross-Modality Graph Neural Network for Few-Shot Learning
AU - Liu, Shubao
AU - Xie, Yuan
AU - Yuan, Wang
AU - Ma, Lizhuang
N1 - Publisher Copyright: © 2021 IEEE
PY - 2021
Y1 - 2021
N2 - Few-shot learning, which attempts to predict labels for unlabeled samples from only a few labeled samples, has drawn increasing attention. Although recent works have achieved promising progress, none of them establishes consistency among episodes, which leads to ambiguity in the latent embedding space. In this paper, we propose a novel Cross-Modality Graph Neural Network (CMGNN) to uncover the associations among episodes for a consistent global embedding. Since the semantic information induced from NLP is relatively fixed compared to the visual information space, we leverage it to construct meta nodes for each category, which guide the corresponding visual feature learning through a GNN. Moreover, to ensure a global embedding, a distance loss function is designed to pull the visual nodes closer to their associated meta nodes. Extensive experiments and ablation studies on four benchmark datasets show its superiority over many state-of-the-art comparison methods.
KW - cross-modality
KW - few-shot learning
KW - graph neural network
UR - https://www.scopus.com/pages/publications/85126447216
U2 - 10.1109/ICME51207.2021.9428405
DO - 10.1109/ICME51207.2021.9428405
M3 - Conference contribution
AN - SCOPUS:85126447216
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
BT - 2021 IEEE International Conference on Multimedia and Expo, ICME 2021
PB - IEEE Computer Society
T2 - 2021 IEEE International Conference on Multimedia and Expo, ICME 2021
Y2 - 5 July 2021 through 9 July 2021
ER -