TY - GEN
T1 - KAN
T2 - 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021
AU - Zhu, Zeyang
AU - Lin, Xin
N1 - Publisher Copyright: © 2021 IEEE
PY - 2021
Y1 - 2021
AB - Few-shot learning aims to develop models that can quickly learn new concepts from only a few examples. Current approaches that learn new categories from a few images, or even a single image, rely solely on the visual modality. However, it is difficult to learn representative features of new categories from a few images, because some categories are visually similar. Moreover, because of variations in viewpoint and luminosity, and because individuals of the same species sometimes appear markedly different from one another, models cannot learn an exact representation of each class. Therefore, considering that semantic information can enhance understanding when visual information is limited, we propose Knowledge-Augmented Networks (KAN), which combine visual features with semantic information extracted from a knowledge graph to represent the features of each class. We demonstrate the effectiveness of our method on standard few-shot learning tasks, and further observe that, with the augmented semantic information from the knowledge graph, KAN is able to learn more disentangled representations. Experiments show that our model outperforms state-of-the-art methods.
KW - Few-Shot Learning
KW - Image Classification
KW - Multimodal Fusion
UR - https://www.scopus.com/pages/publications/85115046483
U2 - 10.1109/ICASSP39728.2021.9413612
DO - 10.1109/ICASSP39728.2021.9413612
M3 - Conference contribution
AN - SCOPUS:85115046483
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 1735
EP - 1739
BT - 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 6 June 2021 through 11 June 2021
ER -