Adaptive Hierarchy-Branch Fusion for Online Knowledge Distillation

  • East China Normal University
  • Beihang University
  • Tencent
  • Key Laboratory of Advanced Theory and Application in Statistics and Data Science - MOE

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Online Knowledge Distillation (OKD) is designed to alleviate the dilemma that a high-capacity pre-trained teacher model is not available. However, existing methods mostly focus on improving the ensemble prediction accuracy of multiple students (a.k.a. branches) and often overlook the homogenization problem, which makes student models saturate quickly and hurts performance. We assume that the intrinsic bottleneck of the homogenization problem comes from the identical branch architecture and the coarse ensemble strategy. We propose a novel Adaptive Hierarchy-Branch Fusion framework for Online Knowledge Distillation, termed AHBF-OKD, which designs hierarchical branches and an adaptive hierarchy-branch fusion module to boost model diversity and learn complementary knowledge. Specifically, we first introduce hierarchical branch architectures to construct diverse peers by increasing the depth of branches monotonically on the basis of the target branch. To effectively transfer knowledge from the most complex branch to the simplest target branch, we propose an adaptive hierarchy-branch fusion module that creates hierarchical teacher assistants recursively and regards the target branch as the smallest teacher assistant. During training, the teacher assistant from the previous hierarchy is explicitly distilled by the teacher assistant and the branch from the current hierarchy. Thus, importance scores are effectively and adaptively allocated to different branches to reduce branch homogenization. Extensive experiments demonstrate the effectiveness of AHBF-OKD on different datasets, including CIFAR-10/100 and ImageNet 2012. For example, the distilled ResNet18 achieves a Top-1 error of 29.28% on ImageNet 2012, which significantly outperforms the state-of-the-art methods. The source code is available at https://github.com/linruigong965/AHBF.
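
The recursive construction described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the repository linked above for that); it is a minimal, standard-library-only illustration under several assumptions: each branch is represented only by its output logits, the fusion is a two-way convex blend whose importance scores are passed in as plain numbers rather than produced by a learned module, and the distillation term is a temperature-softened KL divergence. The names `fuse`, `hierarchy_distillation_loss`, and `gate_scores` are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions; q must be strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def fuse(prev_assistant, branch, gate_scores):
    """Blend the previous teacher assistant with the current branch.

    gate_scores is a hypothetical pair of importance scores. In a trained
    model they would come from a learned module; here they are inputs.
    """
    w_prev, w_branch = softmax(gate_scores)
    return [w_prev * a + w_branch * b for a, b in zip(prev_assistant, branch)]


def hierarchy_distillation_loss(branch_logits, gate_scores, temperature=3.0):
    """Build teacher assistants recursively and accumulate distillation terms.

    branch_logits[0] is the target (shallowest) branch; deeper branches follow.
    gate_scores holds one score pair per additional branch.
    """
    assistant = branch_logits[0]  # the target branch is the smallest assistant
    loss = 0.0
    for branch, scores in zip(branch_logits[1:], gate_scores):
        next_assistant = fuse(assistant, branch, scores)
        student = softmax(assistant, temperature)
        # The previous assistant is distilled by the new assistant and the new branch.
        loss += kl_divergence(softmax(next_assistant, temperature), student)
        loss += kl_divergence(softmax(branch, temperature), student)
        assistant = next_assistant
    return loss, assistant


# Three branches of increasing depth, three classes, made-up logits.
logits = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5], [0.5, 2.5, 0.0]]
gates = [[0.0, 0.0], [0.0, 1.0]]
loss, top_assistant = hierarchy_distillation_loss(logits, gates)
```

In this sketch, branches whose predictions coincide contribute no distillation signal, so the loss grows only when deeper branches disagree with the assistant built so far. That is one way to read the abstract's link between adaptive importance scores and reduced homogenization, although the actual fusion module may differ substantially.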

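Original language: English
Title of host publication: AAAI-23 Technical Tracks 6
Editors: Brian Williams, Yiling Chen, Jennifer Neville
Publisher: AAAI Press
Pages: 7731-7739
Number of pages: 9
ISBN (electronic): 9781577358800
DOI
Publication status: Published - 27 Jun 2023
Event: 37th AAAI Conference on Artificial Intelligence, AAAI 2023 - Washington, United States
Duration: 7 Feb 2023 to 14 Feb 2023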

Publication series

Name: Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023
Volume: 37

Conference

Conference: 37th AAAI Conference on Artificial Intelligence, AAAI 2023
Country/Territory: United States
City: Washington
Period: 7/02/23 to 14/02/23
