
Knowledge distillation with a precise teacher and prediction with abstention

  • Yi Xu
  • Jian Pu*
  • Hui Zhao
  • *Corresponding author for this work
  • East China Normal University
  • Fudan University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Knowledge distillation, which trains a model (the student) under the supervision of another, larger model (the teacher), has achieved remarkable results in supervised learning. However, existing knowledge distillation methods have two major problems: the teacher's supervision is sometimes misleading, and the student's predictions are not accurate enough. To address the first problem, instead of learning from a combination of the teacher and the ground truth, we apply knowledge adjustment to correct the teacher's supervision using the ground truth. For the second problem, we train the student model within the selective classification framework. In particular, we adopt the deep gambler loss, which allows prediction with reservation by explicitly introducing an (m + 1)-th class. To evaluate the effectiveness of our method, we consider two settings of knowledge distillation: (1) distillation across different network structures (AlexNet, ResNet), and (2) distillation across networks of different depths (ResNet18, ResNet50). Experimental results on benchmark datasets (Fashion-MNIST, SVHN, CIFAR10, CIFAR100) show higher prediction accuracy and lower coverage error.
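The two ingredients named in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact adjustment rule or loss formula, so both are assumptions here: the adjustment swaps the teacher's top probability with that of the ground-truth class whenever the teacher is wrong, and the loss follows the commonly cited gambler's-loss form -log(p_y * o + p_{m+1}) with a payoff o. The function names, the payoff value, and the toy logits are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adjust_teacher(teacher_probs, label):
    """Assumed adjustment rule (not confirmed by the abstract): if the
    teacher's top class differs from the ground truth, swap the two
    probabilities so the ground-truth class becomes the top class."""
    adjusted = list(teacher_probs)
    top = max(range(len(adjusted)), key=adjusted.__getitem__)
    if top != label:
        adjusted[top], adjusted[label] = adjusted[label], adjusted[top]
    return adjusted


def gambler_loss(student_probs, label, payoff):
    """Assumed gambler's-loss form for one sample. The last entry of
    student_probs is the extra (m + 1)-th abstention class."""
    reservation = student_probs[-1]
    return -math.log(student_probs[label] * payoff + reservation)


# Toy example with m = 3 classes; all numbers are made up.
label = 0
teacher = softmax([2.0, 3.0, 0.5])        # teacher wrongly favours class 1
adjusted = adjust_teacher(teacher, label)  # class 0 now has the top mass

student = softmax([1.5, 0.2, 0.1, 0.8])   # 3 classes + abstention output
loss = gambler_loss(student, label, payoff=2.0)
```

The swap keeps the adjusted vector a valid probability distribution, since it only permutes two entries. How the adjusted teacher targets and the gambler's loss are combined during student training is not specified in the abstract, so the sketch leaves them separate.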

Original language: English
Title of host publication: Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 9000-9006
Number of pages: 7
ISBN (Electronic): 9781728188089
DOI
Publication status: Published - 2020
Event: 25th International Conference on Pattern Recognition, ICPR 2020 - Virtual, Online, Italy
Duration: 10 Jan 2021 to 15 Jan 2021

Publication series

Name: Proceedings - International Conference on Pattern Recognition
ISSN (Print): 1051-4651

Conference

Conference: 25th International Conference on Pattern Recognition, ICPR 2020
Country/Territory: Italy
Location: Virtual, Online
Period: 10/01/21 to 15/01/21
