
Divideup: A Generic Improvement Approach for Supervised Learning Using Dataset Partition with Finer Semantical Information

  • Xinyue Zhang
  • Qin Li*
  • *Corresponding author for this work
  • East China Normal University
  • Tongji University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

Abstract

Supervised learning technologies represented by neural networks have made great progress in many fields. In applications such as image recognition and natural language processing, the entire computing process is handed over to the machine learning algorithm, which directly learns the mapping from the feature space to the expected output without taking much of the semantical information and domain knowledge of the data into account. In this paper, we propose a generic data refinement approach called divideup, which incorporates finer semantical information into the dataset to obtain a prediction model that captures more detailed information in the training data. Based on an information-theoretic argument, we have high confidence that a model trained on the refined dataset achieves better prediction accuracy than one trained on the original dataset. We conduct extensive experiments on different datasets with state-of-the-art neural network architectures such as ResNet and DenseNet. The experimental results show that divideup improves the prediction accuracy of all these deep learning architectures on the original test set. The divideup approach is also applied to other machine learning models such as random forest, XGBoost and SVM. The results support the conclusion that the refined training data obtained by divideup yields better prediction accuracy for the learned model.
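The idea described in the abstract can be illustrated with a minimal sketch. It assumes one plausible reading of the approach: each original (coarse) class is split into sub-classes using a finer semantic tag, a model is trained on the sub-class labels, and predictions are mapped back to the original classes for evaluation. The function names, the toy one-dimensional data, and the nearest-centroid classifier are all hypothetical stand-ins chosen for brevity; they are not the paper's models or datasets.

```python
from collections import defaultdict


def refine_labels(coarse_labels, fine_tags):
    """Combine each coarse label with a finer semantic tag (hypothetical scheme)."""
    return [f"{c}/{t}" for c, t in zip(coarse_labels, fine_tags)]


def to_coarse(label):
    """Map a refined label such as 'A/left' back to its original class 'A'."""
    return label.split("/", 1)[0]


def fit_centroids(points, labels):
    """Nearest-centroid 'training': average the points of each label."""
    sums, counts = {}, defaultdict(int)
    for p, y in zip(points, labels):
        if y not in sums:
            sums[y] = [0.0] * len(p)
        for i, v in enumerate(p):
            sums[y][i] += v
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, point):
    """Return the label whose centroid is closest to the point."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(c, point))
    return min(centroids, key=lambda y: sqdist(centroids[y]))


def accuracy(centroids, test_set):
    """Accuracy measured on the ORIGINAL (coarse) labels."""
    hits = sum(to_coarse(predict(centroids, x)) == y for x, y in test_set)
    return hits / len(test_set)


# Toy data (assumed): class A has two separate clusters, class B sits between them.
points = [(0.0,), (1.0,), (9.0,), (10.0,), (4.0,), (6.0,)]
coarse = ["A", "A", "A", "A", "B", "B"]
tags = ["left", "left", "right", "right", "mid", "mid"]
test_set = [((0.2,), "A"), ((9.8,), "A"), ((5.2,), "B")]

coarse_model = fit_centroids(points, coarse)
fine_model = fit_centroids(points, refine_labels(coarse, tags))

coarse_acc = accuracy(coarse_model, test_set)
fine_acc = accuracy(fine_model, test_set)
print(f"coarse labels: {coarse_acc:.2f}, refined labels: {fine_acc:.2f}")
```

In this toy setting the coarse centroid of class A falls on top of class B, so the coarse model cannot separate them, while the refined labels give each cluster its own centroid. This only shows why finer labels can help a simple model; it says nothing about the size of the gains reported in the paper.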

Original language: English
Title of host publication: 2020 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 635-641
Number of pages: 7
ISBN (electronic): 9781728185262
DOI
Publication status: Published - 11 Oct 2020
Event: 2020 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2020 - Toronto, Canada
Duration: 11 Oct 2020 to 14 Oct 2020

Publication series

Name: Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
Volume: 2020-October
ISSN (print): 1062-922X

Conference

Conference: 2020 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2020
Country/Territory: Canada
City: Toronto
Period: 11/10/20 to 14/10/20
