TY - GEN
T1 - Divideup
T2 - 2020 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2020
AU - Zhang, Xinyue
AU - Li, Qin
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/10/11
Y1 - 2020/10/11
N2 - Supervised learning technologies, represented by neural networks, have made great progress in many fields. In applications such as image recognition and natural language processing, the entire computation is handed over to the machine learning algorithm, which directly learns the mapping from the feature space to the expected output with little regard for the semantic information and domain knowledge of the data. In this paper, we propose a generic data refinement approach called divideup, which incorporates finer semantic information into the dataset to obtain a prediction model that captures more detailed information in the training data. Based on an information-theoretic argument, we have high confidence that a model trained with the refined dataset achieves better prediction accuracy than one trained with the original dataset. We conduct extensive experiments on different datasets with state-of-the-art neural network architectures such as ResNet and DenseNet. The experimental results show that divideup improves the prediction accuracy of all these deep learning architectures on the original test set. The divideup approach is also applied to other machine learning models such as random forest, XGBoost, and SVM. The results support the conclusion that the refined training data obtained by divideup improves the prediction accuracy of the learned model.
UR - https://www.scopus.com/pages/publications/85098858751
U2 - 10.1109/SMC42975.2020.9283425
DO - 10.1109/SMC42975.2020.9283425
M3 - Conference contribution
AN - SCOPUS:85098858751
T3 - Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
SP - 635
EP - 641
BT - 2020 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2020
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 11 October 2020 through 14 October 2020
ER -