TY - GEN
T1 - Regression Algorithm Based on Self-Distillation and Ensemble Learning
AU - Li, Yaqi
AU - Dong, Qiwen
AU - Liu, Gang
N1 - Publisher Copyright:
© 2021 ACM.
PY - 2021/12/4
Y1 - 2021/12/4
N2 - Low-dimensional feature regression is a common problem in many disciplines, such as chemistry, kinetics, and medicine. Most common solutions are based on machine learning, but as deep learning evolves, there is room for performance improvement. A few researchers have proposed deep learning-based solutions such as ResidualNet, GrowNet, and EnsembleNet. The latter two are boosting methods, which are better suited to shallow networks; their performance is largely determined by the first model, and subsequent boosting steps have limited effect. We propose a method based on self-distillation and bagging, which selects a well-performing base model and distills several student models with an appropriate regression distillation algorithm. Finally, the outputs of these student models are averaged to produce the final result. This ensemble method can be applied to any form of network. The method achieves good results on the CASP dataset, improving the model's R2 (coefficient of determination) from 0.65 to 0.70 compared with the best base model, ResidualNet.
AB - Low-dimensional feature regression is a common problem in many disciplines, such as chemistry, kinetics, and medicine. Most common solutions are based on machine learning, but as deep learning evolves, there is room for performance improvement. A few researchers have proposed deep learning-based solutions such as ResidualNet, GrowNet, and EnsembleNet. The latter two are boosting methods, which are better suited to shallow networks; their performance is largely determined by the first model, and subsequent boosting steps have limited effect. We propose a method based on self-distillation and bagging, which selects a well-performing base model and distills several student models with an appropriate regression distillation algorithm. Finally, the outputs of these student models are averaged to produce the final result. This ensemble method can be applied to any form of network. The method achieves good results on the CASP dataset, improving the model's R2 (coefficient of determination) from 0.65 to 0.70 compared with the best base model, ResidualNet.
KW - Deep Learning
KW - Ensemble Learning
KW - Knowledge distillation
KW - Regression
UR - https://www.scopus.com/pages/publications/85126396184
U2 - 10.1145/3507548.3507580
DO - 10.1145/3507548.3507580
M3 - Conference contribution
AN - SCOPUS:85126396184
T3 - ACM International Conference Proceeding Series
SP - 209
EP - 215
BT - Proceedings of 2021 5th International Conference on Computer Science and Artificial Intelligence, CSAI 2021
PB - Association for Computing Machinery
T2 - 5th International Conference on Computer Science and Artificial Intelligence, CSAI 2021
Y2 - 4 December 2021 through 6 December 2021
ER -