TY - GEN
T1 - Estimation-Based Strategy Generation for Deep Neural Network Model Compression
AU - Wang, Hongkai
AU - Feng, Jun
AU - Zhao, Shuai
AU - Wang, Yidan
AU - Mao, Dong
AU - Chen, Zuge
AU - Ke, Gongwu
AU - Wang, Gaoli
AU - Long, Youqun
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Compressing a neural network can significantly reduce its computational complexity, save resources and speed up inference. However, current compression methods, whether used individually or in combination, often neglect the issue of compression strategy generation, making it challenging to obtain compressed models that meet the user's deployment requirements with the smallest accuracy degradation. This paper proposes a method for automatically generating compression strategies, aiming to achieve high-performance models that meet deployment requirements with minimal accuracy degradation. First, we design a predictor to estimate the compression performance of the model if it is compressed by different compression methods such as distillation, pruning and quantization; this includes estimating the model size, number of parameters, computational complexity, and memory access of the compressed model. Then, a computational method for estimating the inference time of the compressed model is discussed. Based on the estimated results, user requirements and hardware parameters, a method for automatically generating compression strategies is designed, which outputs suitable combinations of compression methods and compression parameter settings. Experimental results on commonly used convolutional neural networks and the Jetson Nano development board validate the effectiveness of the proposed method.
AB - Compressing a neural network can significantly reduce its computational complexity, save resources and speed up inference. However, current compression methods, whether used individually or in combination, often neglect the issue of compression strategy generation, making it challenging to obtain compressed models that meet the user's deployment requirements with the smallest accuracy degradation. This paper proposes a method for automatically generating compression strategies, aiming to achieve high-performance models that meet deployment requirements with minimal accuracy degradation. First, we design a predictor to estimate the compression performance of the model if it is compressed by different compression methods such as distillation, pruning and quantization; this includes estimating the model size, number of parameters, computational complexity, and memory access of the compressed model. Then, a computational method for estimating the inference time of the compressed model is discussed. Based on the estimated results, user requirements and hardware parameters, a method for automatically generating compression strategies is designed, which outputs suitable combinations of compression methods and compression parameter settings. Experimental results on commonly used convolutional neural networks and the Jetson Nano development board validate the effectiveness of the proposed method.
KW - Compression Strategy Generation
KW - Computational Time Estimation
KW - Deploy Requirement
KW - Model Compression
UR - https://www.scopus.com/pages/publications/85180540167
U2 - 10.1109/PRAI59366.2023.10331943
DO - 10.1109/PRAI59366.2023.10331943
M3 - Conference contribution
AN - SCOPUS:85180540167
T3 - 2023 IEEE 6th International Conference on Pattern Recognition and Artificial Intelligence, PRAI 2023
SP - 1009
EP - 1015
BT - 2023 IEEE 6th International Conference on Pattern Recognition and Artificial Intelligence, PRAI 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 6th IEEE International Conference on Pattern Recognition and Artificial Intelligence, PRAI 2023
Y2 - 18 August 2023 through 20 August 2023
ER -