TY - JOUR
T1 - A Survey of Deep Neural Network Compression and Acceleration
AU - Ji, Rongrong
AU - Lin, Shaohui
AU - Chao, Fei
AU - Wu, Yongjian
AU - Huang, Feiyue
N1 - Publisher Copyright:
© 2018, Science Press. All rights reserved.
PY - 2018/9/1
Y1 - 2018/9/1
N2 - In recent years, deep neural networks (DNNs) have achieved remarkable success in many artificial intelligence (AI) applications, including computer vision, speech recognition, and natural language processing. However, such DNNs have been accompanied by a significant increase in computational and storage costs, which prohibits the use of DNNs in resource-limited environments such as mobile or embedded devices. To this end, the study of DNN compression and acceleration has recently attracted increasing attention. In this paper, we provide a review of existing representative DNN compression and acceleration methods, including parameter pruning, parameter sharing, low-rank decomposition, compact filter design, and knowledge distillation. Specifically, this paper provides an overview of DNNs, describes the details of different DNN compression and acceleration methods, and highlights their properties, advantages, and drawbacks. Furthermore, we summarize the evaluation criteria and datasets widely used in DNN compression and acceleration, and also discuss the performance of the representative methods. In the end, we discuss how to choose different compression and acceleration methods to meet the needs of different tasks, and envision future directions on this topic.
KW - DNN acceleration
KW - DNN compression
KW - Knowledge distillation
KW - Low-rank decomposition
KW - Parameter pruning
KW - Parameter sharing
UR - https://www.scopus.com/pages/publications/85058979765
U2 - 10.7544/issn1000-1239.2018.20180129
DO - 10.7544/issn1000-1239.2018.20180129
M3 - Literature review
AN - SCOPUS:85058979765
SN - 1000-1239
VL - 55
SP - 1871
EP - 1888
JO - Jisuanji Yanjiu yu Fazhan/Computer Research and Development
JF - Jisuanji Yanjiu yu Fazhan/Computer Research and Development
IS - 9
ER -