Abstract
In recent years, deep neural networks (DNNs) have achieved remarkable success in many artificial intelligence (AI) applications, including computer vision, speech recognition, and natural language processing. However, this success has been accompanied by a significant increase in computational and storage costs, which prohibits the use of DNNs in resource-limited environments such as mobile or embedded devices. To this end, the study of DNN compression and acceleration has recently attracted increasing attention. In this paper, we review representative DNN compression and acceleration methods, including parameter pruning, parameter sharing, low-rank decomposition, compact filter design, and knowledge distillation. Specifically, this paper provides an overview of DNNs, describes the details of the different compression and acceleration methods, and highlights their properties, advantages, and drawbacks. Furthermore, we summarize the evaluation criteria and datasets widely used in DNN compression and acceleration, and discuss the performance of the representative methods. In the end, we discuss how to choose among the different compression and acceleration methods to meet the needs of different tasks, and envision future directions on this topic.
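To make the first of the surveyed techniques concrete, the following is a minimal sketch of magnitude-based parameter pruning — zeroing out the smallest-magnitude fraction of a weight tensor. The function name `magnitude_prune` and the example weights are illustrative assumptions, not taken from the paper; real pruning pipelines typically also fine-tune the network after masking.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of entries with the smallest magnitude.

    This is a hypothetical one-shot illustration of parameter pruning;
    it returns a new array and leaves `weights` unchanged.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of entries to remove
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Illustrative 2x3 weight matrix: pruning at 50% sparsity keeps the
# three largest-magnitude entries (0.9, 0.4, -0.7) and zeroes the rest.
w = np.array([[0.9, -0.05, 0.4],
              [0.01, -0.7, 0.2]])
pruned = magnitude_prune(w, 0.5)
```

After pruning, the zeroed entries can be stored in a sparse format, which is where the storage savings discussed in the survey come from.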
| Translated title of the contribution | Deep Neural Network Compression and Acceleration: A Review |
|---|---|
| Original language | Traditional Chinese |
| Pages (from-to) | 1871-1888 |
| Number of pages | 18 |
| Journal | Jisuanji Yanjiu yu Fazhan/Computer Research and Development |
| Volume | 55 |
| Issue number | 9 |
| DOI | |
| Publication status | Published - 1 Sep 2018 |
| Externally published | Yes |
Keywords
- DNN acceleration
- DNN compression
- Knowledge distillation
- Low-rank decomposition
- Parameter pruning
- Parameter sharing
Fingerprint
Explore the research topics of '深度神经网络压缩与加速综述' (Deep Neural Network Compression and Acceleration: A Review).