TY - GEN
T1 - FPGA-accelerated for constrained high dispersal network
AU - Chen, Yanliang
AU - Zhu, Minghua
AU - Xiao, Bo
AU - Meng, Dan
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2018/5/25
Y1 - 2018/5/25
N2 - In recent years, deep neural networks (DNNs) have been used successfully in image classification. Most existing DNNs must learn a very large set of parameters, and training these parameters with gradient descent and back-propagation requires a huge amount of computational resources and time. To address this issue, PCANet was developed for highly efficient design and training of DNNs. Compared with traditional DNNs, PCANet has a simpler structure and better performance, which makes it attractive for hardware design. To overcome the limitations of PCANet and significantly improve its performance, we have proposed a novel model named Constrained High Dispersal Network (CHDNet), which is a variant of PCANet. In this paper, we implement CHDNet on the Xilinx ZYNQ FPGA, taking advantage of the algorithmic parallelism and the ZYNQ architecture to achieve real-time operation at lower power than a personal computer requires. Our experimental results on two major datasets, the MNIST dataset for handwritten digit recognition and the Extended Yale B dataset for face recognition, demonstrate that our FPGA implementation is more than 15x faster than a software implementation on a PC (Intel i7-4720HQ, 2.6 GHz).
AB - In recent years, deep neural networks (DNNs) have been used successfully in image classification. Most existing DNNs must learn a very large set of parameters, and training these parameters with gradient descent and back-propagation requires a huge amount of computational resources and time. To address this issue, PCANet was developed for highly efficient design and training of DNNs. Compared with traditional DNNs, PCANet has a simpler structure and better performance, which makes it attractive for hardware design. To overcome the limitations of PCANet and significantly improve its performance, we have proposed a novel model named Constrained High Dispersal Network (CHDNet), which is a variant of PCANet. In this paper, we implement CHDNet on the Xilinx ZYNQ FPGA, taking advantage of the algorithmic parallelism and the ZYNQ architecture to achieve real-time operation at lower power than a personal computer requires. Our experimental results on two major datasets, the MNIST dataset for handwritten digit recognition and the Extended Yale B dataset for face recognition, demonstrate that our FPGA implementation is more than 15x faster than a software implementation on a PC (Intel i7-4720HQ, 2.6 GHz).
KW - Deep neural network
KW - FPGA
KW - High-level synthesis
KW - Image classification
UR - https://www.scopus.com/pages/publications/85048371626
U2 - 10.1109/ISPA/IUCC.2017.00128
DO - 10.1109/ISPA/IUCC.2017.00128
M3 - Conference contribution
AN - SCOPUS:85048371626
T3 - Proceedings - 15th IEEE International Symposium on Parallel and Distributed Processing with Applications and 16th IEEE International Conference on Ubiquitous Computing and Communications, ISPA/IUCC 2017
SP - 840
EP - 845
BT - Proceedings - 15th IEEE International Symposium on Parallel and Distributed Processing with Applications and 16th IEEE International Conference on Ubiquitous Computing and Communications, ISPA/IUCC 2017
A2 - Martinez, Gregorio
A2 - Hill, Richard
A2 - Fox, Geoffrey
A2 - Mueller, Peter
A2 - Wang, Guojun
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 15th IEEE International Symposium on Parallel and Distributed Processing with Applications and 16th IEEE International Conference on Ubiquitous Computing and Communications, ISPA/IUCC 2017
Y2 - 12 December 2017 through 15 December 2017
ER -