TY - JOUR
T1 - Unleashing Network/Accelerator Co-Exploration Potential on FPGAs
T2 - A Deeper Joint Search
AU - Lou, Wenqi
AU - Gong, Lei
AU - Wang, Chao
AU - Qian, Jiaming
AU - Wang, Xuan
AU - Li, Changlong
AU - Zhou, Xuehai
N1 - Publisher Copyright: © 1982-2012 IEEE.
PY - 2024
Y1 - 2024
N2 - Recently, algorithm-hardware (HW) co-exploration for neural networks (NNs) has become the key to obtaining high-quality solutions. However, previous efforts for field-programmable gate arrays (FPGAs) focus on neural architecture search (NAS) while lacking HW architecture search (HAS), thus limiting the full potential of co-design. Although expanding the scope of HAS offers performance potential, the exponentially increased joint search space presents a formidable challenge. To address this, we propose a deep and efficient framework NAF, which jointly searches for Networks and Accelerators for FPGAs in a balanced co-search space. First, we adjust the NAS space and then introduce a block-level bitwidth search on the software side. Meanwhile, we design a HW-friendly quantization algorithm to facilitate HW efficiency and accuracy. Second, we design a dataflow-configurable HW unit with computation and memory access optimizations for quantized multiplication. Based on this, we incorporate critical heterogeneous multicore architecture exploration on the HW side. Third, to enable rapid HW feedback in the enlarged HAS space, we perform resource and performance modeling and design a fast HW generation algorithm based on the genetic algorithm. Specifically, we apply optimization techniques, like mapping space pruning, greedy bandwidth allocation, and coarse-grained search, to speed up this process. We validate NAF in edge and cloud scenarios. Experimental results show that NAF efficiently explores a significantly larger joint space and provides high-quality solutions. Compared with previous state-of-the-art co-design works, the searched convolutional neural network-accelerator pairs improve the throughput by 2.07× ∼ 7.10× and energy efficiency by 1.41× ∼ 2.27× under similar accuracy on the ImageNet dataset.
AB - Recently, algorithm-hardware (HW) co-exploration for neural networks (NNs) has become the key to obtaining high-quality solutions. However, previous efforts for field-programmable gate arrays (FPGAs) focus on neural architecture search (NAS) while lacking HW architecture search (HAS), thus limiting the full potential of co-design. Although expanding the scope of HAS offers performance potential, the exponentially increased joint search space presents a formidable challenge. To address this, we propose a deep and efficient framework NAF, which jointly searches for Networks and Accelerators for FPGAs in a balanced co-search space. First, we adjust the NAS space and then introduce a block-level bitwidth search on the software side. Meanwhile, we design a HW-friendly quantization algorithm to facilitate HW efficiency and accuracy. Second, we design a dataflow-configurable HW unit with computation and memory access optimizations for quantized multiplication. Based on this, we incorporate critical heterogeneous multicore architecture exploration on the HW side. Third, to enable rapid HW feedback in the enlarged HAS space, we perform resource and performance modeling and design a fast HW generation algorithm based on the genetic algorithm. Specifically, we apply optimization techniques, like mapping space pruning, greedy bandwidth allocation, and coarse-grained search, to speed up this process. We validate NAF in edge and cloud scenarios. Experimental results show that NAF efficiently explores a significantly larger joint space and provides high-quality solutions. Compared with previous state-of-the-art co-design works, the searched convolutional neural network-accelerator pairs improve the throughput by 2.07× ∼ 7.10× and energy efficiency by 1.41× ∼ 2.27× under similar accuracy on the ImageNet dataset.
KW - Convolutional neural network (CNN)
KW - field-programmable gate array (FPGA)
KW - software-hardware (HW) co-exploration
UR - https://www.scopus.com/pages/publications/85190828826
U2 - 10.1109/TCAD.2024.3391688
DO - 10.1109/TCAD.2024.3391688
M3 - Article
AN - SCOPUS:85190828826
SN - 0278-0070
VL - 43
SP - 3041
EP - 3054
JO - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
JF - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
IS - 10
ER -