TY - GEN
T1 - Automatic Hyper-Parameter Search for Vision Transformer Pruning
AU - Feng, Jun
AU - Zhao, Shuai
AU - Peng, Liangying
AU - Pan, Sichen
AU - Chen, Hao
AU - Li, Zhongxu
AU - Ke, Gongwu
AU - Wang, Gaoli
AU - Long, Youqun
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - In recent years, the high computational cost of the popular Vision Transformer (ViT) has made it difficult to deploy on lightweight devices. As a result, many pruning techniques have been developed to reduce the size and complexity of ViT models. However, most of these techniques prune the model as a whole, without considering the differences among its internal modules; specifically, they apply a uniform pruning ratio to all modules. In our work, we observe that using different pruning ratios for the Multi-Head Self-Attention (MHSA) and Feed-Forward Network (FFN) modules can improve the compression performance of the ViT. Accordingly, we propose a new compression algorithm that applies distinct pruning ratios to these modules and automatically searches for the optimal pruning-ratio parameters. To further enhance the precision of this algorithm, we introduce an improved approach that employs iterative pruning and binary search strategies to identify the optimal parameters at a finer granularity, thereby minimizing the model's accuracy loss during pruning. We evaluated the effectiveness of our approach on two commonly used datasets, CIFAR-10 and Mini-ImageNet, comparing it to the state-of-the-art (SOTA) method CP-ViT, which uses a fixed pruning ratio. With nearly the same pruned-model accuracy, our method achieved a significant reduction in FLOPs, reaching 56.91% of the FLOPs of the fixed-pruning-ratio method on CIFAR-10. These results demonstrate that our method can be more effective in reducing model complexity while maintaining accuracy.
AB - In recent years, the high computational cost of the popular Vision Transformer (ViT) has made it difficult to deploy on lightweight devices. As a result, many pruning techniques have been developed to reduce the size and complexity of ViT models. However, most of these techniques prune the model as a whole, without considering the differences among its internal modules; specifically, they apply a uniform pruning ratio to all modules. In our work, we observe that using different pruning ratios for the Multi-Head Self-Attention (MHSA) and Feed-Forward Network (FFN) modules can improve the compression performance of the ViT. Accordingly, we propose a new compression algorithm that applies distinct pruning ratios to these modules and automatically searches for the optimal pruning-ratio parameters. To further enhance the precision of this algorithm, we introduce an improved approach that employs iterative pruning and binary search strategies to identify the optimal parameters at a finer granularity, thereby minimizing the model's accuracy loss during pruning. We evaluated the effectiveness of our approach on two commonly used datasets, CIFAR-10 and Mini-ImageNet, comparing it to the state-of-the-art (SOTA) method CP-ViT, which uses a fixed pruning ratio. With nearly the same pruned-model accuracy, our method achieved a significant reduction in FLOPs, reaching 56.91% of the FLOPs of the fixed-pruning-ratio method on CIFAR-10. These results demonstrate that our method can be more effective in reducing model complexity while maintaining accuracy.
KW - Lightweight device
KW - Model Pruning
KW - Model deployment
KW - Search Ratio
KW - Vision Transformer
UR - https://www.scopus.com/pages/publications/85180542909
U2 - 10.1109/PRAI59366.2023.10332058
DO - 10.1109/PRAI59366.2023.10332058
M3 - Conference contribution
AN - SCOPUS:85180542909
T3 - 2023 IEEE 6th International Conference on Pattern Recognition and Artificial Intelligence, PRAI 2023
SP - 606
EP - 611
BT - 2023 IEEE 6th International Conference on Pattern Recognition and Artificial Intelligence, PRAI 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 6th IEEE International Conference on Pattern Recognition and Artificial Intelligence, PRAI 2023
Y2 - 18 August 2023 through 20 August 2023
ER -