TY - GEN
T1 - OPASS
T2 - 40th IEEE International Conference on Software Maintenance and Evolution, ICSME 2024
AU - Nie, Pengbo
AU - Wang, Zihan
AU - Wan, Chengcheng
AU - Lin, Ziyi
AU - Jiang, He
AU - Zhao, Jianjun
AU - Chen, Yuting
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Deep learning (DL) compilers, such as TVM and TensorFlow, encompass a variety of passes for optimizing computation graphs (i.e., DL models). Despite the efforts on developing optimization passes, arranging these passes remains a challenge: most compilers employ fixed pass sequences that do not fit computation graphs of diverse structures; moreover, optimization passes have cascade effects, making the structures of graphs under compilation volatile and thus making it difficult to generate optimal sequences for graphs. Inspired by recent progress on statically computing the memory footprints (i.e., memory usages) of computation graphs, we introduce in this paper OPASS, a novel approach to orchestrating TVM's optimization passes for lowering the memory footprints of computation graphs, ultimately allowing the graphs to run on memory-constrained devices. The key idea is, given a computation graph G, to optimize the graph heuristically and iteratively: OPASS learns the effects of passes on the graph; it then optimizes G iteratively, where each iteration selects a pass based on the reduction it yields in G's memory footprint as well as its implicit effects on further optimizations, and applies that pass. We evaluate OPASS on Rebench (a suite of computation graphs) and two real-world models (Transformer and ResNet). The results clearly show the strength of OPASS: it outperforms TVM's default sequence by 1.77× in reducing graphs' memory footprints, with affordable costs; it also offers extra memory reductions of 5∼12% by capturing the implicit effects of passes. Furthermore, OPASS helps analyze the positive/negative effects of passes on graphs' memory footprints, providing TVM developers with best practices for designing optimization pass sequences.
AB - Deep learning (DL) compilers, such as TVM and TensorFlow, encompass a variety of passes for optimizing computation graphs (i.e., DL models). Despite the efforts on developing optimization passes, arranging these passes remains a challenge: most compilers employ fixed pass sequences that do not fit computation graphs of diverse structures; moreover, optimization passes have cascade effects, making the structures of graphs under compilation volatile and thus making it difficult to generate optimal sequences for graphs. Inspired by recent progress on statically computing the memory footprints (i.e., memory usages) of computation graphs, we introduce in this paper OPASS, a novel approach to orchestrating TVM's optimization passes for lowering the memory footprints of computation graphs, ultimately allowing the graphs to run on memory-constrained devices. The key idea is, given a computation graph G, to optimize the graph heuristically and iteratively: OPASS learns the effects of passes on the graph; it then optimizes G iteratively, where each iteration selects a pass based on the reduction it yields in G's memory footprint as well as its implicit effects on further optimizations, and applies that pass. We evaluate OPASS on Rebench (a suite of computation graphs) and two real-world models (Transformer and ResNet). The results clearly show the strength of OPASS: it outperforms TVM's default sequence by 1.77× in reducing graphs' memory footprints, with affordable costs; it also offers extra memory reductions of 5∼12% by capturing the implicit effects of passes. Furthermore, OPASS helps analyze the positive/negative effects of passes on graphs' memory footprints, providing TVM developers with best practices for designing optimization pass sequences.
KW - DL compiler
KW - memory footprint
KW - optimization passes
KW - orchestration
UR - https://www.scopus.com/pages/publications/85215537490
U2 - 10.1109/ICSME58944.2024.00026
DO - 10.1109/ICSME58944.2024.00026
M3 - Conference contribution
AN - SCOPUS:85215537490
T3 - Proceedings - 2024 IEEE International Conference on Software Maintenance and Evolution, ICSME 2024
SP - 175
EP - 186
BT - Proceedings - 2024 IEEE International Conference on Software Maintenance and Evolution, ICSME 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 6 October 2024 through 11 October 2024
ER -