OPASS: Orchestrating TVM's Passes for Lowering Memory Footprints of Computation Graphs

  • Pengbo Nie
  • Zihan Wang
  • Chengcheng Wan
  • Ziyi Lin
  • He Jiang
  • Jianjun Zhao
  • Yuting Chen*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Deep learning (DL) compilers, such as TVM and TensorFlow, encompass a variety of passes for optimizing computation graphs (i.e., DL models). Despite the effort invested in developing optimization passes, arranging these passes remains a challenge: most compilers employ fixed pass sequences that do not fit computation graphs of diverse structures; moreover, optimization passes have cascade effects, making the structures of graphs under compilation volatile and thus making it difficult to generate optimal pass sequences for graphs. Inspired by recent progress on statically computing the memory footprints (i.e., memory usage) of computation graphs, we introduce in this paper OPASS, a novel approach to orchestrating TVM's optimization passes to lower the memory footprints of computation graphs, ultimately allowing the graphs to run on memory-constrained devices. The key idea is, given a computation graph G, to optimize the graph heuristically and iteratively: OPASS learns the effects of passes on the graph; it then optimizes G iteratively, where each iteration selects a pass based on the reduction it yields in G's memory footprint as well as its implicit effects on further optimizations, and applies that pass. We evaluate OPASS on Rebench (a suite of computation graphs) and two real-world models (Transformer and ResNet). The results clearly show the strength of OPASS: it outperforms TVM's default sequence by 1.77× in reducing graphs' memory footprints, with affordable costs; it also offers extra memory reductions of 5–12% by capturing the implicit effects of passes. Furthermore, OPASS helps analyze the positive/negative effects of passes on graphs' memory footprints, providing TVM developers with best practices for designing optimization pass sequences.
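The iterative selection loop the abstract describes can be sketched as a greedy search: at each step, try every candidate pass, keep the one that most reduces the estimated memory footprint, and stop when no pass helps. This is a minimal illustrative sketch only; the names (`Graph`, `estimate_footprint`, `orchestrate`) and the toy footprint model are assumptions, not OPASS's actual API, and the real system additionally weighs each pass's implicit effects on later optimizations.

```python
# Hypothetical sketch of greedy, iterative pass orchestration as described
# in the abstract. A "graph" here is a toy stand-in (a list of tensor
# sizes); real DL compilers operate on IR graphs with static footprint
# analysis. All names are illustrative assumptions.
from typing import Callable, List, Optional

Graph = List[int]  # toy model: the graph's live tensor sizes


def estimate_footprint(g: Graph) -> int:
    """Static memory-footprint estimate (here simply the sum of sizes)."""
    return sum(g)


def orchestrate(graph: Graph,
                passes: List[Callable[[Graph], Graph]],
                max_iters: int = 10) -> Graph:
    """Each iteration applies the pass yielding the largest footprint drop."""
    for _ in range(max_iters):
        best_pass: Optional[Callable[[Graph], Graph]] = None
        best_graph = graph
        best_mem = estimate_footprint(graph)
        for p in passes:
            candidate = p(graph)
            mem = estimate_footprint(candidate)
            if mem < best_mem:  # strictly better: remember this pass
                best_pass, best_graph, best_mem = p, candidate, mem
        if best_pass is None:  # no pass reduces the footprint: stop
            break
        graph = best_graph
    return graph
```

Because pass effects cascade (applying one pass changes which passes help next), the loop re-evaluates every candidate pass on the freshly transformed graph each iteration rather than committing to a fixed sequence up front.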

Original language: English
Title of host publication: Proceedings - 2024 IEEE International Conference on Software Maintenance and Evolution, ICSME 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 175-186
Number of pages: 12
ISBN (Electronic): 9798350395686
DOIs
State: Published - 2024
Event: 40th IEEE International Conference on Software Maintenance and Evolution, ICSME 2024 - Flagstaff, United States
Duration: 6 Oct 2024 – 11 Oct 2024

Publication series

Name: Proceedings - 2024 IEEE International Conference on Software Maintenance and Evolution, ICSME 2024

Conference

Conference: 40th IEEE International Conference on Software Maintenance and Evolution, ICSME 2024
Country/Territory: United States
City: Flagstaff
Period: 6/10/24 – 11/10/24

Keywords

  • DL compiler
  • memory footprint
  • optimization passes
  • orchestration
