OPass: Orchestrating TVM's Passes for Lowering Memory Footprints of Computation Graphs
Deep learning (DL) compilers, such as TVM and TensorFlow, encompass a variety of passes for optimizing computation graphs (i.e., DL models). Despite the efforts put into developing optimization passes, arranging these passes remains a challenge: most compilers employ fixed pass sequences that do not fit computation graphs of diverse structures; moreover, optimization passes have cascading effects, which makes the structures of graphs under compilation volatile and makes it difficult to generate optimal pass sequences for the graphs.
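For illustration, the sketch below builds a fixed Relay pass pipeline with tvm.transform.Sequential, the style of one-size-fits-all sequence referred to above. The particular passes and options are our own illustrative choice, not TVM's actual default sequence.

    import tvm
    from tvm import relay

    # A fixed pipeline: the same passes run in the same order regardless of the
    # structure of the input graph. The pass list here is only an example.
    fixed_pipeline = tvm.transform.Sequential(
        [
            relay.transform.SimplifyInference(),
            relay.transform.FoldConstant(),
            relay.transform.EliminateCommonSubexpr(),
            relay.transform.FuseOps(fuse_opt_level=2),
        ],
        opt_level=3,
    )

    # Applied wholesale to any Relay IRModule `mod`, whatever its structure:
    #     with tvm.transform.PassContext(opt_level=3):
    #         mod = fixed_pipeline(mod)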
Inspired by recent progress on statically computing the memory footprints (i.e., memory usages) of computation graphs, we introduce in this paper OPass, a novel approach to orchestrating TVM’s optimization passes for lowering the memory footprints of computation graphs, ultimately allowing the graphs to run on memory-constrained devices. The key idea is, given a computation graph G, to optimize the graph heuristically and iteratively: OPass learns the effects of passes on the graph; it then optimizes G iteratively, where each iteration selects a pass based on both the reduction it brings to G's memory footprint and its implicit effects on further optimizations, and then applies the selected pass.
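The following Python sketch illustrates the kind of greedy, iterative pass selection described above; it is not OPass's implementation. The footprint estimator and the implicit-effect score are hypothetical placeholders (estimate_memory_footprint, implicit_effect_bonus), and the candidate pass list is an arbitrary sample of real Relay passes.

    import tvm
    from tvm import relay

    # Candidate passes: an arbitrary sample of real Relay passes,
    # not OPass's full search space.
    CANDIDATE_PASSES = [
        relay.transform.FoldConstant(),
        relay.transform.SimplifyExpr(),
        relay.transform.EliminateCommonSubexpr(),
        relay.transform.DeadCodeElimination(),
        relay.transform.FuseOps(),
    ]

    def estimate_memory_footprint(mod: tvm.IRModule) -> float:
        # Crude stand-in for a static memory-footprint analysis (assumed helper);
        # it only measures the size of the printed IR so the sketch runs end to end.
        return float(len(str(mod)))

    def implicit_effect_bonus(opt_pass, mod: tvm.IRModule) -> float:
        # Placeholder for the learned "implicit effect" score of a pass (assumed).
        return 0.0

    def orchestrate(mod: tvm.IRModule, max_iters: int = 20) -> tvm.IRModule:
        # Greedily apply, at each iteration, the candidate pass with the best score.
        for _ in range(max_iters):
            current = estimate_memory_footprint(mod)
            best_score, best_mod = 0.0, None
            for opt_pass in CANDIDATE_PASSES:
                with tvm.transform.PassContext(opt_level=3):
                    candidate = opt_pass(mod)  # Relay passes are callable on IRModules
                reduction = current - estimate_memory_footprint(candidate)
                score = reduction + implicit_effect_bonus(opt_pass, candidate)
                if score > best_score:
                    best_score, best_mod = score, candidate
            if best_mod is None:  # no candidate improves the estimated footprint
                break
            mod = best_mod
        return mod

A real implementation would replace the estimator with the static footprint analysis the paper builds on and learn the implicit-effect term from observed pass interactions.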
We evaluate OPass on ReBench (a suite of computation graphs) and two real-world models (Transformer and ResNet). The results clearly show the strength of OPass: it outperforms TVM’s default pass sequence by 1.77x in reducing graphs’ memory footprints, at affordable cost; it also offers extra memory reductions of 5–12% by capturing the implicit effects of passes. Furthermore, OPass helps analyze the positive and negative effects of passes on graphs’ memory footprints, providing TVM developers with best practices for designing optimization pass sequences.