Sun 23 Jun 2019 09:45 - 10:10 at 106A - Scaling Up

Deep learning models are growing too large to fit in the limited memory of accelerators such as GPUs during training. Though many methods have been proposed to solve this problem, they are rather ad hoc in nature and difficult to extend and integrate with other techniques. In this paper, we tackle the problem in a formal way to provide a strong foundation for supporting large models. We propose a method of formally rewriting the computational graph of a model, in which swap-out and swap-in operations are inserted to temporarily store intermediate results in CPU memory. By introducing a categorized topological ordering for simulating graph execution, the memory consumption of a model can be easily analyzed using operation distances in the ordering. As a result, the problem of fitting a large model into a memory-limited accelerator is reduced to the problem of reducing operation distances in a categorized topological ordering. We then show how to formally derive swap-out and swap-in operations from an existing graph and present rules to optimize the graph. Finally, we propose simulation-based auto-tuning to automatically find suitable graph-rewriting parameters for the best performance. We developed a module in TensorFlow, called LMS, with which we successfully trained ResNet-50 with a 4.9x larger mini-batch size and 3D U-Net with a 5.6x larger image resolution.
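
The graph-rewriting idea in the abstract can be illustrated with a small, self-contained Python sketch. It is not the LMS module and uses no TensorFlow API. It assumes that an operation's position in the ordering is its longest-path level from the graph sources, that the "operation distance" of an edge is the difference between the consumer's and the producer's level, and that an edge is rewritten when this distance exceeds a threshold. The names `topological_levels`, `insert_swaps`, `threshold`, and the toy graph are all hypothetical.

```python
from collections import defaultdict, deque


def topological_levels(edges):
    """Give each op a level: 0 for sources, otherwise 1 + max level of its inputs.

    Assumption: this longest-path layering stands in for the paper's
    categorized topological ordering; the paper's definition may differ.
    """
    consumers = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for src, dst in edges:
        consumers[src].append(dst)
        indegree[dst] += 1
        nodes.update((src, dst))

    level = {n: 0 for n in nodes}
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    processed = 0
    while queue:
        node = queue.popleft()
        processed += 1
        for nxt in consumers[node]:
            level[nxt] = max(level[nxt], level[node] + 1)
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if processed != len(nodes):
        raise ValueError("graph contains a cycle")
    return level


def insert_swaps(edges, threshold):
    """Route every edge whose level distance exceeds `threshold` through
    a swap-out node (shared per producer) and a swap-in node (per consumer)."""
    level = topological_levels(edges)
    rewritten = []
    for src, dst in edges:
        if level[dst] - level[src] > threshold:
            out_op = f"swap_out({src})"
            in_op = f"swap_in({src}->{dst})"
            rewritten += [(src, out_op), (out_op, in_op), (in_op, dst)]
        else:
            rewritten.append((src, dst))
    # Drop duplicate producer -> swap_out edges while keeping order.
    return list(dict.fromkeys(rewritten))


# Toy training graph (hypothetical): three forward ops, a loss, and three
# backward ops, each of which reuses the output of its forward op.
EDGES = [
    ("f1", "f2"), ("f2", "f3"), ("f3", "loss"),
    ("loss", "g3"), ("g3", "g2"), ("g2", "g1"),
    ("f1", "g1"), ("f2", "g2"), ("f3", "g3"),
]

levels = topological_levels(EDGES)
rewritten = insert_swaps(EDGES, threshold=3)
for edge in rewritten:
    print(edge)
```

In this toy graph the outputs of `f1` and `f2` are consumed far from where they are produced, so those two edges are routed through swap nodes, while the short `f3` to `g3` edge is left untouched. The `threshold` value plays the role of a graph-rewriting parameter of the kind the abstract says is chosen by simulation-based auto-tuning; here it is simply fixed by hand.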

Sun 23 Jun
Times are displayed in time zone: (GMT-07:00) Tijuana, Baja California

09:00 - 11:00: ISMM 2019 - Scaling Up at 106A
09:00 - 09:05
Day opening
Harry Xu, University of California, Los Angeles (UCLA); Jeremy Singer, University of Glasgow
09:05 - 09:45
09:45 - 10:10
Tung D. Le, IBM Research - Tokyo; Haruki Imai, IBM Research - Tokyo; Yasushi Negishi, IBM Research - Tokyo; Kiyokuni Kawachiya, IBM Research - Tokyo
10:10 - 10:35
Matthias Springer, Tokyo Institute of Technology; Hidehiko Masuhara, Tokyo Institute of Technology
10:35 - 11:00
Michihiro Horie, IBM Research - Tokyo; Kazunori Ogata, IBM Research, Japan; Mikio Takeuchi, IBM Research - Tokyo; Hiroshi Horii, IBM Research, Japan