ISMM 2024
Tue 25 Jun 2024 Copenhagen, Denmark
co-located with PLDI 2024
Tue 25 Jun 2024 16:20 - 16:40 at Iceland - ISMM: Session 4 - Potpourri Chair(s): Tony Hosking

Neural network training requires immense GPU memory, and memory optimization methods such as recomputation are being actively researched.

Recent improvements in recomputation have reduced peak allocated memory by more than 90 %. However, recomputation produces complex, irregular allocation patterns under which PyTorch's default caching allocator wastes up to 20 % of memory to severe fragmentation and incurs extra cache-management overhead.

The periodic allocation patterns during training make offline memory optimization possible.

Dynamic storage allocation (DSA) is the problem of minimizing the range of memory addresses needed to serve a given allocation pattern. It can be viewed as a 2D bin-packing problem in which each allocation is a rectangle whose horizontal extent (its lifetime) is fixed, so the rectangle can move only vertically along the address axis.
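To make the 2D formulation concrete, here is a minimal sketch (not the paper's implementation) of a DSA instance and a first-fit placer: each allocation is a rectangle with a fixed lifetime interval, and the placer chooses only its vertical position (the memory offset). The `Alloc` type and the toy instance are hypothetical illustrations.

```python
from dataclasses import dataclass

# Toy model of a DSA instance: each allocation is a rectangle whose
# horizontal extent [start, end) (its lifetime) is fixed; the solver only
# chooses its vertical position, i.e. the memory offset.

@dataclass
class Alloc:
    start: int  # first step at which the block is live
    end: int    # first step after the block is freed
    size: int   # bytes requested

def first_fit_offsets(allocs):
    """Place each allocation at the lowest offset that does not overlap any
    already-placed allocation whose lifetime intersects it."""
    placed = []  # (alloc, chosen offset), in placement order
    for a in allocs:
        # Occupied [offset, offset + size) ranges of lifetime-overlapping blocks.
        busy = sorted((off, off + b.size) for b, off in placed
                      if a.start < b.end and b.start < a.end)
        off = 0
        for lo, hi in busy:
            if off + a.size <= lo:
                break               # the gap below this occupied range fits
            off = max(off, hi)      # otherwise skip past it
        placed.append((a, off))
    return [off for _, off in placed]

def peak(allocs, offsets):
    """Peak address of a placement, i.e. the memory the plan requires."""
    return max(off + a.size for a, off in zip(allocs, offsets))
```

For example, three 4-byte blocks with lifetimes [0,2), [1,3), and [2,4) pack into 8 bytes: the third block's lifetime does not overlap the first's, so first-fit reuses offset 0.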

Although first-fit and best-fit heuristics already perform well for DSA, we propose a nontrivial heuristic based on simulated annealing that optimizes the topological ordering of allocations to reduce fragmentation further.
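The idea of annealing over allocation orderings can be sketched as follows. This is a hedged illustration, not the paper's algorithm: it explores arbitrary permutations (the paper restricts moves to valid topological orderings) and re-evaluates each candidate from scratch with first-fit, whereas the paper evaluates candidates incrementally. All names and the toy instance are assumptions.

```python
import math
import random

def peak_first_fit(allocs, order):
    """Score an ordering: first-fit-place the (start, end, size) allocations
    in that order and return the resulting peak address (lower is better)."""
    placed = []  # ((start, end, size), offset)
    top = 0
    for i in order:
        s, e, sz = allocs[i]
        busy = sorted((off, off + b[2]) for b, off in placed
                      if s < b[1] and b[0] < e)
        off = 0
        for lo, hi in busy:
            if off + sz <= lo:
                break               # gap below this occupied range fits
            off = max(off, hi)      # otherwise skip past it
        placed.append(((s, e, sz), off))
        top = max(top, off + sz)
    return top

def anneal(allocs, iters=2000, t0=10.0, seed=0):
    """Simulated annealing over the placement order: propose a random swap,
    accept improvements always and regressions with Metropolis probability."""
    rng = random.Random(seed)
    order = list(range(len(allocs)))
    cost = peak_first_fit(allocs, order)
    best_cost, best_order = cost, order[:]
    for k in range(iters):
        t = t0 * (1 - k / iters) + 1e-9   # linear cooling schedule
        i, j = rng.randrange(len(order)), rng.randrange(len(order))
        order[i], order[j] = order[j], order[i]
        c = peak_first_fit(allocs, order)
        if c <= cost or rng.random() < math.exp((cost - c) / t):
            cost = c
            if c < best_cost:
                best_cost, best_order = c, order[:]
        else:
            order[i], order[j] = order[j], order[i]  # revert the swap
    return best_cost, best_order
```

Because first-fit is sensitive to the order in which blocks are placed, reordering alone can close gaps that a fixed chronological order leaves behind.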

The proposed algorithm evaluates a candidate allocation plan in O(log N) amortized time, where N is the number of allocations.
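One standard way to get logarithmic-cost plan evaluation (the abstract does not describe the paper's exact data structure, so this is only an assumed illustration) is a segment tree over time steps supporting range-max queries and range-raise updates: stacking each allocation on top of everything live during its lifetime then costs O(log T) per allocation rather than a scan over all placed blocks.

```python
class SegmentTree:
    """Range-max query + range-assign update. Assignment is valid here
    because the height written is never lower than the heights it covers."""
    def __init__(self, n):
        self.n = n
        self.mx = [0] * (4 * n)
        self.lazy = [None] * (4 * n)

    def _push(self, node):
        if self.lazy[node] is not None:
            for c in (2 * node, 2 * node + 1):
                self.mx[c] = self.lazy[node]
                self.lazy[c] = self.lazy[node]
            self.lazy[node] = None

    def _assign(self, node, lo, hi, l, r, v):
        if r <= lo or hi <= l:
            return
        if l <= lo and hi <= r:
            self.mx[node], self.lazy[node] = v, v
            return
        self._push(node)
        mid = (lo + hi) // 2
        self._assign(2 * node, lo, mid, l, r, v)
        self._assign(2 * node + 1, mid, hi, l, r, v)
        self.mx[node] = max(self.mx[2 * node], self.mx[2 * node + 1])

    def _query(self, node, lo, hi, l, r):
        if r <= lo or hi <= l:
            return 0
        if l <= lo and hi <= r:
            return self.mx[node]
        self._push(node)
        mid = (lo + hi) // 2
        return max(self._query(2 * node, lo, mid, l, r),
                   self._query(2 * node + 1, mid, hi, l, r))

    def query(self, l, r):
        return self._query(1, 0, self.n, l, r)

    def assign(self, l, r, v):
        self._assign(1, 0, self.n, l, r, v)

def stack_cost(allocs, order, horizon):
    """Place (start, end, size) allocations in `order`, each on top of the
    current skyline over its lifetime, and return the peak address."""
    st = SegmentTree(horizon)
    peak = 0
    for i in order:
        s, e, sz = allocs[i]
        off = st.query(s, e)        # highest live block within [s, e)
        st.assign(s, e, off + sz)   # raise the skyline over the lifetime
        peak = max(peak, off + sz)
    return peak
```

Note that this "stack on the skyline" evaluator never reuses gaps beneath the skyline, so it is only a sketch of how logarithmic per-allocation cost can be achieved, not of the paper's fragmentation results.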

We empirically tested our algorithm on both randomly generated data and allocation patterns obtained by training popular vision and text models with recomputation.

The experiments showed that, on average, our algorithm reduced fragmentation from the 29.5 % incurred by the PyTorch caching allocator to 0.4 %, compared with the 5.3 % achieved by the first-fit method.

Tue 25 Jun

Displayed time zone: Windhoek

16:00 - 17:00
ISMM: Session 4 - Potpourri (ISMM 2024) at Iceland
Chair(s): Tony Hosking Australian National University
16:00
20m
Talk
SSRD: Shapes and Summaries for Race Detection in Concurrent Data Structures (Remote)
ISMM 2024
Xiaofan Sun University of California at Riverside, Rajiv Gupta University of California at Riverside
16:20
20m
Talk
A Heuristic for Periodic Memory Allocation with Little Fragmentation to Train Neural Networks
ISMM 2024
Akifumi Imanishi Preferred Networks, Zijian Xu Preferred Networks
16:40
20m
Talk
ESPN: Memory-Efficient Multi-vector Information Retrieval
ISMM 2024
Susav Shrestha Texas A&M University, Narasimha Reddy Texas A&M University, Zongwang Li Samsung