A Heuristic for Periodic Memory Allocation with Little Fragmentation to Train Neural Networks
Neural network training requires immense GPU memory, and memory optimization methods such as recomputation are being actively researched.
Recent improvements in recomputation have reduced the peak allocated memory size by more than 90 %. However, because recomputation produces complex, irregular allocation patterns, PyTorch's default caching allocator suffers severe fragmentation, wasting up to 20 % of memory, and incurs increased cache-management overhead.
The periodic allocation patterns that arise during training make offline memory optimization possible.
Dynamic storage allocation (DSA) is the problem of minimizing the range of memory addresses needed to satisfy a given allocation pattern. It can be formulated as a two-dimensional bin-packing problem in which each allocation is a rectangle, with its lifetime on the horizontal axis and its size on the vertical axis, that can move only vertically (along the address axis).
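The 2D formulation above can be made concrete with a small sketch. This is an illustration of the problem statement, not code from the paper; the `Alloc` class and helper names are hypothetical. Each buffer's lifetime and size are fixed by the training trace; only its offset (vertical position) is free, and the objective is to minimize the peak address.

```python
from dataclasses import dataclass

@dataclass
class Alloc:
    alloc_t: int     # time step the buffer is allocated (fixed)
    free_t: int      # time step the buffer is freed (fixed)
    size: int        # bytes; the rectangle's height (fixed)
    offset: int = 0  # starting address; the only movable coordinate

def lifetimes_overlap(a: Alloc, b: Alloc) -> bool:
    # Horizontal (time) extents intersect.
    return a.alloc_t < b.free_t and b.alloc_t < a.free_t

def addresses_overlap(a: Alloc, b: Alloc) -> bool:
    # Vertical (address) extents intersect.
    return a.offset < b.offset + b.size and b.offset < a.offset + a.size

def is_valid(plan: list[Alloc]) -> bool:
    # Two buffers may share addresses only if their lifetimes are disjoint.
    return not any(
        lifetimes_overlap(a, b) and addresses_overlap(a, b)
        for i, a in enumerate(plan) for b in plan[i + 1:]
    )

def peak(plan: list[Alloc]) -> int:
    # The quantity DSA minimizes: the highest address any buffer reaches.
    return max(a.offset + a.size for a in plan)
```

For example, two 4-byte buffers with overlapping lifetimes must be stacked (peak 8), whereas buffers with disjoint lifetimes can legally reuse the same offsets.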
Although simple first-fit and best-fit heuristics already perform well for DSA, we propose a non-trivial heuristic based on simulated annealing that optimizes the topological ordering of allocations to further reduce fragmentation.
The proposed algorithm evaluates a candidate allocation plan in O(log N) amortized time, where N is the number of allocations.
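A minimal sketch of the overall idea, under stated assumptions: first-fit places each buffer, in a chosen order, at the lowest offset that does not collide with any already-placed buffer whose lifetime overlaps, and simulated annealing then searches over orderings. This is not the paper's algorithm: each candidate here is evaluated by re-running first-fit from scratch (quadratic in N), whereas the paper evaluates a candidate incrementally in O(log N) amortized time. All function names and parameters are illustrative.

```python
import math
import random

def first_fit_peak(order, lifetimes, sizes):
    """Place buffers in the given order with first-fit; return the peak address."""
    placed = []  # (offset, size, alloc_t, free_t)
    peak = 0
    for i in order:
        a0, a1 = lifetimes[i]
        s = sizes[i]
        # Occupied address intervals of lifetime-overlapping buffers, low to high.
        busy = sorted((off, off + sz) for off, sz, b0, b1 in placed
                      if a0 < b1 and b0 < a1)
        off = 0
        for lo, hi in busy:
            if off + s <= lo:
                break            # found a gap large enough
            off = max(off, hi)   # otherwise jump past this interval
        placed.append((off, s, a0, a1))
        peak = max(peak, off + s)
    return peak

def anneal(lifetimes, sizes, iters=2000, t0=1.0, seed=0):
    """Simulated annealing over allocation orderings (naive re-evaluation)."""
    rng = random.Random(seed)
    n = len(sizes)
    order = list(range(n))
    best = cur = first_fit_peak(order, lifetimes, sizes)
    best_order = order[:]
    for k in range(iters):
        t = t0 * (1 - k / iters) + 1e-9          # linear cooling schedule
        i, j = rng.randrange(n), rng.randrange(n)
        order[i], order[j] = order[j], order[i]  # propose a swap
        cand = first_fit_peak(order, lifetimes, sizes)
        if cand <= cur or rng.random() < math.exp((cur - cand) / (t * max(best, 1))):
            cur = cand                           # accept (always if not worse)
            if cand < best:
                best, best_order = cand, order[:]
        else:
            order[i], order[j] = order[j], order[i]  # reject: revert the swap
    return best, best_order
```

On a toy trace of three 4-byte buffers where only consecutive pairs are live together, the optimum peak is 8 bytes: the first and third buffers can share offset 0 because their lifetimes are disjoint.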
We empirically tested our algorithm on both randomly generated data and allocation patterns obtained by training popular vision and text models with recomputation.
The experiments showed that, on average, our algorithm reduced fragmentation caused by the PyTorch caching allocator from 29.5 % to 0.4 %, compared to 5.3 % by the first-fit method.
Presented at ISMM 2024, Tue 25 Jun, 16:20 (Session 4: Potpourri; chair: Tony Hosking, Australian National University).