ISMM 2025
Tue 17 Jun 2025 Seoul, South Korea
co-located with PLDI 2025
Tue 17 Jun 2025 16:00 - 16:20 at Lilac - Session 4: 1540-1705 [Systems and Architecture] Chair(s): Steve Blackburn

Deep neural networks (DNNs) are among the most popular models for learning relationships in complex data. Training a DNN model is a compute- and memory-intensive operation. The size of modern DNN models spans into the terabyte range, requiring multiple accelerators to train and driving up the training cost. Such enormous memory requirements shift the focus toward memory rather than computation.

CPU memory, on the other hand, can be scaled to several terabytes with emerging memory technologies such as HBM and CXL-attached memories. Furthermore, recent CPU advancements, such as dedicated instructions for DNN training and inference, are bridging the compute gap between CPUs and accelerators.

This exploratory work moves toward cost-effective DNN training on CPUs, aiming to alleviate the memory management challenges of DNN training. We propose TierTrain, a novel memory tiering solution based on a dynamic queuing system that leverages the periodic and deterministic memory access behavior of DNN training to manage data placement across memory tiers. TierTrain proactively manages tensors by aggressively offloading them to slow memory tiers (NVMM, CXL) and prefetching them back to fast memory tiers (HBM, DRAM) in time for reuse. Our evaluation of TierTrain on a tiered memory system, with real CXL-attached memory used for memory expansion and NVMM as low-cost memory, reduces the average fast memory footprint by 59–83% and the peak fast memory footprint by 25–74%, with a performance overhead of 1–16%. In a memory-constrained scenario, TierTrain outperforms state-of-the-art tiering, improving performance by 35–84% on a set of popular DNN training models.
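The following is a minimal Python sketch of the queue-based proactive offload/prefetch idea the abstract describes. It is illustrative only: the names (Tier, TensorRecord, TierScheduler, on_forward, on_backward_step) and the two-tier model are assumptions for exposition, not the paper's actual API, and real tensor migration between HBM/DRAM and NVMM/CXL is abstracted to a tier label.

# Minimal, illustrative sketch (assumed names; not TierTrain's real API).
from collections import deque
from dataclasses import dataclass
from enum import Enum

class Tier(Enum):
    FAST = "HBM/DRAM"   # fast memory tiers
    SLOW = "NVMM/CXL"   # slow, high-capacity memory tiers

@dataclass
class TensorRecord:
    name: str
    size_mb: float
    tier: Tier = Tier.FAST

class TierScheduler:
    """Queue-based scheduler exploiting the periodic, deterministic access
    pattern of DNN training: activations produced early in the forward pass
    are needed last in the backward pass."""

    def __init__(self) -> None:
        self.offload_queue: deque[TensorRecord] = deque()

    def on_forward(self, t: TensorRecord) -> None:
        # Aggressively offload each activation once the forward pass is done
        # with it; earliest-produced tensors sit at the front of the queue.
        t.tier = Tier.SLOW
        self.offload_queue.append(t)

    def on_backward_step(self) -> None:
        # The backward pass consumes tensors in reverse production order, so
        # prefetch from the back of the queue just before each is needed.
        if self.offload_queue:
            t = self.offload_queue.pop()
            t.tier = Tier.FAST

# Usage: simulate one training iteration over three layers.
sched = TierScheduler()
acts = [TensorRecord(f"act{i}", 256.0) for i in range(3)]
for a in acts:                 # forward pass
    sched.on_forward(a)
for _ in acts:                 # backward pass
    sched.on_backward_step()
print([a.tier.name for a in acts])  # ['FAST', 'FAST', 'FAST']

The deque captures the key property the abstract highlights: tensors produced earliest in the forward pass are consumed last in the backward pass, so they can be offloaded first and stay in slow memory the longest.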

Tue 17 Jun

Displayed time zone: Seoul

15:40 - 17:05
Session 4: 1540-1705 [Systems and Architecture] ISMM 2025 at Lilac
Chair(s): Steve Blackburn Google and Australian National University
15:40
20m
Talk
Fully Randomized Pointers
ISMM 2025
Sai Dhawal Phaye National University of Singapore, Gregory J. Duck National University of Singapore, Roland H. C. Yap National University of Singapore, Trevor E. Carlson National University of Singapore
16:00
20m
Talk
TierTrain: Proactive Memory Tiering for CPU-Based DNN Training
ISMM 2025
Sathvik Swaminathan Intel Labs, Sandeep Kumar Intel Labs, Aravinda Prasad Intel Labs, Sreenivas Subramoney Intel Labs
16:20
20m
Talk
EMD: Fair and Efficient Dynamic Memory De-bloating of Transparent Huge Pages
ISMM 2025
Parth Gangar Fujitsu Research of India, Ashish Panwar Microsoft Research India, K. Gopinath Rishihood University
16:40
20m
Talk
Compiler-Assisted Crash Consistency for PMEM
ISMM 2025
Yun Joon Soh University of California San Diego, Sihang Liu University of Waterloo, Steven Swanson University of California San Diego, Jishen Zhao University of California San Diego
17:00
5m
Day closing
Closing remarks
ISMM 2025
Martin Maas Google, Tim Harris OpenAI, Onur Mutlu ETH Zurich