CGO 2023
Sat 25 February - Wed 1 March 2023 Montreal, Canada
Wed 1 Mar 2023 11:18 - 11:44 at Montreal 1-2-3 - Session 7 -- Neural Network Accelerators Chair(s): Lukas Sommer

Processing-in-Memory (PIM) has evolved over decades into a feasible solution to the worsening main-memory performance bottleneck by placing computational logic in or near memory.
Recent proposals from DRAM manufacturers highlighted hardware-constraint-aware designs of PIM-enabled DRAM with specialized multiply-accumulate (MAC) logic, providing an order-of-magnitude speedup for memory-intensive operations in deep learning (DL) models.
Although convolutional neural networks (CNNs) were not initially a main target for PIM acceleration because of their high compute intensity, recent CNN models are increasingly adopting computationally lightweight implementations.
Motivated by the potential for the software stack to enable CNN models on DRAM-PIM hardware without invasive changes, we propose PIMFlow, end-to-end compiler and runtime support for accelerating CNN models on PIM-enabled GPU memory.
PIMFlow transforms model graphs to create inter-node parallelism across GPU and PIM, explores possible task- and data-parallel execution scenarios for optimal execution time, and provides a code-generating back-end and execution engine for DRAM-PIM.
PIMFlow achieves up to 82% end-to-end speedup and reduces energy consumption by 26% on average for CNN model inference.
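
The data-parallel exploration mentioned in the abstract can be pictured with a toy cost model. The sketch below is not PIMFlow's implementation: the `Layer` type, the per-device timings, and the assumption that work scales linearly with the split fraction are all hypothetical, chosen only to show what searching for a GPU/PIM split ratio might look like.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    gpu_ms: float  # hypothetical time to run the whole layer on the GPU
    pim_ms: float  # hypothetical time to run the whole layer on PIM


def split_time(layer: Layer, pim_fraction: float) -> float:
    """Time when `pim_fraction` of the layer runs on PIM while the GPU
    runs the rest in parallel (assumes linear scaling, no transfer cost)."""
    gpu_part = layer.gpu_ms * (1.0 - pim_fraction)
    pim_part = layer.pim_ms * pim_fraction
    return max(gpu_part, pim_part)


def best_split(layer: Layer, steps: int = 10) -> tuple[float, float]:
    """Try split ratios 0, 1/steps, ..., 1 and return (fraction, time)
    for the fastest one. Ties go to the smaller PIM fraction."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(((f, split_time(layer, f)) for f in candidates),
               key=lambda pair: pair[1])


if __name__ == "__main__":
    conv = Layer("conv", gpu_ms=4.0, pim_ms=6.0)
    fraction, time_ms = best_split(conv)
    print(f"{conv.name}: offload {fraction:.0%} to PIM, {time_ms:.2f} ms")
```

In this model a fraction of 0 means GPU-only execution and 1 means PIM-only, so the same search also covers the choice of running a node entirely on one device. A real system would need measured or simulated timings and would have to account for data movement between the two.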

Wed 1 Mar

Displayed time zone: Eastern Time (US & Canada)

10:00 - 12:00
Session 7 -- Neural Network Accelerators
Main Conference at Montreal 1-2-3
Chair(s): Lukas Sommer Codeplay Software
10:00
26m
Talk
Flexer: Out-of-Order Scheduling for Multi-NPUs
Main Conference
Hyemi Min Seoul National University, Jungyoon Kwon Seoul National University, Bernhard Egger Seoul National University
10:26
26m
Talk
Pin or Fuse? Exploiting Scratchpad Memory to Reduce Off-Chip Data Transfer in DNN Accelerators
Main Conference
Hyuk-Jin Jeong Samsung Research, JiHwan Yeo Samsung Research, Cheongyo Bahk Samsung Research, JongHyun Park Samsung Research
10:52
26m
Talk
Accelerating Deep Neural Networks on Mobile Multicore NPUs
Main Conference
Hanwoong Jung Samsung Advanced Institute of Technology, Hexiang Ji Samsung Research, Alexey Pushchin Samsung Research, Maxim Ostapenko Samsung Advanced Institute of Technology, Wenlong Niu Samsung Research, Ilya Palachev Samsung Research, Yutian Qu Samsung Research, Pavel Fedin Samsung Research, Yuri Gribov Samsung Research, Heewoo Nam Samsung Advanced Institute of Technology, Dongguen Lim Samsung Advanced Institute of Technology, Hyunjun Kim Samsung Advanced Institute of Technology, Joonho Song Samsung Advanced Institute of Technology, Seungwon Lee Samsung Advanced Institute of Technology, Hwansoo Han Sungkyunkwan University
11:18
26m
Talk
PIMFlow: Compiler and Runtime Support for CNN Models on Processing-in-Memory DRAM
Main Conference
Yongwon Shin POSTECH, Juseong Park POSTECH, Sungjun Cho POSTECH, Hyojin Sung POSTECH