CGO 2024
Sat 2 - Wed 6 March 2024 Edinburgh, United Kingdom
Mon 4 Mar 2024 14:40 - 15:00 at Tinto - Compilers for GPUs Chair(s): Roland Leißa

Compilers use a wide range of advanced optimizations to improve the quality of the machine code they generate. Most of these optimizations rely on precise program analyses. However, at every control-flow merge, information is lost because the analysis can no longer reason precisely about which path was taken. One existing solution to this issue is code duplication, which copies instructions from merge blocks into their predecessors.
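As a rough illustration of code duplication at a merge point, the sketch below shows a hand-written before/after pair in plain C++. It is not taken from the paper; the function names (`scale_merged`, `scale_duplicated`, `check_scale_equivalence`) are hypothetical, and a real compiler would perform this rewrite on its intermediate representation rather than on source code.

```cpp
#include <cassert>

// Before: both branches merge ahead of the division. At the merge point the
// compiler only knows that d is either 2 or 4, so x / d stays a general division.
int scale_merged(bool fast, int x) {
    int d;
    if (fast) {
        d = 2;
    } else {
        d = 4;
    }
    return x / d;  // merge block
}

// After: the merge block (the division) is duplicated into each predecessor.
// In each copy d is a known constant, so the division can be specialized.
int scale_duplicated(bool fast, int x) {
    if (fast) {
        return x / 2;
    }
    return x / 4;
}

// Hypothetical helper: checks that both versions agree on a small input range.
void check_scale_equivalence() {
    for (int x = -64; x <= 64; ++x) {
        assert(scale_merged(true, x) == scale_duplicated(true, x));
        assert(scale_merged(false, x) == scale_duplicated(false, x));
    }
}
```

The duplicated version is larger, which is the cost side of the trade-off the abstract alludes to.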

This paper introduces a novel and more aggressive approach to code duplication that combines loop unrolling with control-flow unmerging. Together, these transformations enable subsequent optimizations that neither transformation enables on its own.
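To make the interaction of the two transformations more concrete, here is a minimal hand-written sketch in plain C++ (not CUDA, and not the paper's LLVM implementation). It assumes a loop whose branch outcome in one iteration determines the outcome in the next; the function names are hypothetical. Unrolling alone would keep the merge between the two copies of the body, and unmerging alone would not expose the second iteration, so the second branch only becomes foldable when both are applied.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original loop: the if/else merges at the end of every iteration, so at the
// top of the next iteration the compiler only knows that state is 0 or 1.
int alternating_sum_loop(const std::vector<int>& a, int state) {
    int acc = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (state == 0) {
            acc += a[i];
            state = 1;
        } else {
            acc -= a[i];
            state = 0;
        }
    }
    return acc;
}

// Unrolled by two and unmerged: the second copy of the body is duplicated
// into each branch of the first copy, where state is a known constant, so
// the second branch folds away.
int alternating_sum_unrolled(const std::vector<int>& a, int state) {
    int acc = 0;
    std::size_t i = 0;
    for (; i + 1 < a.size(); i += 2) {
        if (state == 0) {
            acc += a[i];      // first copy, state becomes 1
            acc -= a[i + 1];  // second copy, folded for state == 1
            state = 0;
        } else {
            acc -= a[i];      // first copy, state becomes 0
            acc += a[i + 1];  // second copy, folded for state == 0
            state = 1;
        }
    }
    if (i < a.size()) {       // remainder iteration for odd lengths
        if (state == 0) {
            acc += a[i];
        } else {
            acc -= a[i];
        }
    }
    return acc;
}
```

Each pair of iterations now executes one branch instead of two, at the price of a larger loop body.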

We implemented our approach in LLVM and evaluated its performance on a collection of GPU benchmarks written in CUDA. Our results demonstrate that, even in the presence of branch divergence, which complicates code duplication across multiple branches and increases its cost, our optimization technique achieves performance improvements of up to 81%.

Mon 4 Mar

Displayed time zone: London

14:20 - 15:40
Compilers for GPUs
Main Conference at Tinto
Chair(s): Roland Leißa University of Mannheim, School of Business Informatics and Mathematics
14:20
20m
Talk
A Framework for Fine-Grained Synchronization of Dependent GPU Kernels
Main Conference
Abhinav Jangda Microsoft Research, Saeed Maleki Microsoft Research, Maryam Mehri Dehnavi University of Toronto, Madan Musuvathi Microsoft Research, Olli Saarikivi Microsoft Research
14:40
20m
Talk
Enhancing Performance through Control-Flow Unmerging and Loop Unrolling on GPUs
Main Conference
Alnis Murtovi TU Dortmund, Giorgis Georgakoudis Lawrence Livermore National Laboratory, Konstantinos Parasyris Lawrence Livermore National Laboratory, Chunhua Liao Lawrence Livermore National Laboratory, Ignacio Laguna Lawrence Livermore National Laboratory, Bernhard Steffen TU Dortmund
15:00
20m
Talk
Retargeting and Respecializing GPU Workloads for Performance Portability
Main Conference
Ivan Radanov Ivanov Tokyo Institute of Technology; RIKEN R-CCS, Oleksandr Zinenko Google DeepMind, Jens Domke RIKEN R-CCS, Toshio Endo Tokyo Institute of Technology, William S. Moses University of Illinois at Urbana-Champaign; Google DeepMind
15:20
20m
Talk
Seer: Predictive Runtime Kernel Selection for Irregular Problems
Main Conference