CC 2017
Sun 5 - Mon 6 February 2017 Austin, Texas, United States
Sun 5 Feb 2017 11:45 - 12:10 at 404 - Concurrency & Parallelism Chair(s): Sebastian Hack

Many computationally intensive algorithms benefit from the wide parallelism offered by Graphics Processing Units (GPUs). However, the search for a close-to-optimal implementation remains extremely tedious due to the specialization and complexity of GPU architectures.

We present a novel approach to automatically discover the best-performing code from a given set of possible implementations. It involves a branch-and-bound algorithm with two distinctive features: (1) an analytic performance model of a lower bound on the execution time, and (2) the ability to estimate such bounds on a partially specified implementation.

The unique features of this performance model make it possible to aggressively prune the optimization space without eliminating the best-performing implementation. While the space considered in this paper focuses on GPUs, the approach is generic enough to be applied to other architectures.
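As an illustration of the pruning idea described above, here is a minimal sketch of branch and bound driven by a lower bound that can be evaluated on partial assignments. It is not Telamon's implementation: the decision names (`tile`, `unroll`, `vector`), the cost numbers, and the `penalty` interaction term are all hypothetical, chosen only so that the bound never exceeds the cost of any completion of a partial assignment.

```python
from itertools import product

# Hypothetical toy space: each decision maps its options to an illustrative cost.
CHOICES = {
    "tile": {8: 5.0, 16: 3.0, 32: 4.0},
    "unroll": {1: 4.0, 2: 2.5, 4: 3.0},
    "vector": {1: 3.0, 4: 1.0},
}
ORDER = list(CHOICES)


def penalty(assign):
    """Non-negative interaction cost (toy); only counted once both decisions are made."""
    if assign.get("tile") == 16 and assign.get("unroll") == 2:
        return 2.0
    return 0.0


def lower_bound(partial):
    """Bound on the cost of every completion of `partial`.

    Decided choices contribute their own cost, undecided ones contribute
    their cheapest option. On a full assignment this equals the toy cost.
    """
    decided = sum(CHOICES[k][v] for k, v in partial.items())
    undecided = sum(min(CHOICES[k].values()) for k in ORDER if k not in partial)
    return decided + undecided + penalty(partial)


def branch_and_bound():
    best, best_cost, evaluated = None, float("inf"), 0
    stack = [{}]  # start from the fully unspecified implementation
    while stack:
        partial = stack.pop()
        if lower_bound(partial) >= best_cost:
            continue  # no completion can beat the incumbent: prune the subtree
        if len(partial) == len(ORDER):
            evaluated += 1
            best, best_cost = dict(partial), lower_bound(partial)
            continue
        key = ORDER[len(partial)]  # next decision to specify
        for option in CHOICES[key]:
            stack.append({**partial, key: option})
    return best, best_cost, evaluated


def exhaustive():
    """Reference result: enumerate all 18 full assignments."""
    fulls = (dict(zip(ORDER, combo)) for combo in product(*(CHOICES[k] for k in ORDER)))
    return min(lower_bound(f) for f in fulls)


if __name__ == "__main__":
    print(branch_and_bound(), exhaustive())
```

Because the bound never overestimates, a subtree is discarded only when none of its completions can beat the best candidate found so far, so the search reaches the same optimum as exhaustive enumeration while fully evaluating fewer candidates.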

We implemented our algorithm in a tool called Telamon and demonstrate its effectiveness on a huge, architecture-specific and input-sensitive optimization space. The information provided by the performance model also helps to identify ways to enrich the search space to consider better candidates, or to highlight architectural bottlenecks.

Sun 5 Feb

10:30 - 12:10
Concurrency & Parallelism, Research Papers at 404
Chair(s): Sebastian Hack Saarland University
10:30
25m
Talk
Partially Redundant Fence Elimination for x86, ARM, and Power Processors
Research Papers
Robin Morisset ENS, France, Francesco Zappa Nardelli Inria, France
10:55
25m
Talk
Lightweight Data Race Detection for Production Runs
Research Papers
Swarnendu Biswas University of Texas at Austin, Man Cao Ohio State University, Minjia Zhang Ohio State University, Michael D. Bond Ohio State University, Benjamin P. Wood Wellesley College, USA
11:20
25m
Talk
Optimized Two-Level Parallelization for GPU Accelerators using the Polyhedral Model
Research Papers
Jun Shirako Rice University, USA, Akihiro Hayashi Rice University, USA, Vivek Sarkar Rice University, USA
11:45
25m
Talk
Optimization Space Pruning without Regrets
Research Papers