CGO 2024
Sat 2 - Wed 6 March 2024 Edinburgh, United Kingdom
Mon 4 Mar 2024 11:50 - 12:10 at Tinto - Machine-Learning Guided Optimizations Chair(s): Zheng Wang

Tuning compiler heuristics and parameters is well known to improve optimization outcomes dramatically. Prior works have tuned command line flags and a few expert identified heuristics. However, there are an unknown number of heuristics buried, unmarked and unexposed inside the compiler as a consequence of decades of development without auto-tuning being foremost in the minds of developers. Many may not even have been considered heuristics by the developers who wrote them. The result is that auto-tuning search and machine learning can optimize only a tiny fraction of what could be possible if all heuristics were available to tune. Manually discovering all of these heuristics hidden among millions of lines of code and exposing them to auto-tuning tools is a Herculean task that is simply not practical. What is needed is a method of automatically finding these heuristics to extract every last drop of potential optimization.
In this work, we propose Heureka, a framework that automatically identifies potential heuristics in the compiler that are highly profitable optimization targets and then automatically finds available tuning parameters for those heuristics with minimal human involvement. Our work is based on the following key insight: When modifying the output of a heuristic within an acceptable value range, the calling code using that output will still function correctly and produce semantically correct results.
Building on that, we automatically manipulate the output of potential heuristic code in the compiler and decide using a Differential Testing approach if we found a heuristic or not. During output manipulation, we also explore acceptable value ranges of the targeted code. Heuristics identified in this way can then be tuned to optimize an objective function.
We used Heureka to search for heuristics among eight thousand functions from the LLVM optimization passes, which is about 2% of all available functions. We then use identified heuristics to tune the compilation of 38 applications from the NAS and Polybench benchmark suites. Compared to an -Oz baseline we reduce binary sizes by up to 11.6% considering single heuristics only and up to 19.5% when stacking the effects of multiple identified tuning targets and applying a random search with minimal search effort. Generalizing from existing analysis results, Heureka needs, on average, a little under an hour on a single machine to identify relevant heuristic targets for a previously unseen application.

Mon 4 Mar

Displayed time zone: London change

11:30 - 12:50
Machine-Learning Guided OptimizationsMain Conference at Tinto
Chair(s): Zheng Wang University of Leeds
11:30
20m
Talk
AskIt: Unified Programming Interface for Programming with Large Language Models
Main Conference
Katsumi Okuda Massachusetts Institute of Technology; Mitsubishi Electric Corporation, Saman Amarasinghe Massachusetts Institute of Technology
11:50
20m
Talk
Revealing Compiler Heuristics through Automated Discovery and Optimization
Main Conference
Volker Seeker Meta AI Research, Chris Cummins Meta AI Research, Murray Cole University of Edinburgh, Björn Franke University of Edinburgh, Kim Hazelwood Meta AI Research, Hugh Leather Meta AI Research
12:10
20m
Talk
SLaDe: A Portable Small Language Model Decompiler for Optimized Assembly
Main Conference
Jordi Armengol-Estapé University of Edinburgh, Jackson Woodruff University of Edinburgh, Chris Cummins Meta AI Research, Michael F. P. O'Boyle University of Edinburgh
Pre-print
12:30
20m
Talk
TapeFlow: Streaming Gradient Tapes in Automatic Differentiation
Main Conference
Milad Hakimi Simon Fraser University, Arrvindh Shriraman Simon Fraser University
Media Attached