CGO 2022
Sat 2 - Wed 6 April 2022

Sat 2 Apr

Displayed time zone: Eastern Time (US & Canada)

09:00 - 16:00
Science, Art, Magic: Using and Developing The Graal Compiler
Workshops and Tutorials

https://graalworkshop.github.io/2022/

09:00
5m
Talk
Welcome
Workshops and Tutorials

09:05
55m
Talk
Keynote: Static Java: The GraalVM Native Image Programming Model
Workshops and Tutorials

10:00
30m
Talk
Faster Native Image development build times with Quick Build mode
Workshops and Tutorials

10:30
30m
Talk
Improving GraalVM Reflection File Generation
Workshops and Tutorials

11:00
30m
Break
Break
Workshops and Tutorials

11:30
30m
Talk
Truffle Interpreter Performance without the Holy Graal
Workshops and Tutorials

12:00
30m
Talk
TruffleString: highly optimized cross-language string implementation.
Workshops and Tutorials

12:30
30m
Talk
State of AArch64 on GraalVM
Workshops and Tutorials

13:00
30m
Talk
Call-Target Agnostic Keyword Arguments
Workshops and Tutorials

13:30
30m
Talk
Tuning autovectorization in Graal
Workshops and Tutorials

14:00
30m
Break
Break
Workshops and Tutorials

14:30
75m
Talk
Lightning Talks
Workshops and Tutorials

15:45
15m
Talk
Closing remarks & survey
Workshops and Tutorials

13:30 - 17:00
Autotuning & Reinforcement Learning for Compilers with CompilerGym
Workshops and Tutorials

https://chriscummins.cc/cgo22-compilergym-tutorial

13:30
30m
Talk
Getting Started
Workshops and Tutorials

14:00
30m
Talk
Running CompilerGym on your Own Programs
Workshops and Tutorials

14:30
30m
Talk
CompilerGym Explorer
Workshops and Tutorials

15:00
30m
Talk
Autotuning
Workshops and Tutorials

15:30
30m
Talk
Reinforcement Learning
Workshops and Tutorials

16:00
60m
Talk
Extensions
Workshops and Tutorials

Sun 3 Apr

09:00 - 13:00
IMOP: a Self-Stabilizing Source-to-Source Compiler Framework for OpenMP C
Workshops and Tutorials

http://www.cse.iitm.ac.in/~amannoug/imop/tutorials.php

09:00
15m
Talk
Introduction to IMOP
Workshops and Tutorials

09:15
40m
Talk
Fundamental Representations (AST, CFG, and CG)
Workshops and Tutorials

09:55
30m
Talk
Scopes, Symbols, Types, and Environments
Workshops and Tutorials

10:25
5m
Talk
Break
Workshops and Tutorials

10:30
40m
Talk
Code Construction and Transformations
Workshops and Tutorials

11:10
40m
Talk
Data-flow Analyses
Workshops and Tutorials

11:50
10m
Talk
Break
Workshops and Tutorials

12:00
30m
Talk
Concurrency Representations
Workshops and Tutorials

12:30
20m
Talk
Self-stabilization, and Z3-integration
Workshops and Tutorials

12:50
10m
Talk
Discussions and Q&A
Workshops and Tutorials

13:00 - 20:45
13:00
15m
Talk
Opening Remarks
Workshops and Tutorials

13:15
45m
Talk
Opaque Pointers Are Coming
Workshops and Tutorials

14:00
30m
Talk
The Hot Path SSA Form in LLVM
Workshops and Tutorials

14:30
30m
Talk
POSET-RL: Phase ordering for Optimizing Size and Execution Time using Reinforcement Learning
Workshops and Tutorials

15:00
30m
Talk
[Tutorial] Learning to combine Instructions in LLVM Compiler
Workshops and Tutorials

15:30
30m
Break
Break
Workshops and Tutorials

16:00
45m
Talk
[Tutorial] A Guide to Performance Debugging LLVM-based Programs
Workshops and Tutorials

16:45
30m
Talk
Compiling, running and benchmarking SNAP with LLVM Flang - experiences with a new compiler
Workshops and Tutorials

17:15
30m
Talk
An Anatomy of Optimized Matrix Multiplication on AArch64
Workshops and Tutorials

17:45
20m
Break
Break
Workshops and Tutorials

18:05
30m
Talk
Improving the OpenMP Offloading Driver: LTO, libraries, and toolchains
Workshops and Tutorials

18:35
40m
Talk
Crash-Analyzer: An LLVM-based Tool for Triaging and Analyzing Crashes
Workshops and Tutorials

19:15
30m
Talk
Prototyping a compiler for homomorphic encryption using MLIR
Workshops and Tutorials

19:45
45m
Talk
[Tutorial] A walk through Flang OpenMP lowering: From FIR to LLVMIR
Workshops and Tutorials

20:30
15m
Talk
Closing remarks
Workshops and Tutorials

Mon 4 Apr

08:45 - 09:00
09:00 - 10:00
Keynote (PPoPP)
Main Conference

Many Real-World Challenges for Effective Programming of Heterogeneous Systems

Heterogeneous Systems offer tremendous opportunities through hardware innovation, but this leaves a lot unanswered with regard to ‘how will we program them.’ SYCL is a Khronos standard to extend C++ for Heterogeneous Programming, and is instructive to review in terms of the practical problems inherent in extending programming for heterogeneous systems. James will discuss SYCL in order to expose key challenges, and discuss real unsolved problems that stand in the way of ‘standard parallelism’ solving this in C++ and many other programming languages.

Speaker: James Reinders (Intel)

James Reinders is an engineer at Intel focused on enabling parallel programming in a heterogeneous world. James has helped create ten technical books related to parallel programming; his latest book is about SYCL (free download: https://www.apress.com/book/9781484255735). He has had the great fortune to help make key contributions to two of the world’s fastest computers (#1 on Top500 list) as well as many other supercomputers, and software developer tools.

10:00 - 10:20
10:20 - 11:20
Session #1: GPU
Main Conference
Chair(s): Madan Musuvathi (Microsoft Research)
10:20
15m
Talk
A Compiler Framework for Optimizing Dynamic Parallelism on GPUs
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1
Main Conference
Mhd Ghaith Olabi (American University of Beirut), Juan Gómez Luna (ETH Zurich), Onur Mutlu (ETH Zurich), Wen-mei Hwu (University of Illinois at Urbana-Champaign), Izzat El Hajj (American University of Beirut)
10:35
15m
Talk
Automatic Horizontal Fusion for GPU Kernels
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1
Main Conference
Ao Li (Carnegie Mellon University), Bojian Zheng (University of Toronto), Gennady Pekhimenko (University of Toronto / Vector Institute), Fan Long (University of Toronto, Canada)
10:50
15m
Talk
DARM: Control-Flow Melding for SIMT Thread Divergence Reduction
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1
Main Conference
Charitha Saumya (Purdue University), Kirshanthan Sundararajah (Purdue University), Milind Kulkarni (Purdue University)
11:05
15m
Talk
Efficient Execution of OpenMP on GPUs
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Functional v1.1
Main Conference
Joseph Huber (Oak Ridge National Laboratory), Melanie Cornelius (Illinois Institute of Technology), Giorgis Georgakoudis (Lawrence Livermore National Laboratory), Shilei Tian (Stony Brook University), Jose M Monsalve Diaz (Argonne National Laboratory), Kuter Dinel (Düzce University), Barbara Chapman (Stony Brook University), Johannes Doerfert (Argonne National Laboratory)
11:20 - 11:40
11:40 - 12:25
Session #2: Performance
Main Conference
Chair(s): Charith Mendis (MIT CSAIL)
11:40
15m
Talk
CompilerGym: Robust, Performant Compiler Optimization Environments for AI Research
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1
Main Conference
Chris Cummins (Facebook), Bram Wasti (Facebook), Jiadong Guo (Facebook), Brandon Cui (Facebook), Jason Ansel (Facebook), Sahir Gomez (Facebook), Somya Jain (Facebook), Jia Liu (Facebook), Olivier Teytaud (Facebook), Benoit Steiner (Facebook), Yuandong Tian (Facebook), Hugh Leather (Facebook)
11:55
15m
Talk
PALMED: Throughput Characterization for Superscalar Architectures
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1
Main Conference
Nicolas Derumigny (INRIA), Fabian Gruber (Université Grenoble Alpes / INRIA Grenoble Rhônes-Alpes), Théophile Bastian (INRIA), Christophe Guillon (STMicroelectronics), Guillaume Iooss (Inria), Louis-Noël Pouchet (Colorado State University), Fabrice Rastello (Inria, France)
12:10
15m
Talk
SRTuner: Effective Compiler Optimization Customization By Exposing Synergistic Relations
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Functional v1.1
Main Conference
Sunghyun Park (University of Michigan), Seyyed Salar Latifi Oskouei (University of Michigan), Yongjun Park (Hanyang University), Armand Behroozi (University of Michigan), Byungsoo Jeon (Carnegie Mellon University), Scott Mahlke (University of Michigan)
12:25 - 12:50
12:50 - 13:35
Session #3: Domain-Specific Compilation
Main Conference
Chair(s): Tobias Grosser (University of Edinburgh)
12:50
15m
Talk
GraphIt to CUDA compiler in 2021 LOC: A case for high-performance DSL implementation via staging with BuilDSL
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1
Main Conference
Ajay Brahmakshatriya (Massachusetts Institute of Technology), Saman Amarasinghe (Massachusetts Institute of Technology)
13:05
15m
Talk
A Compiler for Sound Floating-Point Computations using Affine Arithmetic
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1
Main Conference
Joao Rivera (ETH Zurich), Franz Franchetti (Carnegie Mellon University), Markus Püschel (ETH Zurich)
13:20
15m
Talk
Aggregate Update Problem for Multi-Clocked Dataflow Languages
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1
Main Conference
Hannes Kallwies (University of Lübeck), Martin Leucker (University of Lübeck), Daniel Thorma (University of Lübeck), Torben Scheffel (University of Lübeck), Malte Schmitz (University of Lübeck)
13:35 - 13:45
13:45 - 14:45
Business Meeting
Main Conference

Tue 5 Apr

09:00 - 10:00
Keynote (CGO)
Main Conference

Compiler 2.0

When I was a graduate student a long time ago, I used to have intense conversations and learned a lot from my peers in other areas of computer science as the program structure, systems, and algorithms used in my compiler were very similar to and inspired by much of the work done by my peers. For example, a Natural Language Recognition System that was developed by my peers, with a single sequential program with multiple passes connected through IRs that systematically transformed an audio stream into text, was structurally similar to the SUIF compiler I was developing. In the intervening 30 years, the information revolution brought us unprecedented advances in algorithms (e.g., machine learning and solvers), systems (e.g., multicores and cloud computing), and program structure (e.g., serverless and low-code frameworks). Thus, a modern NLP system such as Apple’s Siri or Amazon’s Alexa, a thin client on an edge device interfacing to a massively-parallel, cloud-based, centrally-trained Deep Neural Network, has little resemblance to its predecessors. However, the SUIF compiler is still eerily similar to a state-of-the-art modern compiler such as LLVM or MLIR. What happened with compiler construction technology? At worst, as a community, we have been Luddites to the information revolution even though our technology has been critical to it. At best, we have been unable to transfer our research innovations (e.g., polyhedral method or program synthesis) into production compilers. In this talk I hope to inspire the compiler community to radically rethink how to build next generation compilers by giving a few possible examples of using 21st century program structures, algorithms and systems in constructing a compiler.

Speaker: Saman Amarasinghe (MIT)

Saman Amarasinghe is a Professor in the Department of Electrical Engineering and Computer Science at Massachusetts Institute of Technology and a member of its Computer Science and Artificial Intelligence Laboratory (CSAIL) where he leads the Commit compiler group. Under Saman’s guidance, the Commit group developed the StreamIt, PetaBricks, Halide, Simit, MILK, Cimple, TACO, GraphIt, BioStream, CoLa and Seq programming languages and compilers, DynamoRIO, Helium, Tiramisu, Codon, StreamJIT and BuildIt compiler/runtime frameworks, Superword Level Parallelism (SLP), goSLP, VeGen and SuperVectorizer for vectorization, Ithemal machine learning based performance predictor, Program Shepherding to protect programs against external attacks, the OpenTuner extendable autotuner, and the Kendo deterministic execution system. He was the co-leader of the Raw architecture project. Saman was a co-founder of Determina, Lanka Internet Services, Venti Technologies, and DataCebo Corporations. Saman received his BS in Electrical Engineering and Computer Science from Cornell University in 1988, and his MSEE and Ph.D. from Stanford University in 1990 and 1997, respectively. He is an ACM Fellow.

10:00 - 10:20
11:20 - 11:40
11:40 - 12:10
Session #5: Natural-Language Techniques
Main Conference
Chair(s): Weng-Fai Wong (National University of Singapore)
11:40
15m
Talk
M3V: Multi-Modal Multi-View Context Embedding for Repair Operator Prediction
Main Conference
Xuezheng Xu (UNSW Sydney), Xudong Wang (UNSW Sydney), Jingling Xue (UNSW Sydney)
11:55
15m
Talk
Enabling Near Real-Time NLU-Driven Natural Language Programming through Dynamic Grammar Graph-Based Translation
Main Conference
Zifan Nan (North Carolina State University), Xipeng Shen (North Carolina State University; Facebook), Hui Guan (University of Massachusetts, Amherst)
12:10 - 12:50
12:50 - 13:35
Session #6: Binary Techniques
Main Conference
Chair(s): Wenwen Wang (University of Georgia)
12:50
15m
Talk
Recovering Container Class Types in C++ Binaries
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1
Main Conference
Xudong Wang (UNSW Sydney), Xuezheng Xu (UNSW Sydney), Qingan Li (Wuhan University, China), Jingling Xue (UNSW Sydney), Yuan Mengting (Wuhan University)
13:05
15m
Talk
Automatic Generation of Debug Headers through BlackBox Equivalence Checking
Artifact Available v1.1, Artifacts Evaluated – Functional v1.1
Main Conference
Vaibhav Kiran Kurhe (Indian Institute of Technology Delhi), Pratik Karia (Indian Institute of Technology Delhi), Shubhani Gupta (Indian Institute of Technology Delhi), Abhishek Rose (IIT Delhi), Sorav Bansal (IIT Delhi and CompilerAI Labs)
13:20
15m
Talk
Gadgets Splicing: Dynamic Binary Transformation for Precise Rewriting
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Functional v1.1
Main Conference
Linan Tian (Chinese Academy of Sciences), Yangyang Shi (Chinese Academy of Sciences), Liwei Chen (Chinese Academy of Sciences), Yanqi Yang (Chinese Academy of Sciences), Gang Shi (Chinese Academy of Sciences)

Wed 6 Apr

09:00 - 10:00
Keynote (HPCA)
Main Conference

Integration, Specialization and Approximation: the “ISA” of Post-Moore Servers

Datacenters are growing at unprecedented speeds, building a foundation for global IT services, cost-effective containerized apps and novel paradigms including microservices and serverless computing. At the same time, we are entering a new era in computing where scalability no longer comes from higher density in silicon fabrication processes. Now, more than ever, server designers are in search of new avenues to bridge the gap between higher demands for scalability and the diminishing returns in server density. In this talk, I will go over the basic anatomy of system hardware and software in a modern server blade, which is primarily derived from the CPU-centric desktop PC of the 80s. I will then present opportunities for a clean slate design of servers based on integration, specialization and approximation as three pillars to enable server scalability in the post-Moore era.

Speaker: Babak Falsafi (EcoCloud, EPFL)

Babak is a Professor and the founding director of EcoCloud at EPFL. His contributions to computer systems include the first NUMA multiprocessors built by Sun Microsystems (WildFire/WildCat), memory streaming integrated in IBM BlueGene (temporal) and ARM cores (spatial), and performance evaluation methodologies in use by AMD, HP and Google PerfKit. He has shown that memory consistency models are neither necessary nor sufficient to achieve high performance in servers. These results led to fence speculation in modern CPUs. His work on workload-optimized server processors laid the foundation for the first generation of Cavium ARM server CPUs, ThunderX. He is a recipient of an Alfred P. Sloan Research Fellowship, and a fellow of ACM and IEEE.

10:00 - 10:20
10:20 - 11:20
Session #7: Program Analysis and Optimization
Main Conference
Chair(s): Fabrice Rastello (Inria, France)
10:20
15m
Talk
Loop Rolling for Code Size Reduction
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1
Main Conference
Rodrigo C. O. Rocha (University of Edinburgh, UK), Pavlos Petoumenos (University of Manchester), Björn Franke (University of Edinburgh, UK), Pramod Bhatotia (University of Edinburgh), Michael F. P. O'Boyle (University of Edinburgh)
10:35
15m
Talk
Solving PBQP-based Register Allocation using Deep Reinforcement Learning
Main Conference
Minsu Kim (Seoul National University), Jeong-Keun Park (Seoul National University), Soo-Mook Moon (Seoul National University)
10:50
15m
Talk
F3M: Fast Focused Function Merging
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1
Main Conference
Sean Stirling (Codeplay), Rodrigo C. O. Rocha (University of Edinburgh, UK), Hugh Leather (Facebook), Kim Hazelwood (Facebook), Michael F. P. O'Boyle (University of Edinburgh), Pavlos Petoumenos (University of Manchester, UK)
11:05
15m
Talk
Sound, Precise, and Fast Abstract Interpretation with Tristate Numbers
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1
Main Conference
11:20 - 11:35
Awards
Main Conference
Chair(s): Fabrice Rastello (Inria, France), Sebastian Hack (Saarland University, Germany), Tatiana Shpeisman (Google)
11:35 - 12:00
12:00 - 13:00
Session #8: IR, Encryption and Compression
Main Conference
Chair(s): Michel Steuwer (University of Edinburgh)
12:00
15m
Talk
Lambda the Ultimate SSA: Optimizing Functional Programs in SSA
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Functional v1.1
Main Conference
Siddharth Bhat (IIT Hyderabad), Tobias Grosser (University of Edinburgh), Anurudh Peduri (IIIT Hyderabad)
12:15
15m
Talk
NOELLE Offers Empowering LLVM Extensions
Artifact Available v1.1, Results Reproduced v1.1, Artifacts Evaluated – Functional v1.1
Main Conference
Angelo Matni (Northwestern University), Enrico Armenio Deiana (Northwestern University), Yian Su (Northwestern University), Lukas Gross (Northwestern University), Souradip Ghosh (Northwestern University), Sotiris Apostolakis (Northwestern University), Ziyang Xu (Princeton University), Zujun Tan (Princeton University), Ishita Chaturvedi (Princeton University), Brian Homerding (Northwestern University), Tommy McMichen (Northwestern University), David I. August (Princeton University), Simone Campanoni (Northwestern University)
12:30
15m
Talk
HECATE: Performance-aware Scale Optimization for Homomorphic Encryption Compiler
Main Conference
Yongwoo Lee (Yonsei University), Seonyoung Heo (ETH Zurich), Seonyoung Cheon (Yonsei University), Changsu Kim (Seoul National University), Eunkyung Kim (Samsung SDS), Dongyoon Lee (Stony Brook University), Hanjun Kim (Yonsei University)
12:45
15m
Talk
Unified Compilation for Lossless Compression and Sparse Computing
Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1
Main Conference
Daniel Donenfeld (Massachusetts Institute of Technology), Stephen Chou (Massachusetts Institute of Technology), Saman Amarasinghe (Massachusetts Institute of Technology)
13:00 - 13:10
Closing Remarks
Main Conference
Chair(s): Jae W. Lee (Seoul National University, Korea)