CGO 2021
Sat 27 February - Wed 3 March 2021

February 27th - March 3rd, 2021, Virtual Conference

Co-located with PPoPP, CC and HPCA

The International Symposium on Code Generation and Optimization (CGO) provides a premier venue to bring together researchers and practitioners working at the interface of hardware and software on a wide range of optimization and code generation techniques and related issues. The conference spans the spectrum from purely static to fully dynamic approaches, and from pure software-based methods to specific architectural features and support for code generation and optimization.

Mon 1 Mar

Displayed time zone: Eastern Time (US & Canada)

08:45 - 09:00
09:00 - 10:00
Keynote (PPoPP)
Main Conference

Atomicity without Trust

There is increasing interest in distributed systems where participants stand to benefit from cooperation but do not trust one another not to cheat. Although blockchain-based commerce is perhaps the most visible example of such systems, the problem of economic exchange among mutually untrusting autonomous parties is a fundamental one independent of particular technologies.

This talk argues that such systems require rethinking our notions of correctness for distributed concurrency control and fault-tolerance. Addressing this challenge brings up questions familiar from classical distributed systems: how to combine multiple steps into a single atomic action, how to recover from failures, and how to coordinate concurrent access to data. Commerce among untrusting parties is a kind of fun-house mirror of classical distributed computing: familiar features are recognizable but distorted. For example, classical atomic transactions are often described in terms of the well-known ACID properties: atomicity, consistency, isolation, and durability. We will see that untrusting cooperation requires structures superficially similar to, but fundamentally different from, classical atomic transactions.

Speaker: Maurice Herlihy (Brown University)

10:00 - 11:00
Session #1: Compiler Infrastructure
Main Conference
Chair(s): Michael Kruse Argonne National Laboratory
10:00
15m
Talk
MLIR: Scaling Compiler Infrastructure for Domain Specific Computation (Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Chris Lattner SiFive, Mehdi Amini Google, Uday Bondhugula Indian Institute of Science, Albert Cohen Google, Andy Davis Google, Jacques Pienaar Google, River Riddle Google, Tatiana Shpeisman Google, Nicolas Vasilache Google, Oleksandr Zinenko Google
10:15
15m
Talk
Progressive Raising in Multi-level IR (Results Reproduced v1.1; Artifacts Evaluated – Functional v1.1; Artifact Available v1.1)
Main Conference
Lorenzo Chelini TU Eindhoven, Andi Drebes INRIA, Oleksandr Zinenko Google, Albert Cohen Google, Henk Corporaal TU Eindhoven, Tobias Grosser University of Edinburgh, Nicolas Vasilache Google
10:30
15m
Talk
Towards a Domain-Extensible Compiler: Optimizing an Image Processing Pipeline on Mobile CPUs (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Thomas Koehler University of Glasgow, Michel Steuwer The University of Edinburgh
10:45
15m
Talk
BuildIt: A type based multistage programming framework for code generation in C++ (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Ajay Brahmakshatriya Massachusetts Institute of Technology, Saman Amarasinghe Massachusetts Institute of Technology
11:00 - 11:10
Break (10min)
Main Conference
11:10 - 12:10
Session #2: Dealing with Precision
Main Conference
Chair(s): Uma Srinivasan Twitter
11:10
15m
Talk
An Interval Compiler for Sound Floating Point Computations (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Joao Rivera ETH Zurich, Franz Franchetti Carnegie Mellon University, USA, Markus Püschel ETH Zurich, Switzerland
11:25
15m
Talk
Seamless Compiler Integration of Variable Precision Floating-Point Arithmetic (Results Reproduced v1.1; Artifacts Evaluated – Functional v1.1)
Main Conference
Tiago Jost Univ. Grenoble Alpes CEA, LIST, Grenoble, France, Yves Durand Univ. Grenoble Alpes CEA, LIST, Grenoble, France, Christian Fabre Univ. Grenoble Alpes CEA, LIST, Grenoble, France, Albert Cohen Google, Frédéric Pétrot Univ. Grenoble Alpes, CNRS, Grenoble INP, TIMA, Grenoble, France
11:40
15m
Talk
UNIT: Unifying Tensorized Instruction Compilation (Artifacts Evaluated – Functional v1.1; Artifact Available v1.1)
Main Conference
Jian Weng UCLA, Animesh Jain Amazon Web Services, Jie Wang, Leyuan Wang Amazon Web Services, USA, Yida Wang Amazon, Tony Nowatzki University of California, Los Angeles
11:55
15m
Talk
Unleashing the Low-Precision Computation Potential of Tensor Cores on GPUs
Main Conference
Guangli Li Institute of Computing Technology, Chinese Academy of Sciences, Jingling Xue UNSW Sydney, Lei Liu Institute of Computing Technology, Chinese Academy of Sciences, Xueying Wang Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Xiu Ma Jilin University, Xiao Dong Institute of Computing Technology, Chinese Academy of Sciences, Jiansong Li Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Xiaobing Feng ICT CAS
12:10 - 12:30
Break (20min)
Main Conference
12:30 - 13:30
Session #3: Binary Profiling, Tracing, Sampling
Main Conference
Chair(s): Wei Wang University of Texas at San Antonio, USA
12:30
15m
Talk
Cinnamon: A Domain-Specific Language for Binary Profiling and Monitoring
Main Conference
Mahwish Arif University of Cambridge, Ruoyu Zhou University of Cambridge, Hsi-Ming Ho University of Sussex, Timothy M. Jones University of Cambridge, UK
12:45
15m
Talk
GPA: A GPU Performance Advisor Based on Instruction Sampling (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Keren Zhou Rice University, Xiaozhu Meng Rice University, Ryuichi Sai Rice University, John Mellor-Crummey Rice University
13:00
15m
Talk
ELFies: Executable Region Checkpoints for Performance Analysis and Simulation
Main Conference
Harish Patil Intel, USA, Alexander Isaev Intel, Wim Heirman Intel, Alen Sabu National University of Singapore, Ali Hajiabadi National University of Singapore, Trevor E. Carlson National University of Singapore
13:15
15m
Talk
Vulkan Vision: Ray Tracing Workload Characterization using Automatic Graphics Instrumentation (Results Reproduced v1.1; Artifacts Evaluated – Functional v1.1)
Main Conference
David Pankratz University of Alberta, Tyler Nowicki Huawei Technologies Canada, Ahmed Eltantawy Huawei Technologies Canada, Jose Nelson Amaral University of Alberta
13:30 - 14:30
Business Meeting
Main Conference

Tue 2 Mar

Displayed time zone: Eastern Time (US & Canada)

09:00 - 10:00
Keynote (CGO)
Main Conference

Data Layout and Data Representation Optimizations to Reduce Data Movement

Code generation and optimization for the diversity of current and future architectures must focus on reducing data movement to achieve high performance. How data is laid out in memory, and representations that compress data (e.g., reduced floating point precision), have a profound impact on data movement. Moreover, the cost of data movement in a program is architecture-specific, and consequently, optimizing data layout and data representation must be performed by a compiler once the target architecture is known. With this context in mind, this talk will provide examples of data layout and data representation optimizations, and call for integrating these data properties into code generation and optimization systems.

Speaker: Mary Hall (University of Utah)

Mary Hall is a Professor and Director of the School of Computing at the University of Utah. She received a PhD in Computer Science from Rice University. Her research brings together compiler optimizations targeting current and future high-performance architectures on real-world applications. Hall’s prior work has developed compiler techniques for exploiting parallelism and locality on a diversity of architectures: automatic parallelization for SMPs, superword-level parallelism for multimedia extensions, processing-in-memory architectures, FPGAs and, more recently, many-core CPUs and GPUs. Professor Hall is an IEEE Fellow, an ACM Distinguished Scientist and a member of the Computing Research Association Board of Directors. She actively participates in mentoring and outreach programs to encourage the participation of women and other groups underrepresented in computer science.

10:00 - 11:00
Session #4: Parallelism - Optimizing, Modeling, Testing
Main Conference
Chair(s): Michael F. P. O'Boyle University of Edinburgh
10:00
15m
Talk
Loop Parallelization using Dynamic Commutativity Analysis
Main Conference
Christos Vasiladiotis University of Edinburgh, Roberto Castañeda Lozano University of Edinburgh, Murray Cole University of Edinburgh, UK, Björn Franke University of Edinburgh, UK
10:15
15m
Talk
Fine-grained Pipeline Parallelization for Network Function Programs
Main Conference
Seungbin Song Yonsei University, Heelim Choi Yonsei University, Hanjun Kim Yonsei University
10:30
15m
Talk
YaskSite – Stencil Optimization Techniques Applied to Explicit ODE Methods on Modern Architectures (Results Reproduced v1.1; Artifacts Evaluated – Functional v1.1; Artifact Available v1.1)
Main Conference
Christie Louis Alappat Friedrich Alexander University, Erlangen-Nuremberg, Johannes Seiferth University of Bayreuth, Georg Hager Friedrich Alexander University, Erlangen-Nuremberg, Matthias Korch University of Bayreuth, Thomas Rauber University of Bayreuth, Gerhard Wellein Friedrich Alexander University, Erlangen-Nuremberg
10:45
15m
Talk
GoBench: a Benchmark Suite of Real-World Go Concurrency Bugs (Results Reproduced v1.1; Artifacts Evaluated – Functional v1.1; Artifact Available v1.1)
Main Conference
Ting Yuan Institute of Computing Technology, CAS, Guangwei Li Institute of Computing Technology, Jie Lu, Chen Liu, Lian Li Institute of Computing Technology at Chinese Academy of Sciences, China, Jingling Xue UNSW Sydney
11:00 - 11:10
Break (10min)
Main Conference
11:10 - 12:10
Session #5: Memory Optimization and Safeness
Main Conference
Chair(s): EunJung (EJ) Park Los Alamos National Laboratory
11:10
15m
Talk
Memory-Safe Elimination of Side Channels (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Luigi Soares Federal University of Minas Gerais, Fernando Magno Quintão Pereira Federal University of Minas Gerais
11:25
15m
Talk
Variable-sized Blocks for Locality-aware SpMV
Main Conference
Naveen Namashivayam HPE, Sanyam Mehta HPE, Pen-Chung Yew University of Minnesota
11:40
15m
Talk
Object Versioning for Flow-Sensitive Pointer Analysis (Results Reproduced v1.1; Artifacts Evaluated – Functional v1.1; Artifact Available v1.1)
Main Conference
Mohamad Barbar University of Technology, Sydney, Yulei Sui University of Technology Sydney, Shiping Chen Data61 at CSIRO, Australia / UNSW, Australia
11:55
15m
Talk
Scaling up the IFDS Algorithm with Efficient Disk-based Computing (Artifacts Evaluated – Functional v1.1; Artifact Available v1.1)
Main Conference
Haofeng Li Institute of Computing Technology, CAS; University of Chinese Academy of Sciences, Haining Meng Institute of Computing Technology, CAS; University of Chinese Academy of Sciences, Hengjie Zheng Institute of Computing Technology, Chinese Academy of Sciences, Liqing Cao Institute of Computing Technology, Chinese Academy of Sciences, Jie Lu, Lian Li Institute of Computing Technology at Chinese Academy of Sciences, China, Lin Gao TianqiSoft Inc.
12:10 - 12:30
Break (20min)
Main Conference
12:30 - 14:30
CGO Student Research Competition
Main Conference / Student Research Competition
12:30
10m
Talk
A New Memory Layout for Self-Rebalancing Trees
Student Research Competition
Paul Iannetta ENS Lyon
12:40
10m
Talk
Automatic Inspection of Program State for Debugging and Verification Purposes
Student Research Competition
José Wesley de Souza Magalhães Federal University of Minas Gerais
12:50
10m
Talk
Compiler Framework for Low Overhead Fork-Join Parallelism
Student Research Competition
13:00
10m
Talk
Data vs. Instructions: Runtime Code Generation for Convolutions
Student Research Competition
Malith Jayaweera Northeastern University
13:10
10m
Talk
Fast Structural Register Allocation
Student Research Competition
William Zhang Carnegie Mellon University, Pranav Kumar Carnegie Mellon University
13:20
10m
Talk
Fine Grained Control of Program Transformations via Strategic Rewriting in MLIR
Student Research Competition
Martin Lücke University of Edinburgh
13:30
10m
Talk
Towards an Exploration Tool for Program Optimization Using Heuristic Search Algorithms
Student Research Competition
Johannes Lenfers University of Münster
13:40
10m
Talk
When Binary Optimization Meets Static Profiling
Student Research Competition
Angelica Moreira Universidade Federal de Minas Gerais

Wed 3 Mar

Displayed time zone: Eastern Time (US & Canada)

09:00 - 10:00
Keynote (HPCA)
Main Conference

A Journey to a Commercial-Grade Processing-In-Memory (PIM) Chip Development

Emerging applications demand high off-chip memory bandwidth, but it becomes very expensive to further increase the bandwidth of off-chip memory under stringent physical constraints of chip packages and system boards. Besides, energy efficiency of moving data across the memory hierarchy of processors has steadily worsened with the stagnant technology scaling and poor data reuse characteristics of the emerging applications. To cost-effectively increase the bandwidth and energy efficiency, researchers began to reconsider the past processing-in-memory (PIM) architectures and advance them further, especially with recent integration technologies such as 2.5D/3D stacking. Despite the recent advances, no major memory manufacturer had developed even a proof-of-concept silicon yet, not to mention a product. In this talk, I will start by discussing various practical and technical challenges that have been overlooked by researchers and have prevented the industry from successfully commercializing PIM. Then I will present a practical PIM architecture that considers various aspects of successful commercialization in the near future. Finally, I will present the journey to the development of a commercial-grade PIM chip, which was designed based on a commercial HBM2, fabricated with a 20nm DRAM technology, integrated with unmodified commercial processors, and successfully ran various memory-bound machine learning applications with more than 2x improvement in system performance and a 70% reduction in system energy consumption.

Speaker: Nam Sung Kim (University of Illinois at Urbana-Champaign / Samsung Electronics)

Nam Sung Kim is a Senior Vice President at Samsung Electronics as well as a Professor at the University of Illinois. At Samsung he led the architecture definitions and designs of next generation DRAM devices including HBM, LPDDR, DDR, and GDDR. He has published more than 200 refereed articles in highly selective conferences and journals in the fields of circuits, architecture, and computer-aided design. For his contributions to developing power-efficient computer architectures, he was elevated to IEEE Fellow and ACM Fellow in 2016 and 2021, respectively, and received the ACM SIGARCH/IEEE-CS TCCA Influential ISCA Paper Award in 2017. He is also a hall of fame member of all three major computer architecture conferences, ISCA, MICRO, and HPCA.

10:00 - 11:00
Session #6: Compiling Graph Algorithms, Compiling for GPUs
Main Conference
Chair(s): Maria Jesus Garzaran Intel Corporation and University of Illinois at Urbana-Champaign
10:00
15m
Talk
Compiling Graph Applications for GPUs with GraphIt (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Ajay Brahmakshatriya Massachusetts Institute of Technology, Yunming Zhang, Changwan Hong Massachusetts Institute of Technology, Shoaib Kamil Adobe Research, Julian Shun MIT, Saman Amarasinghe Massachusetts Institute of Technology
10:15
15m
Talk
Efficient Execution of Graph Algorithms on CPU with SIMD Extensions (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Ruohuang Zheng University of Rochester, Sreepathi Pai University of Rochester
10:30
15m
Talk
r3d3: Optimized Query Compilation on GPUs (Results Reproduced v1.1; Artifacts Evaluated – Functional v1.1; Artifact Available v1.1)
Main Conference
Alexander Krolik McGill University, Canada, Clark Verbrugge McGill University, Canada, Laurie Hendren McGill University, Canada
10:45
15m
Talk
C-for-Metal: High Performance SIMD Programming on Intel GPUs (Results Reproduced v1.1; Artifacts Evaluated – Functional v1.1; Artifact Available v1.1)
Main Conference
Guei-Yuan Lueh Intel Corporation, Kaiyu Chen Intel Corporation, Gang Chen Intel Corporation, Joel Fuentes Intel Corporation, Wei-Yu Chen Intel Corporation, Fangwen Fu Intel Corporation, Hong Jiang Intel Corporation, Hongzheng Li Intel Corporation, Daniel Rhee Intel Corporation
11:00 - 11:10
Break (10min)
Main Conference
11:10 - 11:55
Session #7: Compiling for Spatial, Quantum, and Embedded Devices
Main Conference
Chair(s): Wei-Fen Lin National Cheng Kung University
11:10
15m
Talk
Relaxed Peephole Optimization: A Novel Compiler Optimization for Quantum Circuits (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Ji Liu North Carolina State University, Luciano Bello IBM Research, Huiyang Zhou North Carolina State U.
11:25
15m
Talk
StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Johannes de Fine Licht, Andreas Kuster ETH Zurich, Tiziano De Matteis ETH Zurich, Tal Ben-Nun Department of Computer Science, ETH Zurich, Dominic Hofer ETH Zurich, Torsten Hoefler ETH Zurich
11:40
15m
Talk
Thread-aware Area-efficient High-level Synthesis Compiler for Embedded Devices
Main Conference
Changsu Kim POSTECH, Shinnung Jeong Yonsei University, Sungjun Cho POSTECH, Yongwoo Lee Yonsei University, William Song Yonsei University, Youngsok Kim Yonsei University, Hanjun Kim Yonsei University
11:55 - 12:10
Award Ceremony
Main Conference

Best Paper Award

Compiling Graph Applications for GPUs with GraphIt

Ajay Brahmakshatriya, Yunming Zhang, Changwan Hong, Shoaib Kamil, Julian Shun, Saman Amarasinghe

Test-of-Time Award

Level by Level: Making Flow- and Context-Sensitive Pointer Analysis Scalable for Millions of Lines of Code (CGO ’10)

Hongtao Yu, Jingling Xue, Wei Huo, Xiaobing Feng, Zhaoqing Zhang

12:10 - 12:30
Break (20min)
Main Conference
12:30 - 13:30
Session #8: JIT and Binary Translation; Optimizing for Code Size
Main Conference
Chair(s): Probir Roy University of Michigan at Dearborn
12:30
15m
Talk
HHVM Jump-Start: Boosting both Warmup and Steady-State Performance at Scale
Main Conference
Guilherme Ottoni Facebook, Bin Liu Facebook
12:45
15m
Talk
Enhancing Atomic Instruction Emulation for Cross-ISA Dynamic Binary Translation (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Ziyi Zhao Nankai University, Zhang Jiang Nankai University, Xiaoli Gong Nankai University, Ying Chen Nankai University, Wenwen Wang University of Georgia, Pen-Chung Yew University of Minnesota
13:00
15m
Talk
An Experience with Code-size Optimization for Production iOS Mobile Applications (Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Milind Chabbi Uber Technologies Inc., Jin Lin Uber Technologies, Raj Barik Uber Technologies Inc.
13:15
15m
Talk
AnghaBench: a Suite with One Million Compilable C Benchmarks for Code-Size Reduction
Main Conference
Anderson Faustino da Silva State University of Maringá, Bruno Conde Kind UFMG, José Wesley Magalhães Federal University of Minas Gerais, Jeronimo Nunes Rocha UFMG, Breno Guimaraes UFMG, Fernando Magno Quintão Pereira Federal University of Minas Gerais
13:30 - 15:00
Joint Session Panel
Main Conference

Panelists: John L. Hennessy Alphabet and Stanford, David Patterson Google and U.C. Berkeley, Margaret Martonosi NSF CISE and Princeton, Bill Dally NVIDIA and Stanford, Natalie Enright Jerger U. Toronto and ACM D&I Council, Kim Hazelwood Facebook AI Research, Timothy M. Pinkston USC

13:30
90m
Talk
“Valuing Diversity, Equity, and Inclusion in Our Computing Community”
Main Conference

Accepted Papers

Title
An Experience with Code-size Optimization for Production iOS Mobile Applications (Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
AnghaBench: a Suite with One Million Compilable C Benchmarks for Code-Size Reduction
Main Conference
An Interval Compiler for Sound Floating Point Computations (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
BuildIt: A type based multistage programming framework for code generation in C++ (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
C-for-Metal: High Performance SIMD Programming on Intel GPUs (Results Reproduced v1.1; Artifacts Evaluated – Functional v1.1; Artifact Available v1.1)
Main Conference
Cinnamon: A Domain-Specific Language for Binary Profiling and Monitoring
Main Conference
Compiling Graph Applications for GPUs with GraphIt (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Efficient Execution of Graph Algorithms on CPU with SIMD Extensions (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
ELFies: Executable Region Checkpoints for Performance Analysis and Simulation
Main Conference
Enhancing Atomic Instruction Emulation for Cross-ISA Dynamic Binary Translation (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Fine-grained Pipeline Parallelization for Network Function Programs
Main Conference
GoBench: a Benchmark Suite of Real-World Go Concurrency Bugs (Results Reproduced v1.1; Artifacts Evaluated – Functional v1.1; Artifact Available v1.1)
Main Conference
GPA: A GPU Performance Advisor Based on Instruction Sampling (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
HHVM Jump-Start: Boosting both Warmup and Steady-State Performance at Scale
Main Conference
Loop Parallelization using Dynamic Commutativity Analysis
Main Conference
Memory-Safe Elimination of Side Channels (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
MLIR: Scaling Compiler Infrastructure for Domain Specific Computation (Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Object Versioning for Flow-Sensitive Pointer Analysis (Results Reproduced v1.1; Artifacts Evaluated – Functional v1.1; Artifact Available v1.1)
Main Conference
Progressive Raising in Multi-level IR (Results Reproduced v1.1; Artifacts Evaluated – Functional v1.1; Artifact Available v1.1)
Main Conference
r3d3: Optimized Query Compilation on GPUs (Results Reproduced v1.1; Artifacts Evaluated – Functional v1.1; Artifact Available v1.1)
Main Conference
Relaxed Peephole Optimization: A Novel Compiler Optimization for Quantum Circuits (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Scaling up the IFDS Algorithm with Efficient Disk-based Computing (Artifacts Evaluated – Functional v1.1; Artifact Available v1.1)
Main Conference
Seamless Compiler Integration of Variable Precision Floating-Point Arithmetic (Results Reproduced v1.1; Artifacts Evaluated – Functional v1.1)
Main Conference
StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
Thread-aware Area-efficient High-level Synthesis Compiler for Embedded Devices
Main Conference
Towards a Domain-Extensible Compiler: Optimizing an Image Processing Pipeline on Mobile CPUs (Results Reproduced v1.1; Artifacts Evaluated – Reusable v1.1; Artifact Available v1.1)
Main Conference
UNIT: Unifying Tensorized Instruction Compilation (Artifacts Evaluated – Functional v1.1; Artifact Available v1.1)
Main Conference
Unleashing the Low-Precision Computation Potential of Tensor Cores on GPUs
Main Conference
“Valuing Diversity, Equity, and Inclusion in Our Computing Community”
Main Conference
Variable-sized Blocks for Locality-aware SpMV
Main Conference
Vulkan Vision: Ray Tracing Workload Characterization using Automatic Graphics Instrumentation (Results Reproduced v1.1; Artifacts Evaluated – Functional v1.1)
Main Conference
YaskSite – Stencil Optimization Techniques Applied to Explicit ODE Methods on Modern Architectures (Results Reproduced v1.1; Artifacts Evaluated – Functional v1.1; Artifact Available v1.1)
Main Conference

Call for Papers

Call for Papers Including Call for Tool and Practical Experience Papers

The International Symposium on Code Generation and Optimization (CGO) is a premier venue to bring together researchers and practitioners working at the interface of hardware and software on a wide range of optimization and code generation techniques and related issues. The conference spans the spectrum from purely static to fully dynamic approaches, and from pure software-based methods to specific architectural features and support for code generation and optimization.

Original contributions are solicited on, but not limited to, the following topics:

  • Code Generation, Translation, Transformation, and Optimization for performance, energy, virtualization, portability, security, or reliability concerns, and architectural support
  • Efficient execution of dynamically typed and higher-level languages
  • Optimization and code generation for emerging programming models, platforms, domain-specific languages
  • Dynamic/static, profile-guided, feedback-directed, and machine learning based optimization
  • Static, Dynamic, and Hybrid Analysis for performance, energy, memory locality, throughput or latency, security, reliability, or functional debugging
  • Program characterization methods
  • Efficient profiling and instrumentation techniques; architectural support
  • Novel and efficient tools
  • Compiler design, practice and experience
  • Compiler abstraction and intermediate representations
  • Vertical integration of language features, representations, optimizations, and runtime support for parallelism
  • Solutions that involve cross-layer (HW/OS/VM/SW) design and integration
  • Deployed dynamic/static compiler and runtime systems for general purpose, embedded system and Cloud/HPC platforms
  • Parallelism, heterogeneity, and reconfigurable architectures
  • Optimizations for heterogeneous or specialized targets, GPUs, SoCs, CGRA
  • Compiler support for vectorization, thread extraction, task scheduling, speculation, transaction, memory management, data distribution and synchronization

Artifact Evaluation

The Artifact Evaluation process is run by a separate committee whose task is to assess how the artifacts support the work described in the papers. This process helps improve reproducibility in research, which should be a great concern to all of us. There is also some evidence that papers with a supporting artifact receive higher citations than papers without (Artifact Evaluation: Is It a Real Incentive? by B. Childers and P. Chrysanthis).

Authors of accepted papers at CGO have the option of submitting their artifacts for evaluation within two weeks of paper acceptance. To ease the organization of the AE committee, we kindly ask authors to indicate at the time they submit the paper, whether they are interested in submitting an artifact. Papers that go through the Artifact Evaluation process successfully will receive a seal of approval printed on the papers themselves. Additional information is available on the CGO AE web page. Authors of accepted papers are encouraged, but not required, to make these materials publicly available upon publication of the proceedings, by including them as “source materials” in the ACM Digital Library.


Call for Tools and Practical Experience Papers

Last year CGO had a special category of papers called “Tools and Practical Experience,” which was very successful. CGO this year will have the same category of papers. Such a paper is subject to the same page length guidelines, except that it must give a clear account of its functionality and a summary of the practical experience with realistic case studies, and describe all the supporting artifacts available.

For papers submitted in this category that present a tool it is mandatory to submit an artifact to the Artifact Evaluation process and to be successfully evaluated. These papers will initially be conditionally accepted based on the condition that an artifact is submitted to the Artifact Evaluation process and that this artifact is successfully evaluated. Authors are not required to make their tool publicly available, but we do require that an artifact is submitted and successfully evaluated.

Papers submitted in this category presenting practical experience are encouraged but not required to submit an artifact to the Artifact Evaluation process.

The selection criteria for papers in this category are:

  • Originality: Papers should present CGO-related technologies applied to real-world problems with scope or characteristics that set them apart from previous solutions.
  • Usability: The presented tools or compilers should have broad usage or applicability. They are expected to assist in CGO-related research, or could be extended to investigate or demonstrate new technologies. If significant components are not yet implemented, the paper will not be considered.
  • Documentation: The tool or compiler should be presented on a website giving documentation and further information about the tool.
  • Benchmark Repository: A suite of benchmarks for testing should be provided.
  • Availability: Preference will be given to tools or compilers that are freely available (at either the source or binary level). Exceptions may be made for industry and commercial tools that cannot be made publicly available for business reasons.
  • Foundations: Papers should incorporate the principles underpinning Code Generation and Optimization (CGO). However, a thorough discussion of theoretical foundations is not required; a summary of such should suffice.
  • Artifact Evaluation: The submitted artifact must be functional and support the claims made in the paper. Submission of an artifact is mandatory for papers presenting a tool.

Authors should carefully consider the difference in focus with the co-located conferences when deciding where to submit a paper. CGO will make the proceedings freely available via the ACM DL platform during the period from two weeks before to two weeks after the conference. This option will facilitate easy access to the proceedings by conference attendees, and it will also enable the community at large to experience the excitement of learning about the latest developments being presented in the period surrounding the event itself.

Submission Site

Papers can be submitted at https://cgo21.hotcrp.com.

Submission Guidelines

Please make sure that your paper satisfies ALL of the following requirements before it is submitted:

  • The paper must be original material that has not been previously published in another conference or journal, nor is currently under review by another conference or journal. Note that you may submit material presented previously at a workshop without copyrighted proceedings.

  • Your submission is limited to ten (10) letter-size (8.5″x11″), single-spaced, double-column pages, using 10pt or larger font, not including references. There is no page limit for references. We highly recommend the IEEE templates for conference proceedings because this format will be used in the proceedings. The ACM SIGPLAN templates may also be used for reviews, and in that case, please use the following options: \documentclass[sigplan,10pt,review,anonymous]{acmart} \settopmatter{printfolios=true,printccs=false,printacmref=false}. Submissions not adhering to these submission guidelines may be outright rejected at the discretion of the program chairs. (Please make sure your paper prints satisfactorily on letter-size (8.5″x11″) paper: this is especially important for submissions from countries where A4 paper is standard.)

  • Papers are to be submitted for double-blind review. Blind reviewing of papers will be done by the program committee, assisted by outside referees. Author names as well as hints of identity are to be removed from the submitted paper. Use care in naming your files. Source file names, e.g., Joe.Smith.dvi, are often embedded in the final output as readily accessible comments. In addition, do not omit references to provide anonymity, as this leaves the reviewer unable to grasp the context. Instead, if you are extending your own work, you need to reference and discuss the past work in third person, as if you were extending someone else’s research. We realize that for some papers it will still be obvious who the authors are. In this case, the submission will not be penalized as long as a concerted effort was made to reference and describe the relationship to the prior work as if you were extending someone else’s research. For example, if your name is Joe Smith:

    In previous work [1,2], Smith presented a new branch predictor for …. In this paper, we extend their work by …

    Bibliography

    [1] Joe Smith, “A Simple Branch Predictor for …,” Proceedings of CGO 2019.

    [2] Joe Smith, “A More Complicated Branch Predictor for …,” Proceedings of CGO 2019.

  • Your submission must be formatted for black-and-white printers and not color printers. This is especially true for plots and graphs in the paper.
  • Please make sure that the labels on your graphs are readable without the aid of a magnifying glass. Typically the default font sizes on the graph axes in a program like Microsoft Excel are too small.
  • Please number the pages.
  • The paper must be submitted in PDF. We cannot accept any other format, and we must be able to print the document just as we receive it. We strongly suggest that you use only the four widely-used printer fonts: Times, Helvetica, Courier and Symbol.
  • Please make sure that the output has been formatted for printing on LETTER size paper. If generating the paper using “dvips”, use the option “-P cmz -t letter”, and if that is not supported, use “-t letter”.
  • The Artifact Evaluation process is run by a separate committee whose task is to assess how the artifacts support the work described in the papers. Authors of accepted papers have the option of submitting their artifacts for evaluation within two weeks of paper acceptance. To ease the organization of the AE committee, we kindly ask authors to indicate at the time they submit the paper, whether they are interested in submitting an artifact. Papers that go through the Artifact Evaluation process successfully will receive a seal of approval printed on the papers themselves. Additional information is available on the CGO AE web page. Authors of accepted papers are encouraged, but not required, to make these materials publicly available upon publication of the proceedings, by including them as “source materials” in the ACM Digital Library.
  • Authors must register all their conflicts on the paper submission site. Conflicts are needed to ensure appropriate assignment of reviewers. If a paper is found to have an undeclared conflict that causes a problem OR if a paper is found to declare false conflicts in order to abuse or “game” the review system, the paper may be rejected.

  • Please declare a conflict of interest with the following people for any author of your paper:

    • Your Ph.D. advisor(s), post-doctoral advisor(s), Ph.D. students, and post-doctoral advisees, forever.
    • Family relations by blood or marriage, or their equivalent, forever (if they might be potential reviewers).
    • People with whom you have collaborated in the last FIVE years, including:
      • Co-authors of accepted/rejected/pending papers.
      • Co-PIs on accepted/rejected/pending grant proposals.
      • Funders (decision-makers) of your research grants, and researchers whom you fund.
    • People (including students) who shared your primary institution(s) in the last FIVE years.
    • Other relationships, such as close personal friendship, that you think might tend to affect your judgment or be seen as doing so by a reasonable person familiar with the relationship.
    • “Service” collaborations such as co-authoring a report for a professional organization, serving on a program committee, or co-presenting tutorials, do not themselves create a conflict of interest. Co-authoring a paper that is a compendium of various projects with no true collaboration among the projects does not constitute a conflict among the authors of the different projects.
    • On the other hand, there may be others not covered by the above with whom you believe a COI exists, for example, an ongoing collaboration that has not yet resulted in the creation of a paper or proposal. Please report such COIs; however, you may be asked to justify them. Please be reasonable. For example, you cannot declare a COI with a reviewer just because that reviewer works on topics similar to or related to those in your paper. The PC Chair may contact co-authors to explain a COI whose origin is unclear.
    • We hope to draw most reviewers from the PC and the ERC, but others from the community may also write reviews. Please declare all your conflicts (not just restricted to the PC and ERC). When in doubt, contact the program co-chairs.
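
For authors who choose the ACM SIGPLAN template mentioned in the page-limit requirement above, the following is a minimal sketch of a source file using the stated class options. Only the \documentclass and \settopmatter lines come from the guidelines; the title, author, affiliation, abstract, and section text are placeholders added for illustration, and the sketch assumes a reasonably recent acmart installation.

```latex
% Class options and \settopmatter line as given in the submission guidelines.
\documentclass[sigplan,10pt,review,anonymous]{acmart}
\settopmatter{printfolios=true,printccs=false,printacmref=false}

\begin{document}

% Placeholder front matter (illustrative only). With the "anonymous"
% option, acmart suppresses author details in the compiled PDF.
\title{Placeholder Title}
\author{Anonymous Author}
\affiliation{\institution{Anonymous Institution}\country{Anonymous Country}}

% acmart expects the abstract to appear before \maketitle.
\begin{abstract}
Placeholder abstract.
\end{abstract}

\maketitle

\section{Introduction}
Placeholder body text.

\end{document}
```

With printfolios=true the pages are numbered, which also covers the page-numbering requirement. The anonymous option does not anonymize file names or self-references in the text, so those still need to be checked by hand.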