CGO 2021
Sat 27 February - Wed 3 March 2021

Conference Day
Mon 1 Mar

Displayed time zone: Eastern Time (US & Canada)

08:45 - 09:00
09:00 - 10:00
Keynote (PPoPP)
Main Conference

Atomicity without Trust

There is increasing interest in distributed systems where participants stand to benefit from cooperation but do not trust one another not to cheat. Although blockchain-based commerce is perhaps the most visible example of such systems, the problem of economic exchange among mutually untrusting autonomous parties is a fundamental one independent of particular technologies.

This talk argues that such systems require rethinking our notions of correctness for distributed concurrency control and fault-tolerance. Addressing this challenge brings up questions familiar from classical distributed systems: how to combine multiple steps into a single atomic action, how to recover from failures, and how to coordinate concurrent access to data. Commerce among untrusting parties is a kind of fun-house mirror of classical distributed computing: familiar features are recognizable but distorted. For example, classical atomic transactions are often described in terms of the well-known ACID properties: atomicity, consistency, isolation, and durability. We will see that untrusting cooperation requires structures superficially similar to, but fundamentally different from, classical atomic transactions.

Speaker: Maurice Herlihy (Brown University)

10:00 - 11:00
Session #1: Compiler Infrastructure
Main Conference
Chair(s): Michael Kruse (Argonne National Laboratory)
10:00
15m
Talk
MLIR: Scaling Compiler Infrastructure for Domain Specific Computation
Artifacts Evaluated – Reusable v1.1, Artifact Available v1.1
Main Conference
Chris Lattner (SiFive), Mehdi Amini (Google), Uday Bondhugula (Indian Institute of Science), Albert Cohen (Google), Andy Davis (Google), Jacques Pienaar (Google), River Riddle (Google), Tatiana Shpeisman (Google), Nicolas Vasilache (Google), Oleksandr Zinenko (Google)
10:15
15m
Talk
Progressive Raising in Multi-level IR
Results Reproduced v1.1, Artifacts Evaluated – Functional v1.1, Artifact Available v1.1
Main Conference
Lorenzo Chelini (TU Eindhoven), Andi Drebes (INRIA), Oleksandr Zinenko (Google), Albert Cohen (Google), Henk Corporaal (TU Eindhoven), Tobias Grosser (University of Edinburgh), Nicolas Vasilache (Google)
10:30
15m
Talk
Towards a Domain-Extensible Compiler: Optimizing an Image Processing Pipeline on Mobile CPUs
Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1, Artifact Available v1.1
Main Conference
Thomas Koehler (University of Glasgow), Michel Steuwer (The University of Edinburgh)
10:45
15m
Talk
BuildIt: A type based multistage programming framework for code generation in C++
Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1, Artifact Available v1.1
Main Conference
Ajay Brahmakshatriya (Massachusetts Institute of Technology), Saman Amarasinghe (Massachusetts Institute of Technology)
11:00 - 11:10
Break (10min)
Main Conference
11:10 - 12:10
Session #2: Dealing with Precision
Main Conference
Chair(s): Uma Srinivasan (Twitter)
11:10
15m
Talk
An Interval Compiler for Sound Floating Point Computations
Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1, Artifact Available v1.1
Main Conference
Joao Rivera (ETH Zurich), Franz Franchetti (Carnegie Mellon University, USA), Markus Püschel (ETH Zurich, Switzerland)
11:25
15m
Talk
Seamless Compiler Integration of Variable Precision Floating-Point Arithmetic
Results Reproduced v1.1, Artifacts Evaluated – Functional v1.1
Main Conference
Tiago Jost (Univ. Grenoble Alpes CEA, LIST, Grenoble, France), Yves Durand (Univ. Grenoble Alpes CEA, LIST, Grenoble, France), Christian Fabre (Univ. Grenoble Alpes CEA, LIST, Grenoble, France), Albert Cohen (Google), Frédéric Pétrot (Univ. Grenoble Alpes, CNRS, Grenoble INP, TIMA, Grenoble, France)
11:40
15m
Talk
UNIT: Unifying Tensorized Instruction Compilation
Artifacts Evaluated – Functional v1.1, Artifact Available v1.1
Main Conference
Jian Weng (UCLA), Animesh Jain (Amazon Web Services), Jie Wang, Leyuan Wang (Amazon Web Services, USA), Yida Wang (Amazon), Tony Nowatzki (University of California, Los Angeles)
11:55
15m
Talk
Unleashing the Low-Precision Computation Potential of Tensor Cores on GPUs
Main Conference
Guangli Li (Institute of Computing Technology, Chinese Academy of Sciences), Jingling Xue (UNSW Sydney), Lei Liu (Institute of Computing Technology, Chinese Academy of Sciences), Xueying Wang (Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences), Xiu Ma (Jilin University), Xiao Dong (Institute of Computing Technology, Chinese Academy of Sciences), Jiansong Li (Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences), Xiaobing Feng (ICT CAS)
12:10 - 12:30
Break (20min)
Main Conference
12:30 - 13:30
Session #3: Binary Profiling, Tracing, Sampling
Main Conference
Chair(s): Wei Wang (University of Texas at San Antonio, USA)
12:30
15m
Talk
Cinnamon: A Domain-Specific Language for Binary Profiling and Monitoring
Main Conference
Mahwish Arif (University of Cambridge), Ruoyu Zhou (University of Cambridge), Hsi-Ming Ho (University of Sussex), Timothy M. Jones (University of Cambridge, UK)
12:45
15m
Talk
GPA: A GPU Performance Advisor Based on Instruction Sampling
Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1, Artifact Available v1.1
Main Conference
Keren Zhou (Rice University), Xiaozhu Meng (Rice University), Ryuichi Sai (Rice University), John Mellor-Crummey (Rice University)
13:00
15m
Talk
ELFies: Executable Region Checkpoints for Performance Analysis and Simulation
Main Conference
Harish Patil (Intel, USA), Alexander Isaev (Intel), Wim Heirman (Intel), Alen Sabu (National University of Singapore), Ali Hajiabadi (National University of Singapore), Trevor E. Carlson (National University of Singapore)
13:15
15m
Talk
Vulkan Vision: Ray Tracing Workload Characterization using Automatic Graphics Instrumentation
Results Reproduced v1.1, Artifacts Evaluated – Functional v1.1
Main Conference
David Pankratz (University of Alberta), Tyler Nowicki (Huawei Technologies Canada), Ahmed Eltantawy (Huawei Technologies Canada), Jose Nelson Amaral (University of Alberta)
13:30 - 14:30
Business Meeting
Main Conference

Conference Day
Tue 2 Mar

09:00 - 10:00
Keynote (CGO)
Main Conference

Data Layout and Data Representation Optimizations to Reduce Data Movement

Code generation and optimization for the diversity of current and future architectures must focus on reducing data movement to achieve high performance. How data is laid out in memory, and representations that compress data (e.g., reduced floating point precision) have a profound impact on data movement. Moreover, the cost of data movement in a program is architecture-specific, and consequently, optimizing data layout and data representation must be performed by a compiler once the target architecture is known. With this context in mind, this talk will provide examples of data layout and data representation optimizations, and call for integrating these data properties into code generation and optimization systems.

Speaker: Mary Hall (University of Utah)

Mary Hall is a Professor and Director of the School of Computing at the University of Utah. She received a PhD in Computer Science from Rice University. Her research focus brings together compiler optimizations targeting current and future high-performance architectures on real-world applications. Hall’s prior work has developed compiler techniques for exploiting parallelism and locality on a diversity of architectures: automatic parallelization for SMPs, superword-level parallelism for multimedia extensions, processing-in-memory architectures, FPGAs, and more recently many-core CPUs and GPUs. Professor Hall is an IEEE Fellow, an ACM Distinguished Scientist, and a member of the Computing Research Association Board of Directors. She actively participates in mentoring and outreach programs to encourage the participation of women and other groups underrepresented in computer science.

10:00 - 11:00
Session #4: Parallelism - Optimizing, Modeling, Testing
Main Conference
Chair(s): Michael F. P. O'Boyle (University of Edinburgh)
10:00
15m
Talk
Loop Parallelization using Dynamic Commutativity Analysis
Main Conference
Christos Vasiladiotis (University of Edinburgh), Roberto Castañeda Lozano (University of Edinburgh), Murray Cole (University of Edinburgh, UK), Björn Franke (University of Edinburgh, UK)
10:15
15m
Talk
Fine-grained Pipeline Parallelization for Network Function Programs
Main Conference
Seungbin Song (Yonsei University), Heelim Choi (Yonsei University), Hanjun Kim (Yonsei University)
10:30
15m
Talk
YaskSite – Stencil Optimization Techniques Applied to Explicit ODE Methods on Modern Architectures
Results Reproduced v1.1, Artifacts Evaluated – Functional v1.1, Artifact Available v1.1
Main Conference
Christie Louis Alappat (Friedrich Alexander University, Erlangen-Nuremberg), Johannes Seiferth (University of Bayreuth), Georg Hager (Friedrich Alexander University, Erlangen-Nuremberg), Matthias Korch (University of Bayreuth), Thomas Rauber (University of Bayreuth), Gerhard Wellein (Friedrich Alexander University, Erlangen-Nuremberg)
10:45
15m
Talk
GoBench: a Benchmark Suite of Real-World Go Concurrency Bugs
Results Reproduced v1.1, Artifacts Evaluated – Functional v1.1, Artifact Available v1.1
Main Conference
Ting Yuan (Institute of Computing Technology, CAS), Guangwei Li (Institute of Computing Technology), JieLu, liuchen, Lian Li (Institute of Computing Technology at Chinese Academy of Sciences, China), Jingling Xue (UNSW Sydney)
11:00 - 11:10
Break (10min)
Main Conference
11:10 - 12:10
Session #5: Memory Optimization and Safeness
Main Conference
Chair(s): EunJung (EJ) Park (Los Alamos National Laboratory)
11:10
15m
Talk
Memory-Safe Elimination of Side Channels
Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1, Artifact Available v1.1
Main Conference
Luigi Soares (Federal University of Minas Gerais), Fernando Magno Quintão Pereira (Federal University of Minas Gerais)
11:25
15m
Talk
Variable-sized Blocks for Locality-aware SpMV
Main Conference
Naveen Namashivayam (HPE), Sanyam Mehta (HPE), Pen-Chung Yew (University of Minnesota)
11:40
15m
Talk
Object Versioning for Flow-Sensitive Pointer Analysis
Results Reproduced v1.1, Artifacts Evaluated – Functional v1.1, Artifact Available v1.1
Main Conference
Mohamad Barbar (University of Technology, Sydney), Yulei Sui (University of Technology Sydney), Shiping Chen (Data61 at CSIRO, Australia / UNSW, Australia)
11:55
15m
Talk
Scaling up the IFDS Algorithm with Efficient Disk-based Computing
Artifacts Evaluated – Functional v1.1, Artifact Available v1.1
Main Conference
Haofeng Li (Institute of Computing Technology, CAS; University of Chinese Academy of Sciences), Haining Meng (Institute of Computing Technology, CAS; University of Chinese Academy of Sciences), Hengjie Zheng (Institute of Computing Technology, Chinese Academy of Sciences), Liqing Cao (Institute of Computing Technology, Chinese Academy of Sciences), JieLu, Lian Li (Institute of Computing Technology at Chinese Academy of Sciences, China), Lin Gao (TianqiSoft Inc.)
12:10 - 12:30
Break (20min)
Main Conference
12:30 - 14:30
CGO Student Research Competition
Main Conference / Student Research Competition
12:30
10m
Talk
A New Memory Layout for Self-Rebalancing Trees
Student Research Competition
12:40
10m
Talk
Automatic Inspection of Program State for Debugging and Verification Purposes
Student Research Competition
José Wesley de Souza Magalhães (Federal University of Minas Gerais)
12:50
10m
Talk
Compiler Framework for Low Overhead Fork-Join Parallelism
Student Research Competition
13:00
10m
Talk
Data vs. Instructions: Runtime Code Generation for Convolutions
Student Research Competition
Malith Jayaweera (Northeastern University)
13:10
10m
Talk
Fast Structural Register Allocation
Student Research Competition
William Zhang (Carnegie Mellon University), Pranav Kumar (Carnegie Mellon University)
13:20
10m
Talk
Fine Grained Control of Program Transformations via Strategic Rewriting in MLIR
Student Research Competition
Martin Lücke (University of Edinburgh)
13:30
10m
Talk
Towards an Exploration Tool for Program Optimization Using Heuristic Search Algorithms
Student Research Competition
Johannes Lenfers (University of Münster)
13:40
10m
Talk
When Binary Optimization Meets Static Profiling
Student Research Competition
Angelica Aparecida Moreira (Universidade Federal de Minas Gerais)

Conference Day
Wed 3 Mar

09:00 - 10:00
Keynote (HPCA)
Main Conference

A Journey to a Commercial-Grade Processing-In-Memory (PIM) Chip Development

Emerging applications demand high off-chip memory bandwidth, but it becomes very expensive to further increase the bandwidth of off-chip memory under stringent physical constraints of chip packages and system boards. Besides, the energy efficiency of moving data across the memory hierarchy of processors has steadily worsened with the stagnant technology scaling and poor data reuse characteristics of the emerging applications. To cost-effectively increase the bandwidth and energy efficiency, researchers began to reconsider the past processing-in-memory (PIM) architectures and advance them further, especially with recent integration technologies such as 2.5D/3D stacking. Despite the recent advances, no major memory manufacturer had yet developed even proof-of-concept silicon, not to mention a product. In this talk, I will start by discussing various practical and technical challenges that have been overlooked by researchers and have prevented the industry from successfully commercializing PIM. Then I will present a practical PIM architecture that considers various aspects of successful commercialization in the near future. Finally, I will present a journey to the development of a commercial-grade PIM chip, which was designed based on a commercial HBM2, fabricated with a 20nm DRAM technology, integrated with unmodified commercial processors, and successfully ran various memory-bound machine learning applications with more than 2x improvement in system performance and a 70% reduction in system energy consumption.

Speaker: Nam Sung Kim (University of Illinois at Urbana-Champaign / Samsung Electronics)

Nam Sung Kim is a Senior Vice President at Samsung Electronics as well as a Professor at the University of Illinois. At Samsung he led the architecture definitions and designs of next-generation DRAM devices including HBM, LPDDR, DDR, and GDDR. He has published more than 200 refereed articles in highly selective conferences and journals in the fields of circuits, architecture, and computer-aided design. For his contributions to developing power-efficient computer architectures, he was elevated to IEEE Fellow and ACM Fellow in 2016 and 2021, respectively, and received the ACM SIGARCH/IEEE-CS TCCA Influential ISCA Paper Award in 2017. He is also a hall of fame member of all three major computer architecture conferences: ISCA, MICRO, and HPCA.

10:00 - 11:00
Session #6: Compiling Graph Algorithms, Compiling for GPUs
Main Conference
Chair(s): Maria Jesus Garzaran (Intel Corporation and University of Illinois at Urbana-Champaign)
10:00
15m
Talk
Compiling Graph Applications for GPUs with GraphIt
Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1, Artifact Available v1.1
Main Conference
Ajay Brahmakshatriya (Massachusetts Institute of Technology), Yunming Zhang, Changwan Hong (Massachusetts Institute of Technology), Shoaib Kamil (Adobe Research), Julian Shun (MIT), Saman Amarasinghe (Massachusetts Institute of Technology)
10:15
15m
Talk
Efficient Execution of Graph Algorithms on CPU with SIMD Extensions
Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1, Artifact Available v1.1
Main Conference
Ruohuang Zheng (University of Rochester), Sreepathi Pai (University of Rochester)
10:30
15m
Talk
r3d3: Optimized Query Compilation on GPUs
Results Reproduced v1.1, Artifacts Evaluated – Functional v1.1, Artifact Available v1.1
Main Conference
Alexander Krolik (McGill University, Canada), Clark Verbrugge (McGill University, Canada), Laurie Hendren (McGill University, Canada)
10:45
15m
Talk
C-for-Metal: High Performance SIMD Programming on Intel GPUs
Results Reproduced v1.1, Artifacts Evaluated – Functional v1.1, Artifact Available v1.1
Main Conference
Guei-Yuan Lueh (Intel Corporation), Kaiyu Chen (Intel Corporation), Gang Chen (Intel Corporation), Joel Fuentes (Intel Corporation), Wei-Yu Chen (Intel Corporation), Fangwen Fu (Intel Corporation), Hong Jiang (Intel Corporation), Hongzheng Li (Intel Corporation), Daniel Rhee (Intel Corporation)
11:00 - 11:10
Break (10min)
Main Conference
11:10 - 11:55
Session #7: Compiling for Spatial, Quantum, and Embedded Devices
Main Conference
Chair(s): Wei-Fen Lin (National Cheng Kung University)
11:10
15m
Talk
Relaxed Peephole Optimization: A Novel Compiler Optimization for Quantum Circuits
Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1, Artifact Available v1.1
Main Conference
Ji Liu (North Carolina State University), Luciano Bello (IBM Research), Huiyang Zhou (North Carolina State U.)
11:25
15m
Talk
StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems
Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1, Artifact Available v1.1
Main Conference
Johannes de Fine Licht, Andreas Kuster (ETH Zurich), Tiziano De Matteis (ETH Zurich), Tal Ben-Nun (Department of Computer Science, ETH Zurich), Dominic Hofer (ETH Zurich), Torsten Hoefler (ETH Zurich)
11:40
15m
Talk
Thread-aware Area-efficient High-level Synthesis Compiler for Embedded Devices
Main Conference
Changsu Kim (POSTECH), Shinnung Jeong (Yonsei University), Sungjun Cho (POSTECH), Yongwoo Lee (Yonsei University), William Song (Yonsei University), Youngsok Kim (Yonsei University), Hanjun Kim (Yonsei University)
11:55 - 12:10
Award Ceremony
Main Conference

Best Paper Award

Compiling Graph Applications for GPUs with GraphIt

Ajay Brahmakshatriya, Yunming Zhang, Changwan Hong, Shoaib Kamil, Julian Shun, Saman Amarasinghe

Test-of-Time Award

Level by Level: Making Flow- and Context-Sensitive Pointer Analysis Scalable for Millions of Lines of Code (CGO ’10)

Hongtao Yu, Jingling Xue, Wei Huo, Xiaobing Feng, Zhaoqing Zhang

12:10 - 12:30
Break (20min)
Main Conference
12:30 - 13:30
Session #8: JIT and Binary Translation; Optimizing for Code Size
Main Conference
Chair(s): Probir Roy (University of Michigan at Dearborn)
12:30
15m
Talk
HHVM Jump-Start: Boosting both Warmup and Steady-State Performance at Scale
Main Conference
Guilherme Ottoni (Facebook), Bin Liu (Facebook)
12:45
15m
Talk
Enhancing Atomic Instruction Emulation for Cross-ISA Dynamic Binary Translation
Results Reproduced v1.1, Artifacts Evaluated – Reusable v1.1, Artifact Available v1.1
Main Conference
Ziyi Zhao (Nankai University), Zhang Jiang (Nankai University), Xiaoli Gong (Nankai University), Ying Chen (Nankai University), Wenwen Wang (University of Georgia), Pen-Chung Yew (University of Minnesota)
13:00
15m
Talk
An Experience with Code-size Optimization for Production iOS Mobile Applications
Artifacts Evaluated – Reusable v1.1, Artifact Available v1.1
Main Conference
Milind Chabbi (Uber Technologies Inc.), Jin Lin (Uber Technologies), Raj Barik (Uber Technologies Inc.)
13:15
15m
Talk
AnghaBench: a Suite with One Million Compilable C Benchmarks for Code-Size Reduction
Main Conference
Anderson Faustino da Silva (State University of Maringá), Bruno Conde Kind (UFMG), José Wesley Magalhães (Federal University of Minas Gerais), Jeronimo Nunes Rocha (UFMG), Breno Guimaraes (UFMG), Fernando Magno Quintão Pereira (Federal University of Minas Gerais)
13:30 - 15:00
Joint Session Panel
Main Conference

Panelists: John L. Hennessy (Alphabet and Stanford), David Patterson (Google and U.C. Berkeley), Margaret Martonosi (NSF CISE and Princeton), Bill Dally (NVIDIA and Stanford), Natalie Enright Jerger (U. Toronto and ACM D&I Council), Kim Hazelwood (Facebook AI Research), Timothy M. Pinkston (USC)

13:30
90m
Talk
“Valuing Diversity, Equity, and Inclusion in Our Computing Community”
Main Conference

Call for Papers

The ACM Student Research Competition (SRC) offers a unique forum for undergraduate and graduate students to present their original research before a panel of judges and attendees at CGO. Participants must be undergraduate or graduate students pursuing an academic degree at the time of initial submission, and must be current student members of the ACM.

To participate in the competition, a student must submit a two-page extended abstract. The abstracts will be reviewed by a selection committee, and the authors of selected abstracts will be invited to present at a virtual presentation session. Short presentations (10 minutes + 5 minutes for questions) are evaluated by a jury during the session. Based on the submitted abstract and the presentation, a winner of CGO’s Student Research Competition will be selected and will receive an award. In addition, the winner will be invited to participate in the grand 2021 ACM SRC competition. Further information on the ACM SRC is available at src.acm.org.

Submissions in the form of an extended abstract are solicited on any topic relevant to the main conference, including:

  • Code Generation, Translation, Transformation, and Optimization for performance, energy, virtualization, portability, security, or reliability concerns, and architectural support
  • Efficient execution of dynamically typed and higher-level languages
  • Optimization and code generation for emerging programming models, platforms, domain-specific languages
  • Dynamic/static, profile-guided, feedback-directed, and machine learning-based optimization
  • Static, Dynamic, and Hybrid Analysis for performance, energy, memory locality, throughput or latency, security, reliability, or functional debugging
  • Program characterization methods
  • Efficient profiling and instrumentation techniques; architectural support
  • Novel and efficient tools
  • Compiler design, practice, and experience
  • Compiler abstraction and intermediate representations
  • Vertical integration of language features, representations, optimizations, and runtime support for parallelism
  • Solutions that involve cross-layer (HW/OS/VM/SW) design and integration
  • Deployed dynamic/static compiler and runtime systems for general-purpose, embedded system and Cloud/HPC platforms
  • Parallelism, heterogeneity, and reconfigurable architectures
  • Optimizations for heterogeneous or specialized targets, GPUs, SoCs, CGRA
  • Compiler-support for vectorization, thread extraction, task scheduling, speculation, transaction, memory management, data distribution, and synchronization


Deadline extended to Dec 19 2020 AoE.

Submission Site

Abstracts can be submitted at https://cgo2021src.hotcrp.com.

Submission Guidelines

  • Submissions must be original material that has not been previously published in another conference or journal, nor is currently under review by another conference or journal.

  • Your submission is limited to two (2) letter-size pages, including all text and figures. There is no page limit for references.

  • Please format your submission using the SIGPLAN format at http://www.sigplan.org/Resources/Author/. Please use the provided 8.5″x11″ single-spaced, double-column LaTeX or Word templates.

Please contact the CGO ’21 SRC chair if you need more information.