Sat 27 Feb. Displayed time zone: Eastern Time (US & Canada)
| 08:00 - 12:00 | Science, Art, Magic: Using and Developing The Graal Compiler Workshop (GRAAL) - Part I Workshops and Tutorials | ||
| 08:0060m Talk | Welcome and keynote: Performance Benchmarking GraalVM Workshops and Tutorials | ||
| 09:0030m Talk | Babashka: a native Clojure interpreter for scripting. Workshops and Tutorials | ||
| 09:3030m Talk | Truffle Startup and Warmup Challenges and Opportunities Workshops and Tutorials | ||
| 10:0030m Talk | One more gap bridged towards practice – support serialization feature in native image Workshops and Tutorials | ||
| 10:3030m Talk | Improving Compiler Optimizations by Employing Machine Learning Workshops and Tutorials | ||
| 11:0030m Talk | GraalVM at Facebook Workshops and Tutorials | ||
| 11:3030m Talk | Tracking Performance of Graal on Public Benchmarks Workshops and Tutorials | ||
| 09:00 - 13:00 | IMOP: a Self-Stabilizing Source-to-Source Compiler Framework for OpenMP C (IMOP) Workshops and Tutorials | ||
| 09:0015m Talk | Introduction to IMOP Workshops and Tutorials | ||
| 09:1540m Talk | Fundamental Representations (AST, CFG, and CG) Workshops and Tutorials | ||
| 09:5530m Talk | Scopes, Symbols, Types, and Environments Workshops and Tutorials | ||
| 10:255m Break | Break Workshops and Tutorials | ||
| 10:3040m Talk | Code Construction and Transformations Workshops and Tutorials | ||
| 11:1040m Talk | Data-flow Analyses Workshops and Tutorials | ||
| 11:5010m Break | Break Workshops and Tutorials | ||
| 12:0030m Talk | Concurrency Representations Workshops and Tutorials | ||
| 12:3020m Talk | Self-stabilization, and Z3-integration Workshops and Tutorials | ||
| 12:5010m Talk | Discussions and Q&A Workshops and Tutorials | ||
| 13:00 - 17:00 | Design Space Exploration (DSE) Workshops and Tutorials | ||
| 13:0010m Talk | Introduction Workshops and Tutorials | ||
| 13:1030m Talk | Design Space Exploration Workshops and Tutorials | ||
| 13:4025m Talk | Hands-on: HyperMapper Demo Workshops and Tutorials | ||
| 14:2030m Talk | The Spatial programming language and compiler Workshops and Tutorials | ||
| 14:5030m Talk | Hands-on: Spatial Demo Workshops and Tutorials | ||
| 15:3530m Talk | DSE advanced topics Workshops and Tutorials | ||
| 16:0530m Talk | DSE Use case Workshops and Tutorials | ||
| 16:3525m Talk | Discussions/panel - Q&A Workshops and Tutorials | ||
| 15:00 - 17:00 | Science, Art, Magic: Using and Developing The Graal Compiler Workshop (GRAAL) - Part II Workshops and Tutorials | ||
| 15:0030m Talk | Performance understanding tools for GraalVM using eBPF Workshops and Tutorials | ||
| 15:3030m Talk | Strato (Twitter PaaS) & Graal Native Image Workshops and Tutorials | ||
| 16:0060m Talk | Panel Session Workshops and Tutorials | ||
Sun 28 Feb
| 09:20 - 13:00 | |||
| 09:2010m Talk | Opening Workshops and Tutorials | ||
| 09:3030m Talk | Towards Automatic Scheduling for Tensorized Computation Workshops and Tutorials | ||
| 10:0030m Talk | Polyhedral Building Blocks for High-Performance Code Generation in MLIR Workshops and Tutorials | ||
| 10:3030m Talk | A high-performance polyhedral math library as a foundation for AI compilers Workshops and Tutorials | ||
| 11:0030m Break | Break Workshops and Tutorials | ||
| 11:3030m Talk | PolyDL: Polyhedral Compiler Optimizations for Deep Learning Workloads Workshops and Tutorials | ||
| 12:0030m Talk | Understanding the Poplar Graph Compiler for IPUs Workshops and Tutorials | ||
| 12:3030m Talk | Memory access planning for NPUs Workshops and Tutorials | ||
| 17:00 - 20:30 | |||
| 17:0030m Talk | Polyhedral compilation techniques for code generation on spatial architectures Workshops and Tutorials | ||
| 17:3030m Talk | Learning to optimize neural networks quickly Workshops and Tutorials | ||
| 18:0030m Talk | An MLIR-Based end-to-end dynamic shape compiler Workshops and Tutorials | ||
| 18:3030m Break | Break Workshops and Tutorials | ||
| 19:0030m Talk | Realize implicit GEMM-based convolutions on AMD GPU using MLIR Workshops and Tutorials | ||
| 19:3030m Talk | DPC++ Compiler and Performance Tuning for AI workloads Workshops and Tutorials | ||
| 20:0030m Talk | oneDNN Graph API: unify deep learning framework integration and maximize compute efficiency for multiple AI hardware Workshops and Tutorials | ||
Mon 1 Mar
| 08:45 - 09:00 | Opening Main Conference | ||
| 09:00 - 10:00 | Keynote (PPoPP) Main Conference Atomicity without Trust There is increasing interest in distributed systems where participants stand to benefit from cooperation but do not trust one another not to cheat. Although blockchain-based commerce is perhaps the most visible example of such systems, the problem of economic exchange among mutually untrusting autonomous parties is a fundamental one independent of particular technologies. This talk argues that such systems require rethinking our notions of correctness for distributed concurrency control and fault-tolerance. Addressing this challenge brings up questions familiar from classical distributed systems: how to combine multiple steps into a single atomic action, how to recover from failures, and how to coordinate concurrent access to data. Commerce among untrusting parties is a kind of fun-house mirror of classical distributed computing: familiar features are recognizable but distorted. For example, classical atomic transactions are often described in terms of the well-known ACID properties: atomicity, consistency, isolation, and durability. We will see that untrusting cooperation requires structures superficially similar to, but fundamentally different from, classical atomic transactions. Speaker: Maurice Herlihy (Brown University) | ||
| 11:00 - 11:10 | Break (10min) Main Conference | ||
| 11:10 - 12:10 | |||
| 11:1015m Talk | An Interval Compiler for Sound Floating Point Computations Main Conference Joao Rivera ETH Zurich, Franz Franchetti Carnegie Mellon University, USA, Markus Püschel ETH Zurich, Switzerland | ||
| 11:2515m Talk | Seamless Compiler Integration of Variable Precision Floating-Point Arithmetic Main Conference Tiago Jost Univ. Grenoble Alpes CEA, LIST, Grenoble, France, Yves Durand Univ. Grenoble Alpes CEA, LIST, Grenoble, France, Christian Fabre Univ. Grenoble Alpes CEA, LIST, Grenoble, France, Albert Cohen Google, Frédéric Pétrot Univ. Grenoble Alpes, CNRS, Grenoble INP, TIMA, Grenoble, France | ||
| 11:4015m Talk | UNIT: Unifying Tensorized Instruction Compilation Main Conference Jian Weng UCLA, Animesh Jain Amazon Web Services, Jie Wang, Leyuan Wang Amazon Web Services, USA, Yida Wang Amazon, Tony Nowatzki University of California, Los Angeles | ||
| 11:5515m Talk | Unleashing the Low-Precision Computation Potential of Tensor Cores on GPUs Main Conference Guangli Li Institute of Computing Technology, Chinese Academy of Sciences, Jingling Xue UNSW Sydney, Lei Liu Institute of Computing Technology, Chinese Academy of Sciences, Xueying Wang Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Xiu Ma Jilin University, Xiao Dong Institute of Computing Technology, Chinese Academy of Sciences, Jiansong Li Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Xiaobing Feng ICT CAS | ||
| 12:10 - 12:30 | Break (20min) Main Conference | ||
| 12:30 - 13:30 | Session #3: Binary Profiling, Tracing, Sampling Main Conference Chair(s): Wei Wang University of Texas at San Antonio, USA | ||
| 12:3015m Talk | Cinnamon: A Domain-Specific Language for Binary Profiling and Monitoring Main Conference Mahwish Arif University of Cambridge, Ruoyu Zhou University of Cambridge, Hsi-Ming Ho University of Sussex, Timothy M. Jones University of Cambridge, UK | ||
| 12:4515m Talk | GPA: A GPU Performance Advisor Based on Instruction Sampling Main Conference Keren Zhou Rice University, Xiaozhu Meng Rice University, Ryuichi Sai Rice University, John Mellor-Crummey Rice University | ||
| 13:0015m Talk | ELFies: Executable Region Checkpoints for Performance Analysis and Simulation Main Conference Harish Patil Intel, USA, Alexander Isaev Intel, Wim Heirman Intel, Alen Sabu National University of Singapore, Ali Hajiabadi National University of Singapore, Trevor E. Carlson National University of Singapore | ||
| 13:1515m Talk | Vulkan Vision: Ray Tracing Workload Characterization using Automatic Graphics Instrumentation Main Conference David Pankratz University of Alberta, Tyler Nowicki Huawei Technologies Canada, Ahmed Eltantawy Huawei Technologies Canada, Jose Nelson Amaral University of Alberta | ||
| 13:30 - 14:30 | Business Meeting Main Conference | ||
Tue 2 Mar
| 09:00 - 10:00 | Keynote (CGO) Main Conference Data Layout and Data Representation Optimizations to Reduce Data Movement Code generation and optimization for the diversity of current and future architectures must focus on reducing data movement to achieve high performance. How data is laid out in memory, and representations that compress data (e.g., reduced floating point precision) have a profound impact on data movement. Moreover, the cost of data movement in a program is architecture-specific, and consequently, optimizing data layout and data representation must be performed by a compiler once the target architecture is known. With this context in mind, this talk will provide examples of data layout and data representation optimizations, and call for integrating these data properties into code generation and optimization systems. Speaker: Mary Hall (University of Utah) Mary Hall is a Professor and Director of the School of Computing at the University of Utah. She received a PhD in Computer Science from Rice University. Her research focus brings together compiler optimizations targeting current and future high-performance architectures on real-world applications. Hall’s prior work has developed compiler techniques for exploiting parallelism and locality on a diversity of architectures: automatic parallelization for SMPs, superword-level parallelism for multimedia extensions, processing-in-memory architectures, FPGAs and more recently many-core CPUs and GPUs. Professor Hall is an IEEE Fellow, an ACM Distinguished Scientist and a member of the Computing Research Association Board of Directors. She actively participates in mentoring and outreach programs to encourage the participation of women and other groups underrepresented in computer science. | ||
| 10:00 - 11:00 | Session #4: Parallelism - Optimizing, Modeling, Testing Main Conference Chair(s): Michael F. P. O'Boyle University of Edinburgh | ||
| 10:0015m Talk | Loop Parallelization using Dynamic Commutativity Analysis Main Conference Christos Vasiladiotis University of Edinburgh, Roberto Castañeda Lozano University of Edinburgh, Murray Cole University of Edinburgh, UK, Björn Franke University of Edinburgh, UK | ||
| 10:1515m Talk | Fine-grained Pipeline Parallelization for Network Function Programs Main Conference | ||
| 10:3015m Talk | YaskSite – Stencil Optimization Techniques Applied to Explicit ODE Methods on Modern Architectures Main Conference Christie Louis Alappat Friedrich Alexander University, Erlangen-Nuremberg, Johannes Seiferth University of Bayreuth, Georg Hager Friedrich Alexander University, Erlangen-Nuremberg, Matthias Korch University of Bayreuth, Thomas Rauber University of Bayreuth, Gerhard Wellein Friedrich Alexander University, Erlangen-Nuremberg | ||
| 10:4515m Talk | GoBench: a Benchmark Suite of Real-World Go Concurrency Bugs Main Conference Ting Yuan Institute of Computing Technology, CAS, Guangwei Li Institute of Computing Technology, Jie Lu, Chen Liu, Lian Li Institute of Computing Technology at Chinese Academy of Sciences, China, Jingling Xue UNSW Sydney | ||
| 11:00 - 11:10 | Break (10min) Main Conference | ||
| 11:10 - 12:10 | Session #5: Memory Optimization and Safeness Main Conference Chair(s): EunJung (EJ) Park Los Alamos National Laboratory | ||
| 11:1015m Talk | Memory-Safe Elimination of Side Channels Main Conference Luigi Soares Federal University of Minas Gerais, Fernando Magno Quintão Pereira Federal University of Minas Gerais | ||
| 11:2515m Talk | Variable-sized Blocks for Locality-aware SpMV Main Conference | ||
| 11:4015m Talk | Object Versioning for Flow-Sensitive Pointer Analysis Main Conference Mohamad Barbar University of Technology, Sydney, Yulei Sui University of Technology Sydney, Shiping Chen Data61 at CSIRO, Australia / UNSW, Australia | ||
| 11:5515m Talk | Scaling up the IFDS Algorithm with Efficient Disk-based Computing Main Conference Haofeng Li Institute of Computing Technology, CAS; University of Chinese Academy of Sciences, Haining Meng Institute of Computing Technology, CAS; University of Chinese Academy of Sciences, Hengjie Zheng Institute of Computing Technology, Chinese Academy of Sciences, Liqing Cao Institute of Computing Technology, Chinese Academy of Sciences, Jie Lu, Lian Li Institute of Computing Technology at Chinese Academy of Sciences, China, Lin Gao TianqiSoft Inc. | ||
| 12:10 - 12:30 | Break (20min) Main Conference | ||
Wed 3 Mar
| 09:00 - 10:00 | Keynote (HPCA) Main Conference A Journey to a Commercial-Grade Processing-In-Memory (PIM) Chip Development Emerging applications demand high off-chip memory bandwidth, but it becomes very expensive to further increase the bandwidth of off-chip memory under stringent physical constraints of chip packages and system boards. Besides, the energy efficiency of moving data across the memory hierarchy of processors has steadily worsened with the stagnant technology scaling and poor data reuse characteristics of the emerging applications. To cost-effectively increase the bandwidth and energy efficiency, researchers began to reconsider the past processing-in-memory (PIM) architectures and advance them further, especially with recent integration technologies such as 2.5D/3D stacking. Despite the recent advances, no major memory manufacturer had yet developed even a proof-of-concept silicon, not to mention a product. In this talk, I will start by discussing various practical and technical challenges that have been overlooked by researchers and prevented the industry from successfully commercializing PIM. Then I will present a practical PIM architecture that considers various aspects of successful commercialization in the near future. Finally, I will present a journey to the development of a commercial-grade PIM chip, which was designed based on a commercial HBM2, fabricated with a 20nm DRAM technology, integrated with unmodified commercial processors, and successfully ran various memory-bound machine learning applications with more than 2x improvement in system performance and a 70% reduction in system energy consumption. Speaker: Nam Sung Kim (University of Illinois at Urbana-Champaign / Samsung Electronics) Nam Sung Kim is a Senior Vice President at Samsung Electronics as well as a Professor at the University of Illinois. At Samsung he led the architecture definitions and designs of next generation DRAM devices including HBM, LPDDR, DDR, and GDDR. He has published more than 200 refereed articles in highly selective conferences and journals in the fields of circuits, architecture, and computer-aided design. For his contributions to developing power-efficient computer architectures, he was elevated to IEEE Fellow and ACM Fellow in 2016 and 2021, respectively, and received the ACM SIGARCH/IEEE-CS TCCA Influential ISCA Paper Award in 2017. He is also a hall of fame member of all three major computer architecture conferences: ISCA, MICRO, and HPCA. | ||
| 10:00 - 11:00 | Session #6: Compiling Graph Algorithms, Compiling for GPUs Main Conference Chair(s): Maria Jesus Garzaran Intel Corporation and University of Illinois at Urbana-Champaign | ||
| 10:0015m Talk | Compiling Graph Applications for GPUs with GraphIt Main Conference Ajay Brahmakshatriya Massachusetts Institute of Technology, Yunming Zhang, Changwan Hong Massachusetts Institute of Technology, Shoaib Kamil Adobe Research, Julian Shun MIT, Saman Amarasinghe Massachusetts Institute of Technology | ||
| 10:1515m Talk | Efficient Execution of Graph Algorithms on CPU with SIMD Extensions Main Conference | ||
| 10:3015m Talk | r3d3: Optimized Query Compilation on GPUs Main Conference Alexander Krolik McGill University, Canada, Clark Verbrugge McGill University, Canada, Laurie Hendren McGill University, Canada | ||
| 10:4515m Talk | C-for-Metal: High Performance SIMD Programming on Intel GPUs Main Conference Guei-Yuan Lueh Intel Corporation, Kaiyu Chen Intel Corporation, Gang Chen Intel Corporation, Joel Fuentes Intel Corporation, Wei-Yu Chen Intel Corporation, Fangwen Fu Intel Corporation, Hong Jiang Intel Corporation, Hongzheng Li Intel Corporation, Daniel Rhee Intel Corporation | ||
| 11:00 - 11:10 | Break (10min) Main Conference | ||
| 11:10 - 11:55 | Session #7: Compiling for Spatial, Quantum, and Embedded Devices Main Conference Chair(s): Wei-Fen Lin National Cheng Kung University | ||
| 11:1015m Talk | Relaxed Peephole Optimization: A Novel Compiler Optimization for Quantum Circuits Main Conference Ji Liu North Carolina State University, Luciano Bello IBM Research, Huiyang Zhou North Carolina State U. | ||
| 11:2515m Talk | StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems Main Conference Johannes de Fine Licht, Andreas Kuster ETH Zurich, Tiziano De Matteis ETH Zurich, Tal Ben-Nun Department of Computer Science, ETH Zurich, Dominic Hofer ETH Zurich, Torsten Hoefler ETH Zurich | ||
| 11:4015m Talk | Thread-aware Area-efficient High-level Synthesis Compiler for Embedded Devices Main Conference Changsu Kim POSTECH, Shinnung Jeong Yonsei University, Sungjun Cho POSTECH, Yongwoo Lee Yonsei University, William Song Yonsei University, Youngsok Kim Yonsei University, Hanjun Kim Yonsei University | ||
| 11:55 - 12:10 | Award Ceremony Main Conference Best Paper Award: Compiling Graph Applications for GPUs with GraphIt, by Ajay Brahmakshatriya, Yunming Zhang, Changwan Hong, Shoaib Kamil, Julian Shun, Saman Amarasinghe. Test-of-Time Award: Level by Level: Making Flow- and Context-Sensitive Pointer Analysis Scalable for Millions of Lines of Code (CGO ’10), by Hongtao Yu, Jingling Xue, Wei Huo, Xiaobing Feng, Zhaoqing Zhang | ||
| 12:10 - 12:30 | Break (20min) Main Conference | ||
| 13:30 - 15:00 | Joint Session Panel Main Conference Panelists: John L. Hennessy Alphabet and Stanford, David Patterson Google and U.C. Berkeley, Margaret Martonosi NSF CISE and Princeton, Bill Dally NVIDIA and Stanford, Natalie Enright Jerger U. Toronto and ACM D&I Council, Kim Hazelwood Facebook AI Research, Timothy M. Pinkston USC | ||
| 13:3090m Talk | “Valuing Diversity, Equity, and Inclusion in Our Computing Community” Main Conference | ||