AXI4MLIR: User-Driven Automatic Host Code Generation for Custom AXI-Based Accelerators, N. Agostini, J. Haris, P. Gibson, M. Jayaweera, N. Rubin, A. Tumeo, J. Abellán, J. Cano, D. Kaeli, Pre-print, Artifact
PresCount: Effective Register Allocation for Bank Conflict Reduction, Xiaofeng Guan, Hao Zhou, Guoqing Bao, Handong Li, Liang Zhu, Jianguo Yao, Pre-print
Retargeting and Respecializing GPU Workloads for Performance Portability, I. Ivanov, O. Zinenko, J. Domke, T. Endo, W. Moses, Video, Artifact
SCHEMATIC: Compile-Time Checkpoint Placement and Memory Allocation for Intermittent Systems, H. Reymond, J. Béchennec, M. Briday, S. Faucou, I. Puaut, E. Rohou, Pre-print, Video, Artifact
Compile-time Analysis of Compiler Frameworks for Query Compilation, A. Engelke, T. Schwarz, Pre-print, Artifact
A Tensor Algebra Compiler for Sparse Differentiation, A. Shaikhha, M. Huot, S. Hashemian
Tackling the Matrix Multiplication Micro-kernel Generation with Exo, A. Castelló, J. Bellavita, G. Dinh, Y. Ikarashi, H. Martínez, Pre-print, Artifact
Distinguished Paper Award: A System-Level Dynamic Binary Translator Using Automatically-Learned Translation Rules, Jinhu Jiang, Chaoyi Liang, Rongchao Dong, Zhongjun Zhou, Wenwen Wang, Penchung Yew, Weihua Zhang, Pre-print
TapeFlow: Streaming Gradient Tapes in Automatic Differentiation, M. Hakimi, A. Shriraman
A Framework for Fine-Grained Synchronization of Dependent GPU Kernels, A. Jangda, S. Maleki, M. Dehnavi, M. Musuvathi, O. Saarikivi, Pre-print, Artifact
Instruction Scheduling for the GPU on the GPU, G. Shobaki, P. Muyan-Ozcelik, J. Hutton, B. Linck, V. Malyshenko, A. Kerbow, R. Ramirez-Ortega, V. Gordon
Experiences Building an MLIR-Based SYCL Compiler, E. Tiotto, V. Perez, W. Tsang, L. Sommer, J. Oppermann, V. Lomüller, M. Goli, J. Brodman, Pre-print, Artifact
SLaDe: A Portable Small Language Model Decompiler for Optimized Assembly, J. Armengol-Estapé, J. Woodruff, C. Cummins, M. O'Boyle, Pre-print, Artifact
Distinguished Paper Award: Revealing Compiler Heuristics through Automated Discovery and Optimization, V. Seeker, C. Cummins, M. Cole, B. Franke, K. Hazelwood, H. Leather
OptiWISE: Combining Sampling and Instrumentation for Granular CPI Analysis, Y. Guo, A. Chadwick, M. Erdos, U. Bora, I. Vougioukas, G. Gabrielli, T. Jones, Artifact
Distinguished Paper Award: JITSPMM: Just-in-Time Instruction Generation for Accelerated Sparse Matrix-Matrix Multiplication, Q. Fu, T. Rolinger, H. Huang, Pre-print, Artifact
Energy-Aware Tile Size Selection for Affine Programs on GPUs, M. Jayaweera, M. Kong, Y. Wang, D. Kaeli, Pre-print, Artifact
Ecmas: Efficient Circuit Mapping and Scheduling for Surface Code, M. Zhu, H. Fu, J. Wu, C. Zhang, W. Xie, X. Li, Pre-print, Video
Enhancing Performance Through Control-flow Unmerging and Loop Unrolling on GPUs, A. Murtovi, G. Georgakoudis, K. Parasyris, C. Liao, I. Laguna, B. Steffen
Latent Idiom Recognition for a Minimalist Functional Array Language using Equality Saturation, J. Van der Cruysse, C. Dubach, Pre-print, Artifact
EasyView: Bringing Performance Profiles into Integrated Development Environments, Q. Zhao, M. Chabbi, X. Liu, Pre-print, Artifact
Distinguished Paper Award: oneDNN Graph Compiler: A Hybrid Approach for High-Performance Deep Learning Compilation, J. Li, Z. Qin, Y. Mei, J. Cui, Y. Song, C. Chen, Y. Zhang, L. Du, X. Cheng, B. Jin, Y. Zhang, J. Ye, E. Lin, D. Lavery, I. Safonov
Enabling Fine-Grained Incremental Builds By Making Compiler Stateful, R. Han, J. Zhao, H. Kim, Pre-print
Revamping Sampling-Based PGO with Context-Sensitivity and Pseudo-Instrumentation, W. He, H. Yu, L. Wang, T. Oh, Pre-print
DrPy: Pinpointing Inefficient Memory Usage in Multi-Layer Python Applications, J. Cui, Q. Zhao, Y. Hao, X. Liu
PolyTOPS: Reconfigurable and Flexible Polyhedral Scheduler, G. Consolaro, Z. Zhang, H. Razanajato, N. Lossing, N. Tchoulak, A. Susungi, A. Alves, D. Barthou, R. Zhang, C. Ancourt, C. Bastoul, Pre-print, Artifact
Representing Data Collections in an SSA Form, T. McMichen, N. Greiner, P. Zhong, F. Sossai, A. Patel, S. Campanoni, Pre-print
Compiler Testing with Relaxed Memory Models, L. Geeson, L. Smith, Pre-print, Artifact
BEC: Bit-level Static Analysis for Reliability against Soft Errors, Y. Ko, B. Burgstaller, Pre-print, Artifact
Boosting the Performance of Multi-solver IFDS Algorithms with Flow-Sensitivity Optimizations, H. Li, J. Lu, H. Meng, L. Cao, L. Li, L. Gao, Pre-print
AskIt: Unified Programming Interface for Programming with Large Language Models, K. Okuda, S. Amarasinghe, Artifact
Unveiling and Vanquishing Goroutine Leaks in Enterprise Microservices: A Dynamic Analysis Approach, G. Saioc, D. Shirchenko, M. Chabbi, Pre-print, Video, Artifact
Seer: Predictive Runtime Kernel Selection for Irregular Problems, R. Swann, M. Osama, K. Sangaiah, J. Mahmud
High-Throughput, Formal-Methods-Assisted Fuzzing for LLVM, Y. Fan, J. Regehr
Whose baseline (compiler) is it anyway?, B. Titzer, Pre-print
EasyTracker: A Python Library for Controlling and Inspecting Program Execution, Théo Barollet, Christophe Guillon, Manuel Selva, François Broquedis, Florent Bouchez-Tichadou, Fabrice Rastello, Pre-print, Artifact
One Automaton To Rule Them All: Beyond Multiple Regular Expressions Execution, L. Cicolini, F. Carloni, M. Santambrogio, D. Conficconi, Pre-print, Video, Artifact