AXI4MLIR: User-Driven Automatic Host Code Generation for Custom AXI-Based Accelerators
This paper addresses the need for automatic and efficient generation of host driver code for arbitrary custom AXI-based accelerators targeting linear algebra algorithms, an important workload in applications such as machine learning and scientific computing. While existing tools have focused on automating accelerator prototyping, little attention has been paid to the host-accelerator interaction. This paper introduces AXI4MLIR, an extension of the MLIR compiler framework designed to facilitate the automated generation of host-accelerator driver code. With new MLIR attributes and transformations, AXI4MLIR lets users specify accelerator features (including their instructions) and communication patterns, and exploit the host memory hierarchy. We demonstrate AXI4MLIR's versatility across different types of accelerators and problems, showing reductions in CPU cache references of up to 56% and speedups of up to 1.65× compared to manually optimized driver code implementations. The AXI4MLIR implementation is open source and available at https://github.com/AXI4MLIR/axi4mlir.
Mon 4 Mar (displayed time zone: London)
16:10 - 17:30
16:10 20m Talk | AXI4MLIR: User-Driven Automatic Host Code Generation for Custom AXI-Based Accelerators | Main Conference | Nicolas Bohm Agostini (Northeastern University; Pacific Northwest National Laboratory), Jude Haris (University of Glasgow), Perry Gibson (University of Glasgow), Malith Jayaweera (Northeastern University), Norm Rubin (Northeastern University), Antonino Tumeo (Pacific Northwest National Laboratory), José L. Abellán (University of Murcia), José Cano (University of Glasgow), David Kaeli (Northeastern University)
16:30 20m Talk | Ecmas: Efficient Circuit Mapping and Scheduling for Surface Code | Main Conference | Mingzheng Zhu (University of Science and Technology of China), Hao Fu (University of Science and Technology of China), Jun Wu (University of Science and Technology of China), Chi Zhang (University of Science and Technology of China), Wei Xie (University of Science and Technology of China), Xiang-Yang Li (University of Science and Technology of China)
16:50 20m Talk | PresCount: Effective Register Allocation for Bank Conflict Reduction | Main Conference | Xiaofeng Guan (Shanghai Jiao Tong University; Shanghai Enflame Technology), Hao Zhou (Shanghai Enflame Technology), Guoqing Bao (Shanghai Enflame Technology), Handong Li (Shanghai Jiao Tong University), Liang Zhu (Shanghai Jiao Tong University), Jianguo Yao (Shanghai Jiao Tong University; Shanghai Enflame Technology)
17:10 20m Talk | Tackling the Matrix Multiplication Micro-kernel Generation with Exo | Main Conference | Adrián Castelló (Universitat Politècnica de València), Julian Bellavita (Cornell University), Grace Dinh (University of California at Berkeley), Yuka Ikarashi (Massachusetts Institute of Technology), Héctor Martínez (Universidad de Córdoba)