Matching Linear Algebra and Tensor Code to Specialized Hardware Accelerators
Dedicated tensor accelerators demonstrate the importance of linear algebra in modern applications. Such accelerators have the potential for impressive performance gains, but require programmers to rewrite code using vendor APIs, which is a barrier to wider adoption. Recent work overcomes this by matching and replacing patterns within code, but such approaches are fragile and fail to cope with the diversity of real-world codes.
We develop ATC, a compiler that uses program synthesis to map regions of code to specific APIs. The mapping space that ATC explores is combinatorially large, requiring the development of program classification, dynamic analysis, variable constraint generation and lexical distance matching techniques to make it tractable.
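To make the idea of a mapping space concrete, the following is a minimal sketch and not ATC's implementation. It assumes a toy setting in which a hand-written matrix multiply must be bound to a GEMM-style call. The function `api_gemm`, its signature, and all variable names are hypothetical. Brute-force enumeration stands in for program synthesis, a type filter stands in for variable constraint generation, name similarity from Python's `difflib` stands in for lexical distance matching, and an input/output comparison stands in for dynamic analysis.

```python
import copy
import itertools
import math
import random
from difflib import SequenceMatcher


def user_matmul(env):
    """Hand-written user region: C_out = A_mat * B_mat on nested lists."""
    for i in range(env["m_rows"]):
        for j in range(env["n_cols"]):
            acc = 0.0
            for p in range(env["k_inner"]):
                acc += env["A_mat"][i][p] * env["B_mat"][p][j]
            env["C_out"][i][j] = acc


def api_gemm(m, n, k, A, B, C):
    """Stand-in for an accelerator GEMM call (hypothetical signature): C = A * B."""
    for i in range(m):
        for j in range(n):
            C[i][j] = sum(A[i][p] * B[p][j] for p in range(k))


def name_similarity(binding):
    """Sum of name similarities between each API parameter and its bound variable."""
    return sum(SequenceMatcher(None, param.lower(), var.lower()).ratio()
               for param, var in binding.items())


def candidate_bindings(env, int_params, array_params):
    """Enumerate type-consistent bindings, most lexically similar first."""
    ints = [v for v, x in env.items() if isinstance(x, int)]
    arrays = [v for v, x in env.items() if isinstance(x, list)]
    cands = []
    for ip in itertools.permutations(ints, len(int_params)):
        for ap in itertools.permutations(arrays, len(array_params)):
            cands.append(dict(zip(int_params + array_params, ip + ap)))
    return sorted(cands, key=name_similarity, reverse=True)


def same_state(env_a, env_b):
    """Compare every array in the two environments element by element."""
    for name, val in env_a.items():
        if isinstance(val, list):
            flat_a = [x for row in val for x in row]
            flat_b = [x for row in env_b[name] for x in row]
            if len(flat_a) != len(flat_b):
                return False
            if not all(math.isclose(x, y, abs_tol=1e-9) for x, y in zip(flat_a, flat_b)):
                return False
    return True


def find_binding(env):
    """Return the first binding whose API call reproduces the user code's effect."""
    expected = copy.deepcopy(env)
    user_matmul(expected)
    attempts = 0
    for binding in candidate_bindings(env, ["m", "n", "k"], ["A", "B", "C"]):
        attempts += 1
        trial = copy.deepcopy(env)
        try:
            api_gemm(**{param: trial[var] for param, var in binding.items()})
        except IndexError:
            continue  # dimensions bound to the wrong parameters
        if same_state(expected, trial):
            return binding, attempts
    return None, attempts


def make_env(rows=2, inner=3, cols=4, seed=0):
    """Random test inputs with distinct dimensions so wrong bindings are exposed."""
    rng = random.Random(seed)
    return {
        "m_rows": rows, "k_inner": inner, "n_cols": cols,
        "A_mat": [[rng.uniform(-1, 1) for _ in range(inner)] for _ in range(rows)],
        "B_mat": [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(inner)],
        "C_out": [[0.0] * cols for _ in range(rows)],
    }


binding, attempts = find_binding(make_env())
print(binding, attempts)
```

Even in this toy case there are 36 type-consistent bindings for six variables. Ordering candidates by name similarity is one simple way to reach a working binding early. Passing a test on random inputs is evidence of equivalence, not a proof of it.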
We apply ATC to real-world tensor and linear algebra codes and evaluate it against four state-of-the-art approaches. ATC accelerates between 2.6x and 7x more programs than these approaches, leading to over an order of magnitude performance improvement.
Sat 25 Feb (displayed time zone: Eastern Time, US & Canada)
14:20 - 15:20
A Sound and Complete Algorithm for Code Generation in Distance-Based ISA
Shu Sugita (University of Tokyo), Toru Koizumi (University of Tokyo), Ryota Shioya (University of Tokyo), Hidetsugu Irie (University of Tokyo), Shuichi Sakai (University of Tokyo)
Matching Linear Algebra and Tensor Code to Specialized Hardware Accelerators
Pablo Antonio Martínez (University of Murcia), Jackson Woodruff (University of Edinburgh), Jordi Armengol-Estapé (University of Edinburgh), Gregorio Bernabé (University of Murcia), José Manuel García (University of Murcia), Michael F. P. O'Boyle (University of Edinburgh)
Torchy: A Tracing JIT Compiler for PyTorch
Nuno P. Lopes (INESC-ID; Instituto Superior Técnico - University of Lisbon)