Fusion of Operators of Computational Graphs via Greedy Clustering: The XNNC Experience
Tensor compilers like XLA, TVM, and TensorRT operate on computational graphs, where vertices represent operations and edges represent data flow between these operations. Operator fusion is a compiler optimization that merges operators within the computational graph to improve their efficiency. This paper presents the operator fusion algorithm recently deployed in the Xtensa Neural Network Compiler (XNNC)—Cadence Tensilica’s tensor compiler. The algorithm clusters nodes within the computational graph and iteratively grows these clusters until reaching a fixed point. A priority queue, sorted by the estimated profitability of merging cluster candidates, guides this iterative process. It balances precision and practicality, producing more efficient model implementations than XNNC’s previous fusion approach, which was based on a depth-first traversal of the computational graph. Moreover, unlike recently proposed exhaustive or evolutionary search methods, this algorithm terminates quickly while often yielding equally efficient models.
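The clustering loop described in the abstract can be illustrated with a minimal Python sketch. It is not XNNC's implementation: the abstract does not specify the profitability model or the legality rules, so the sketch assumes a hypothetical profit equal to the bytes flowing between two clusters, a hypothetical `max_cluster_size` limit, and a check that a merge keeps the cluster graph acyclic. The function name `greedy_fuse` and its parameters are likewise illustrative.

```python
import heapq
from collections import defaultdict


def greedy_fuse(num_nodes, edges, max_cluster_size=4):
    """Greedy operator clustering (illustrative sketch, not XNNC's code).

    num_nodes: operators are numbered 0..num_nodes-1.
    edges: dict {(src, dst): nbytes} describing a DAG; nbytes is the
           assumed profit of keeping that tensor inside one cluster.
    Returns the final clusters as a sorted list of sorted node lists.
    """
    cluster_of = list(range(num_nodes))            # node -> cluster id
    members = {i: {i} for i in range(num_nodes)}   # cluster id -> nodes
    next_id = num_nodes                            # fresh id per merge

    def cluster_edges():
        # Sum edge weights between distinct clusters.
        w = defaultdict(int)
        for (s, d), nbytes in edges.items():
            cs, cd = cluster_of[s], cluster_of[d]
            if cs != cd:
                w[(cs, cd)] += nbytes
        return w

    def creates_cycle(a, b, w):
        # Merging a -> b is illegal if b is also reachable from a
        # through some third cluster.
        succ = defaultdict(set)
        for s, d in w:
            succ[s].add(d)
        stack = [c for c in succ[a] if c != b]
        seen = set(stack)
        while stack:
            c = stack.pop()
            if c == b:
                return True
            for n in succ[c] - seen:
                seen.add(n)
                stack.append(n)
        return False

    # Max-heap of merge candidates, keyed by negated profit.
    heap = [(-p, a, b) for (a, b), p in cluster_edges().items()]
    heapq.heapify(heap)

    while heap:  # an empty queue means a fixed point was reached
        _, a, b = heapq.heappop(heap)
        if a not in members or b not in members:
            continue  # stale candidate: one side was already merged
        if len(members[a]) + len(members[b]) > max_cluster_size:
            continue  # assumed resource limit
        if creates_cycle(a, b, cluster_edges()):
            continue
        merged = members.pop(a) | members.pop(b)
        members[next_id] = merged
        for n in merged:
            cluster_of[n] = next_id
        # Re-evaluate candidates that touch the new cluster.
        for (s, d), p in cluster_edges().items():
            if next_id in (s, d):
                heapq.heappush(heap, (-p, s, d))
        next_id += 1

    return sorted(sorted(c) for c in members.values())


# Diamond-shaped graph: 0 -> 1 -> 3 and 0 -> 2 -> 3.
diamond = {(0, 1): 100, (0, 2): 10, (1, 3): 50, (2, 3): 10}
print(greedy_fuse(4, diamond, max_cluster_size=2))
```

In this sketch a rejected candidate is never revisited: cluster sizes only grow, and a blocking path can only disappear when one of the two endpoints is itself merged, which produces a fresh candidate under a new cluster id. That is what lets the loop stop as soon as the queue is empty.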
Sat 1 Mar (displayed time zone: Pacific Time (US & Canada))
10:30 - 12:00 | Compilers and Optimization | Main Conference at Acacia A | Chair(s): Jens Palsberg, University of California, Los Angeles (UCLA)
10:30 | 30m Talk | pyATF: Constraint-Based Auto-Tuning in Python | Main Conference | Richard Schulze (University of Muenster), Sergei Gorlatch (University of Muenster), Ari Rasch (University of Muenster)
11:00 | 30m Talk | Overloading the Dot | Main Conference
11:30 | 30m Talk | Fusion of Operators of Computational Graphs via Greedy Clustering: The XNNC Experience | Main Conference | Michael Canesche (Cadence Design Systems), Vanderson Martins do Rosario (Cadence Design Systems), Edson Borin (State University of Campinas), Fernando Magno Quintão Pereira (Federal University of Minas Gerais)