CC 2025
Sat 1 - Sun 2 March 2025

Automatic differentiation (AD) is at the core of all machine learning frameworks and also has applications in scientific computing. Theoretical research on reverse-mode AD focuses on functional, higher-order languages, which allows AD to be formulated as a series of local, concise program rewrites. These theoretical approaches prioritize correctness but disregard efficiency. Practical implementations, by contrast, employ mutation and taping techniques to achieve efficiency, at the cost of intricate, low-level, and non-local program transformations.
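To make the functional formulation concrete, the following Haskell sketch illustrates reverse-mode AD as local rewrites: each primitive is paired with its pullback, and composing two functions rewrites locally into composing their pullbacks in reverse order. This is an illustration of the general technique only, not MimIrADe's actual rewrite system; the names D, compose, and grad are ours.

```haskell
-- A differentiable function from a to b maps an input to the primal
-- result together with a pullback: a linear map carrying an output
-- adjoint back to an input adjoint.
newtype D a b = D { runD :: a -> (b, b -> a) }

-- Sequential composition is a purely local rewrite:
-- primals compose forward, pullbacks compose in reverse order.
compose :: D b c -> D a b -> D a c
compose (D g) (D f) = D $ \x ->
  let (y, pbF) = f x
      (z, pbG) = g y
  in (z, pbF . pbG)

-- Primitive rewrites: each operation carries its local pullback.
dSin :: D Double Double
dSin = D $ \x -> (sin x, \dz -> cos x * dz)

dSquare :: D Double Double
dSquare = D $ \x -> (x * x, \dz -> 2 * x * dz)

-- grad evaluates the primal and pulls the adjoint 1 back to the input.
grad :: D Double Double -> Double -> Double
grad (D f) x = snd (f x) 1

main :: IO ()
main = print (grad (compose dSin dSquare) 2.0)
-- d/dx sin(x^2) = 2x * cos(x^2)
```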

In this work, we introduce MimIrADe, a functionally inspired AD technique implemented within a higher-order, graph-based (“sea of nodes”) intermediate representation (IR). Our method combines a streamlined implementation with standard optimizations, resulting in an efficient AD system. The higher-order nature of the IR lets us employ concise functional AD methods, expressing AD through local rewrites. This locality makes modular high-level extensions, such as matrix operations, straightforward. Additionally, the graph-based structure of the IR ensures that critical implementation aspects, particularly the handling of shared pullback invocations, are managed naturally and efficiently. Our AD pass supports a comprehensive set of features, including non-scalar types, pointers, and higher-order recursive functions.
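As a minimal sketch of why sharing pullback invocations matters (ours, not taken from the paper): when an intermediate value has several uses, reverse mode sums the adjoints from all use sites and invokes that value's pullback once. In a sea-of-nodes IR the shared node makes this sharing structural, where a purely term-based transform would have to duplicate the invocation or thread a tape.

```haskell
-- Differentiate f(x) = sin(x^2) + cos(x^2), where the intermediate
-- y = x^2 is shared by two uses. The adjoints from both use sites are
-- summed, and y's pullback pbY is invoked exactly once; in a
-- graph-based IR this single invocation is shared structurally.
sharedGrad :: Double -> Double
sharedGrad x =
  let (y, pbY) = (x * x, \dy -> 2 * x * dy)            -- shared node
      (_, pbS) = (sin y, \dz -> cos y * dz)            -- first use
      (_, pbC) = (cos y, \dz -> negate (sin y) * dz)   -- second use
  in pbY (pbS 1 + pbC 1)  -- sum use-site adjoints, one pbY call

main :: IO ()
main = print (sharedGrad 2.0)
-- d/dx (sin(x^2) + cos(x^2)) = 2x * (cos(x^2) - sin(x^2))
```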

We demonstrate on standard benchmarks that a suite of common optimizations effectively eliminates the overhead typically associated with functional AD approaches, producing differentiated code that performs on par with leading mutation- and taping-based techniques. At the same time, MimIrADe’s implementation is an order of magnitude less complex than that of its contenders.