LAGrad: Statically Optimized Differentiable Programming in MLIR
Automatic differentiation (AD) is a central algorithm in deep learning and the emerging field of differentiable programming. However, the performance of AD remains a significant bottleneck in these fields. Training large models requires repeatedly evaluating gradients via AD, potentially millions of times. Additionally, the most common form of AD, reverse mode, incurs a memory cost that is asymptotically larger than that of the original function being differentiated.
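To make the memory cost concrete, the sketch below shows a toy tape-based reverse-mode differentiation of f(x) = sin(x)^2 in Python. It is purely illustrative and is not LAGrad's implementation: it only demonstrates why naive reverse mode records an intermediate value for every primal operation, so memory grows with the length of the computation rather than with the size of the original function.

```python
# Minimal reverse-mode AD sketch (illustrative only, not LAGrad's implementation).
# The "tape" records every intermediate of the forward pass so the backward
# pass can replay it in reverse; memory therefore grows with the number of
# primal operations.
import math

def f_with_tape(x):
    """Forward pass of f(x) = sin(x)^2, recording intermediates on a tape."""
    tape = []
    s = math.sin(x)
    tape.append(("sin", x))      # stored for the backward pass
    y = s * s
    tape.append(("square", s))   # stored for the backward pass
    return y, tape

def backward(tape, dy=1.0):
    """Walk the tape in reverse, accumulating the adjoint of the input."""
    grad = dy
    for op, arg in reversed(tape):
        if op == "square":
            grad = grad * 2.0 * arg        # d(s^2)/ds = 2s
        elif op == "sin":
            grad = grad * math.cos(arg)    # d(sin x)/dx = cos x
    return grad

y, tape = f_with_tape(0.5)
print(y, backward(tape))   # f(0.5) and f'(0.5) = 2*sin(0.5)*cos(0.5)
```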
This paper introduces LAGrad, a reverse-mode, source-to-source AD system that leverages high-level information in MLIR to produce efficient differentiated code. LAGrad employs a collection of novel static optimizations that benefit from the semantics of high-level MLIR dialects to exploit the sparsity and structured control flow of generated code.
Using these optimizations, LAGrad achieves speedups of up to $2.8\times$ and uses $35\times$ less memory relative to state-of-the-art AD systems on real-world machine learning and computer vision benchmarks.
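As an illustration of the kind of structural sparsity the abstract refers to (a hypothetical example, not LAGrad's generated code): the Jacobian of an elementwise operation such as ReLU is diagonal, so a structure-aware adjoint can use an elementwise multiply instead of materializing a dense n-by-n Jacobian.

```python
# Sketch of structural sparsity in a gradient (illustrative, not LAGrad output).
# For an elementwise op y = relu(x), the Jacobian is diagonal, so the
# vector-Jacobian product needs only an elementwise multiply.
import numpy as np

def relu_vjp_dense(x, dy):
    """Naive adjoint: materialize the full (diagonal) Jacobian. O(n^2) memory."""
    J = np.diag((x > 0).astype(x.dtype))
    return dy @ J

def relu_vjp_sparse(x, dy):
    """Structure-aware adjoint: keep only the diagonal. O(n) memory."""
    return dy * (x > 0)

x = np.array([-1.0, 0.5, 2.0, -3.0])
dy = np.ones_like(x)
assert np.allclose(relu_vjp_dense(x, dy), relu_vjp_sparse(x, dy))
```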
Sun 26 Feb (times in Eastern Time, US & Canada)

11:20 - 12:20 | Optimizations (Research Papers), St. Laurent 3
Chair: Louis-Noël Pouchet, Colorado State University, USA

11:20 (20 min talk) A Hotspot-Driven Semi-automated Competitive Analysis Framework for Identifying Compiler Key Optimizations
Wenlong Mu, Yilei Zhang, Bo Huang, Jianmei Guo (East China Normal University); Shiqiang Cui (Hangzhou Hongjun Microelectronics Technology)

11:40 (20 min talk) LAGrad: Statically Optimized Differentiable Programming in MLIR

12:00 (20 min talk) Lazy Evaluation for the Lazy: Automatically Transforming Call-by-Value into Call-by-Need
Breno Campos Ferreira Guimarães, Fernando Magno Quintão Pereira (Federal University of Minas Gerais)