LLM Compiler: Foundation Language Models for Compiler Optimization (CC 2025 - Main Conference)

Who

Chris Cummins, Volker Seeker, Dejan Grubisic, Baptiste Rozière, Jonas Gehring, Gabriel Synnaeve, Hugh Leather

Track

CC 2025 Main Conference

Time Zone

The program is currently displayed in (GMT-08:00) Pacific Time (US & Canada).

When

Sat 1 Mar 2025 15:00 - 15:30 at Acacia A - Machine Learning and PL I. Chair(s): Sara Achour

Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities across a variety of software engineering and coding tasks. However, their application in the domain of code and compiler optimization remains underexplored. Training LLMs is resource-intensive, requiring substantial GPU hours and extensive data collection, which can be prohibitive. To address this gap, we introduce LLM Compiler, a suite of robust, openly available, pre-trained models specifically designed for compiler tasks. Built on the foundation of Code Llama, LLM Compiler enhances the understanding of compiler intermediate representations (IRs), assembly language, and optimization techniques. The models have been trained on a vast corpus of 546 billion tokens of LLVM-IR and assembly code and have undergone instruction fine-tuning to interpret compiler behavior.

To demonstrate the utility of these research tools, we also present fine-tuned versions of the models with enhanced capabilities in optimizing code size and disassembling from x86_64 and ARM assembly back into LLVM-IR. These achieve 77% of the optimizing potential of an autotuning search, and 45% disassembly round-trip (14% exact match).

LLM Compiler is released under a bespoke commercial license to allow wide reuse and is available in two sizes: 7 billion and 13 billion parameters. Our aim is to provide scalable, cost-effective foundational tools for further research and development in compiler optimization by both academic researchers and industry practitioners. Since we released LLM Compiler, the community has quantized, repackaged, and downloaded the models over 250k times.

Chris Cummins, Meta, United States
Volker Seeker, Meta AI Research, United States
Dejan Grubisic, Meta
Baptiste Rozière, Meta
Jonas Gehring, Meta
Gabriel Synnaeve, Meta
Hugh Leather, Meta AI Research, United States

Session Program

Sat 1 Mar
Displayed time zone: Pacific Time (US & Canada)

14:00 - 15:30	Machine Learning and PL I, Main Conference, at Acacia A. Chair(s): Sara Achour (Stanford University)

14:00, 30m, Talk: DFA-Net: A Compiler-Specific Neural Architecture for Robust Generalization in Data Flow Analyses (Main Conference)
    Alexander Brauckmann (University of Edinburgh), Anderson Faustino da Silva (State University of Maringá), Jeronimo Castrillon (TU Dresden, Germany), Hugh Leather (Meta AI Research)
14:30, 30m, Talk: Finding Missed Code Size Optimizations in Compilers using Large Language Models (Main Conference)
    Davide Italiano (Meta), Chris Cummins (Meta)
15:00, 30m, Talk: LLM Compiler: Foundation Language Models for Compiler Optimization (Main Conference)
    Chris Cummins (Meta), Volker Seeker (Meta AI Research), Dejan Grubisic (Meta), Baptiste Rozière (Meta), Jonas Gehring (Meta), Gabriel Synnaeve (Meta), Hugh Leather (Meta AI Research)