SLaDe: A Portable Small Language Model Decompiler for Optimized Assembly
Decompilation is a well-studied area with numerous high-
quality tools available. These are frequently used for security
tasks and to port legacy code. However, they regularly generate
difficult-to-read programs and require a large amount of
engineering effort to support new programming languages
and ISAs. Recent interest in neural approaches has produced
portable tools that generate readable code. Nevertheless, to date
such techniques are usually restricted to synthetic programs
without optimization, and the portability of these models has not
been evaluated. Furthermore, while the generated code may be
more readable, it is usually incorrect.
This paper presents SLaDe, a Small Language model
Decompiler based on a sequence-to-sequence Transformer
trained over real-world code and augmented with a type
inference engine. We utilize a novel tokenizer, dropout-free
regularization, and type inference to generate programs that
are more readable and accurate than standard analytic and
recent neural approaches. Unlike standard approaches, SLaDe
can infer out-of-context types, and unlike neural approaches, it
generates correct code.
We evaluate SLaDe on over 4,000 ExeBench functions on
two ISAs and at two optimization levels. SLaDe is up to 6×
more accurate than Ghidra, a state-of-the-art, industrial-strength
decompiler, and up to 4× more accurate than the large language
model ChatGPT, and it generates significantly more readable code
than both.
Mon 4 Mar
11:30 - 12:50 | Machine-Learning Guided Optimizations | Main Conference at Tinto | Chair(s): Zheng Wang (University of Leeds)
11:30 | 20m Talk | AskIt: Unified Programming Interface for Programming with Large Language Models | Main Conference | Katsumi Okuda (Massachusetts Institute of Technology; Mitsubishi Electric Corporation), Saman Amarasinghe (Massachusetts Institute of Technology)
11:50 | 20m Talk | Revealing Compiler Heuristics through Automated Discovery and Optimization | Main Conference | Volker Seeker (Meta AI Research), Chris Cummins (Meta AI Research), Murray Cole (University of Edinburgh), Björn Franke (University of Edinburgh), Kim Hazelwood (Meta AI Research), Hugh Leather (Meta AI Research)
12:10 | 20m Talk | SLaDe: A Portable Small Language Model Decompiler for Optimized Assembly | Main Conference | Jordi Armengol-Estapé (University of Edinburgh), Jackson Woodruff (University of Edinburgh), Chris Cummins (Meta AI Research), Michael F. P. O'Boyle (University of Edinburgh)
12:30 | 20m Talk | TapeFlow: Streaming Gradient Tapes in Automatic Differentiation | Main Conference