Despite its extensive use in high-performance, low-level systems programming, including the Linux kernel, C is susceptible to vulnerabilities, primarily because of manual memory management and unsafe pointer operations. Rust, a modern systems programming language, offers a compelling alternative: its ownership model and type system ensure memory safety without sacrificing performance. We present an automated approach to translating C to safe Rust. Our technique combines LLM-driven code generation with execution information produced by dynamic analysis, which guides the generation. The approach yields novel insights into scaling, testing, and combining the strengths of LLMs and dynamic analysis. We apply it to successfully translate Zopfli, a high-performance compression library with ∼3000 LoC and 98 functions, and validate the translation by testing its equivalence with the source C program on a set of inputs. To our knowledge, this is the largest automated and test-validated translation of C to safe Rust achieved so far.
Jan Kels Heinrich-Heine-Universität Düsseldorf, Abdelhalim Dahou GESIS – Leibniz-Institute for the Social Sciences, Brigitte Mathiak GESIS – Leibniz-Institute for the Social Sciences
Shihao Xia The Pennsylvania State University, Mengting He The Pennsylvania State University, Linhai Song The Pennsylvania State University, Yiying Zhang University of California San Diego