This program is tentative and subject to change.
Memory safety violations in low-level code, written in languages like C, remain one of the major sources of software vulnerabilities. One way to remove such violations by construction is to port the C code to a safe C dialect. Such dialects rely on programmer-supplied annotations to guarantee safety with minimal runtime overhead. This porting, however, is a manual process that imposes a significant burden on the programmer, and hence the technique has seen limited adoption.
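To make the annotation burden concrete, here is a minimal sketch in Checked C syntax (illustrative only, not taken from the paper; it compiles only with the checkedc-clang compiler): a raw pointer parameter becomes an _Array_ptr whose declared count bound lets the compiler verify or insert bounds checks.

    #include <stddef.h>

    // Plain C: nothing ties buf to n, so an out-of-bounds access goes undetected.
    int sum(int *buf, size_t n) {
      int total = 0;
      for (size_t i = 0; i < n; i++)
        total += buf[i];
      return total;
    }

    // Checked C port: the count(n) bound on buf lets the compiler check every
    // access buf[i] against n, with minimal runtime overhead.
    int sum_checked(_Array_ptr<int> buf : count(n), size_t n) {
      int total = 0;
      for (size_t i = 0; i < n; i++)
        total += buf[i];
      return total;
    }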
Porting not only requires inferring annotations, but may also require refactoring/rewriting the code to make it amenable to such annotations. In this paper, we use Large Language Models (LLMs) to address both of these concerns. We show how to harness LLM capabilities for complex code reasoning as well as for rewriting large codebases. We also present a novel framework for whole-program transformations that leverages lightweight static analysis to break the transformation into smaller steps that an LLM can carry out effectively. We implement our ideas in a tool called MSA that targets the Checked C dialect. We evaluate MSA on several micro-benchmarks as well as on real-world code of up to 20K lines. We show superior performance compared to a vanilla LLM baseline and demonstrate improvement over a state-of-the-art symbolic (non-LLM) technique.
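As one illustration (ours, not taken from the paper) of the kind of refactoring that can make code easier to annotate: a loop that walks a raw pointer leaves the buffer's extent implicit in the caller's convention, whereas a rewrite to an explicit count and indexed accesses gives the compiler a bound it can check.

    #include <stddef.h>

    // Original C: the loop advances a raw pointer and the buffer's extent is
    // implicit in the (begin, end) convention chosen by the caller.
    void reset(int *buf, int *end) {
      for (int *p = buf; p < end; p++)
        *p = 0;
    }

    // One possible refactoring plus annotation (Checked C): switching to an
    // explicit count and indexed accesses exposes a bound that the
    // checkedc-clang compiler can verify at each access.
    void reset_checked(_Array_ptr<int> buf : count(n), size_t n) {
      for (size_t i = 0; i < n; i++)
        buf[i] = 0;
    }

Because such a signature change ripples to every caller, the rewrite is inherently a whole-program transformation, which is what the static-analysis-guided decomposition described above is meant to keep tractable.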
Thu 1 May (displayed time zone: Eastern Time (US & Canada))
14:00 - 15:30

14:00 15m Talk | Large Language Models as Configuration Validators (Research Track)
Xinyu Lian (University of Illinois at Urbana-Champaign), Yinfang Chen (University of Illinois at Urbana-Champaign), Runxiang Cheng (University of Illinois at Urbana-Champaign), Jie Huang (University of Illinois at Urbana-Champaign), Parth Thakkar (Meta Platforms, Inc.), Minjia Zhang (UIUC), Tianyin Xu (University of Illinois at Urbana-Champaign)

14:15 15m Talk | LLM Assistance for Memory Safety (Research Track)
Nausheen Mohammed (Microsoft Research), Akash Lal (Microsoft Research), Aseem Rastogi (Microsoft Research), Subhajit Roy (IIT Kanpur), Rahul Sharma (Microsoft Research)

14:30 15m Talk | Vulnerability Detection with Code Language Models: How Far Are We? (Research Track)
Yangruibo Ding (Columbia University), Yanjun Fu (University of Maryland), Omniyyah Ibrahim (King Abdulaziz City for Science and Technology), Chawin Sitawarin (University of California, Berkeley), Xinyun Chen, Basel Alomair (King Abdulaziz City for Science and Technology), David Wagner (UC Berkeley), Baishakhi Ray (Columbia University, New York), Yizheng Chen (University of Maryland)

14:45 15m Talk | Combining Fine-Tuning and LLM-based Agents for Intuitive Smart Contract Auditing with Justifications (Research Track)
Wei Ma, Daoyuan Wu (The Hong Kong University of Science and Technology), Yuqiang Sun (Nanyang Technological University), Tianwen Wang (National University of Singapore), Shangqing Liu (Nanyang Technological University), Jian Zhang (Nanyang Technological University), Yue Xue, Yang Liu (Nanyang Technological University)

15:00 15m Talk | Towards Neural Synthesis for SMT-assisted Proof-Oriented Programming (Research Track, Award Winner)
Saikat Chakraborty (Microsoft Research), Gabriel Ebner (Microsoft Research), Siddharth Bhat (University of Cambridge), Sarah Fakhoury (Microsoft Research), Sakina Fatima (University of Ottawa), Shuvendu K. Lahiri (Microsoft Research), Nikhil Swamy (Microsoft Research)

15:15 15m Talk | Prompt-to-SQL Injections in LLM-Integrated Web Applications: Risks and Defenses (Research Track)
Rodrigo Resendes Pedro (INESC-ID / IST, Universidade de Lisboa), Miguel E. Coimbra (INESC-ID / IST, Universidade de Lisboa), Daniel Castro (INESC-ID / IST, Universidade de Lisboa), Paulo Carreira (INESC-ID / IST, Universidade de Lisboa), Nuno Santos (INESC-ID / Instituto Superior Tecnico, University of Lisbon)