Codellm-Devkit: A Framework for Contextualizing Code LLMs with Program Analysis Insights
Large Language Models for Code (code LLMs) are growing in popularity and capability, supporting tasks such as code completion, code generation, code summarization, test generation, and code translation. To leverage code LLMs to their full potential, developers must provide the models with code-specific contextual information, which is typically derived and distilled using program analysis tools. However, there is a significant gap: these static analysis tools are often language-specific and come with a steep learning curve, making them hard to use effectively. Because each tool is tailored to a specific programming language, developers must learn and manage multiple tools to cover the various aspects of their code base. Moreover, the complexity of configuring and integrating these tools into existing development environments adds another layer of difficulty. This challenge limits the benefits that could be gained from more widespread and effective use of static analysis in conjunction with LLMs.
To address this challenge, we present CodeLLM-Devkit (hereafter, CLDK), an open-source library that simplifies program analysis at various levels of granularity and across different programming languages in support of code LLM use cases. As a Python library, CLDK offers developers an intuitive interface for providing rich program analysis context to code LLMs. With this library, developers can integrate detailed, code-specific insights that enhance the efficiency and effectiveness of LLMs in coding tasks. CLDK is available at https://github.com/IBM/codellm-devkit.
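As a rough illustration of what "providing program analysis context to a code LLM" can mean in practice, the sketch below uses only Python's standard-library `ast` module to extract function signatures from a source string and fold them into a prompt. It does not use CLDK and does not reflect CLDK's API; the helper names `summarize_module` and `build_prompt`, the prompt layout, and the sample source are all hypothetical and chosen only for this example.

```python
import ast


def summarize_module(source: str) -> list[str]:
    """Return one summary line per function definition found in `source`."""
    tree = ast.parse(source)
    lines = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Collect positional parameter names only (a deliberate simplification).
            args = ", ".join(a.arg for a in node.args.args)
            doc = ast.get_docstring(node) or "no docstring"
            lines.append(f"{node.name}({args}): {doc.splitlines()[0]}")
    return lines


def build_prompt(task: str, source: str) -> str:
    """Prepend the extracted signatures to a task description (hypothetical layout)."""
    context = "\n".join(f"- {line}" for line in summarize_module(source))
    return f"Available functions:\n{context}\n\nTask: {task}"


# Hypothetical sample input, not taken from any real project.
EXAMPLE = '''
def add(a, b):
    """Return the sum of a and b."""
    return a + b

def scale(values, factor):
    return [v * factor for v in values]
'''

prompt = build_prompt("Write a unit test for scale.", EXAMPLE)
print(prompt)
```

A single-language, syntax-only extraction like this is the simple end of the spectrum; the abstract's point is that richer, multi-language analyses normally require separate tools, which is the gap CLDK aims to close.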
Wed 25 Jun. Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
11:00 - 12:20 | Program Analysis 3 | Research Papers / Demonstrations / Industry Papers | at Cosmos 3D | Chair(s): Earl T. Barr (University College London)
11:00 | 10m Talk | MITHRAS: A Dynamic Analysis Framework for the Mobile-IoT Ecosystem | Demonstrations | Francesco Pagano (University of Verona), Mariano Ceccato (University of Verona), Alessio Merlo (CASD - School of Advanced Defense Studies), Paolo Tonella (USI Lugano)
11:10 | 10m Talk | Refactoring Detection in C++ Programs with RefactoringMiner++ | Demonstrations | Benjamin Ritz (Graz University of Technology), Aleksandar Karakaš (Carinthia University of Applied Sciences), Denis Helic (Graz University of Technology)
11:20 | 20m Talk | Codellm-Devkit: A Framework for Contextualizing Code LLMs with Program Analysis Insights | Industry Papers | Rahul Krishna (IBM Research), Rangeet Pan (IBM Research), Saurabh Sinha (IBM Research), Srikanth Tamilselvam (IBM Research), Raju Pavuluri (IBM T.J. Watson Research Center), Maja Vukovic (IBM Research)
11:40 | 20m Talk | Towards Diverse Program Transformations for Program Simplification | Research Papers | Haibo Wang (Concordia University), Zezhong Xing (Southern University of Science and Technology), Chengnian Sun (University of Waterloo), Zheng Wang (University of Leeds), Shin Hwei Tan (Concordia University)
12:00 | 20m Talk | CRISPE: Semantic-Guided Execution Planning and Dynamic Reasoning for Enhancing Code Coverage Prediction | Research Papers | Hridya Dhulipala (University of Texas at Dallas), Aashish Yadavally (University of Texas at Dallas), Smit Soneshbhai Patel (University of Texas at Dallas), Tien N. Nguyen (University of Texas at Dallas)
Cosmos 3D is the fourth room in the Cosmos 3 wing.
When facing the main Cosmos Hall, access to the Cosmos 3 wing is on the left, close to the stairs. The area is accessed through a large door with the number “3”, which will stay open during the event.