Program Decomposition and Translation with Static Analysis
Thu 18 Apr 2024 11:36 - 11:48 at Vianna da Motta - SRC Presentations Chair(s): Mattia Fazzini, André Restivo
The rising popularity of Large Language Models (LLMs) has motivated exploring their use in code-related tasks. Code LLMs with millions of parameters or more are trained on massive amounts of code in different Programming Languages (PLs). Such models are used to automate various Software Engineering (SE) tasks through prompt engineering. However, given the very large size of industry-scale project files, a major issue with these LLMs is their limited context window size, which raises the question: “Can these LLMs process very large files, and can we effectively perform prompt engineering?” Code translation aims to convert source code from one PL to another. In this work, we assess the effect of method-level program decomposition on the context window of LLMs and investigate how this approach can enable translation of very large files that originally could not be translated due to out-of-context issues. Our observations from 20 well-known Java projects and approximately 60K methods suggest that method-level program decomposition significantly alleviates the limited context window problem of LLMs, by 99.5%. Furthermore, our empirical analysis indicates that with method-level decomposition, each input fragment consumes on average only 5% of the context window, leaving more context space for prompt engineering and the output. Finally, we investigate the effectiveness of a Call Graph (CG) approach for translating very large files under method-level program decomposition.
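As a rough illustration of the two ideas in the abstract, the sketch below orders method fragments by a call graph and estimates how much of a context window each fragment would occupy. It is not the author's implementation: the abstract does not say how the call graph is used, so translating callees before callers is an assumption, as are the names `METHODS`, `estimate_tokens`, `translation_order`, and `window_share`, the toy Java snippets, and the whitespace-based token count.

```python
from graphlib import TopologicalSorter

# Hypothetical input: method name -> (Java source text, names of methods it calls).
# In practice this would come from a static analysis of the project.
METHODS = {
    "main": ("void main() { int x = parse(); print(x); }", {"parse", "print"}),
    "parse": ("int parse() { return read() + 1; }", {"read"}),
    "read": ("int read() { return 41; }", set()),
    "print": ("void print(int x) { System.out.println(x); }", set()),
}


def estimate_tokens(source: str) -> int:
    """Crude whitespace-based token count; a real LLM tokenizer would differ."""
    return len(source.split())


def translation_order(methods):
    """Return method names so that every callee precedes its callers.

    Assumes the call graph is acyclic; recursive or mutually recursive
    methods would make TopologicalSorter raise graphlib.CycleError.
    """
    graph = {name: callees for name, (_, callees) in methods.items()}
    return list(TopologicalSorter(graph).static_order())


def window_share(methods, context_window: int):
    """Fraction of the context window each method fragment would occupy."""
    return {
        name: estimate_tokens(source) / context_window
        for name, (source, _) in methods.items()
    }


if __name__ == "__main__":
    for name in translation_order(METHODS):
        share = window_share(METHODS, context_window=100)[name]
        print(f"{name}: {share:.0%} of a 100-token window")
```

Ordering callees first means that, when a caller is translated, the already translated signatures of its callees could be included in the prompt. Whether that matches the approach evaluated in the poster is not stated in the abstract.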
Poster (icse24_poster.pdf) | 1.51 MiB
Ali Reza Ibrahimzada is a PhD student at the University of Illinois Urbana-Champaign (UIUC). He is also a member of the Intelligent CAT Lab, which is part of the Programming Languages, Formal Methods, and Software Engineering (PLFMSE) group at UIUC. Ali Reza is mostly interested in using Deep Neural Networks for Software and Program Analysis.
Wed 17 Apr (displayed time zone: Lisbon)
16:00 - 17:30 | SRC Posters | SRC - ACM Student Research Competition at Open Space | Chair(s): Mattia Fazzini (University of Minnesota), André Restivo (LIACC, Universidade do Porto, Porto, Portugal)
16:00 | 90m | Poster | Program Decomposition and Translation with Static Analysis | SRC - ACM Student Research Competition | Ali Reza Ibrahimzada (University of Illinois Urbana-Champaign)
16:00 | 90m | Poster | IntTracer: Sanitization-aware IO2BO Vulnerability Detection across Codebases | SRC - ACM Student Research Competition | Xiang Chen (Shanghai Jiao Tong University)
16:00 | 90m | Poster | Vulnerability Root Cause Function Locating For Java Vulnerabilities | SRC - ACM Student Research Competition | Lyuye Zhang (Nanyang Technological University)
16:00 | 90m | Poster | Flakiness Repair in the Era of Large Language Models | SRC - ACM Student Research Competition | Yang Chen (University of Illinois at Urbana-Champaign)
16:00 | 90m | Poster | Refining Abstract Specifications into Dangerous Traffic Scenarios | SRC - ACM Student Research Competition | Aren Babikian (McGill University)
16:00 | 90m | Poster | An Ensemble Method for Bug Triaging using Large Language Models | SRC - ACM Student Research Competition | Atish Kumar Dipongkor (University of Central Florida)
16:00 | 90m | Poster | Classifying Source Code: How Far Can Compressor-based Classifiers Go? | SRC - ACM Student Research Competition | Zhou Yang (Singapore Management University)
Thu 18 Apr (displayed time zone: Lisbon)
11:00 - 12:30 | SRC Presentations | SRC - ACM Student Research Competition at Vianna da Motta | Chair(s): Mattia Fazzini (University of Minnesota), André Restivo (LIACC, Universidade do Porto, Porto, Portugal)
11:00 | 12m | Poster | An Ensemble Method for Bug Triaging using Large Language Models | SRC - ACM Student Research Competition | Atish Kumar Dipongkor (University of Central Florida)
11:12 | 12m | Poster | Classifying Source Code: How Far Can Compressor-based Classifiers Go? | SRC - ACM Student Research Competition | Zhou Yang (Singapore Management University)
11:24 | 12m | Poster | Flakiness Repair in the Era of Large Language Models | SRC - ACM Student Research Competition | Yang Chen (University of Illinois at Urbana-Champaign)
11:36 | 12m | Poster | Program Decomposition and Translation with Static Analysis | SRC - ACM Student Research Competition | Ali Reza Ibrahimzada (University of Illinois Urbana-Champaign)
11:48 | 12m | Poster | Refining Abstract Specifications into Dangerous Traffic Scenarios | SRC - ACM Student Research Competition | Aren Babikian (McGill University)