An Empirical Study of Python Library Migration Using Large Language Models
This program is tentative and subject to change.
Library migration is the process of replacing one library with another library that provides similar functionality. Manual library migration is time-consuming and error-prone, as it requires developers to understand the APIs of both libraries, map them, and perform the necessary code transformations. Because of this difficulty, most existing automated techniques and tools stop at the API mapping stage or support only a limited set of code transformations. Large Language Models (LLMs), on the other hand, are good at generating and transforming code and at finding similar code, which are necessary upstream tasks for library migration. Such capabilities suggest that LLMs may be suitable for library migration. Accordingly, this paper investigates the effectiveness of LLMs for migration between Python libraries. We evaluate three LLMs, Llama 3.1, GPT-4o mini, and GPT-4o, on PyMigBench, where we perform 321 real-world library migrations that include 2,989 migration-related code changes. To measure correctness, we (1) compare the LLM’s migrated code with the developers’ migrated code in the benchmark and (2) run the unit tests available in the client repositories. We find that Llama 3.1, GPT-4o mini, and GPT-4o correctly migrate 89%, 89%, and 94% of the migration-related code changes, respectively. We also find that 36%, 52%, and 64% of the Llama 3.1, GPT-4o mini, and GPT-4o migrations, respectively, pass the same tests that passed in the developers’ migration. To ensure the LLMs are not reciting the migrations, we also evaluate them on 10 new repositories where the migration never happened. Overall, our results suggest that LLMs can be effective in migrating code between libraries, but we also identify some open challenges.
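To make the terms above concrete, here is a minimal sketch of what a single migration-related code change can look like. It is an illustration only, not an example from the benchmark: the library pair (`optparse` to `argparse`) is chosen because both ship with Python, and the function names `parse_with_optparse` and `parse_with_argparse` are hypothetical.

```python
import argparse
import optparse


def parse_with_optparse(argv):
    """Original client code, written against the source library (optparse)."""
    parser = optparse.OptionParser()
    parser.add_option("-n", "--name", dest="name", default="world")
    # optparse returns a (options, positional_args) tuple.
    options, _positional = parser.parse_args(argv)
    return options.name


def parse_with_argparse(argv):
    """Migrated client code, written against the target library (argparse)."""
    parser = argparse.ArgumentParser()
    # API mapping: OptionParser -> ArgumentParser, add_option -> add_argument.
    parser.add_argument("-n", "--name", dest="name", default="world")
    # Code transformation: argparse returns a single namespace, not a tuple.
    args = parser.parse_args(argv)
    return args.name


if __name__ == "__main__":
    # A unit test in the client repository would exercise the migrated code
    # the same way it exercised the original code.
    sample = ["--name", "Ada"]
    print(parse_with_optparse(sample) == parse_with_argparse(sample))
```

Even in this small case the migration involves both an API mapping (which call replaces which) and a code transformation (the changed return shape), which are the two steps the abstract describes.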
Tue 18 Nov (displayed time zone: Seoul)
14:00 - 15:30
14:00 | 10m | Talk | Enhancing LLMs with Staged Grouping and Dehallucination for Header File Decomposition | Research Papers | Yue Wang Peking University, Jiaxuan Sun Peking University, Yanzhen Zou Peking University, Bing Xie Peking University
14:10 | 10m | Research paper | Speculative Automated Refactoring of Imperative Deep Learning Programs to Graph Execution | Research Papers | Raffi Khatchadourian CUNY Hunter College, Tatiana Castro Vélez University of Puerto Rico, Rio Piedras Campus, Mehdi Bagherzadeh Oakland University, Nan Jia City University of New York (CUNY) Graduate Center, Anita Raja City University of New York (CUNY) Hunter College
14:20 | 10m | Talk | An Empirical Study of Python Library Migration Using Large Language Models | Research Papers | Mohayeminul Islam University of Alberta, Ajay Jha North Dakota State University, May Mahmoud New York University Abu Dhabi, Ildar Akhmetov Northeastern University, Sarah Nadi New York University Abu Dhabi
14:30 | 10m | Talk | Measuring the Impact of Predictive Models on the Software Project: A Cost, Service Time, and Risk Evaluation of a Metric-based Defect Severity Prediction Model | Journal-First Track | Umamaheswara Sharma B National Institute of Technology, Calicut, Ravichandra Sadam National Institute of Technology Warangal
14:40 | 10m | Talk | Demystifying the Evolution of Neural Networks with BOM Analysis: Insights from a Large-Scale Study of 55,997 GitHub Repositories | Research Papers | xiaoning ren, Yuhang Ye University of Science and Technology of China, Xiongfei Wu University of Luxembourg, Yueming Wu Huazhong University of Science and Technology, Yinxing Xue Institute of AI for Industries, Chinese Academy of Sciences
14:50 | 10m | Talk | Fact-Aligned and Template-Constrained Static Analyzer Rule Enhancement with LLMs | Research Papers | Zongze Jiang Huazhong University of Science and Technology, Ming Wen Huazhong University of Science and Technology, Ge Wen Huazhong University of Science and Technology, Hai Jin Huazhong University of Science and Technology
15:00 | 10m | Talk | MCTS-Refined CoT: High-Quality Fine-Tuning Data for LLM-Based Repository Issue Resolution | Research Papers | Yibo Wang Northeastern University, Zhihao Peng Northeastern University, Ying Wang Northeastern University, Zhao Wei Tencent, Hai Yu Northeastern University, China, Zhiliang Zhu Northeastern University, China
15:10 | 10m | Talk | Software Reconfiguration in Robotics | Journal-First Track | Patrizio Pelliccione Gran Sasso Science Institute, L'Aquila, Italy, Sven Peldszus IT University of Copenhagen, Davide Brugali University of Bergamo, Italy, Daniel Strüber Chalmers | University of Gothenburg / Radboud University, Thorsten Berger Ruhr University Bochum
15:20 | 10m | Talk | CROSS2OH: Enabling Seamless Porting of C/C++ Software Libraries to OpenHarmony | Research Papers | Qian Zhang University of California at Riverside, Li Tsz On The Hong Kong University of Science and Technology, Ying Wang Northeastern University, Li Li Beihang University, Shing-Chi Cheung Hong Kong University of Science and Technology