A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How
API suggestion is a critical task in modern software development, assisting programmers by predicting and recommending third-party APIs based on the current context. Recent advancements in large code models (LCMs) have shown promise in the API suggestion task. However, they mainly focus on suggesting which APIs to use, overlooking that programmers may need more assistance when using APIs in practice, including when to use the suggested APIs and how to use them. To bridge this gap, we conduct a systematic evaluation of LCMs for the API suggestion task in this paper.
To facilitate our investigation, we first build a benchmark that contains a diverse collection of code snippets, covering 176 APIs used in 853 popular Java projects. Three distinct scenarios in the API suggestion task are then considered for evaluation: (1) "when to use", which aims at determining the desired position and timing of API usage; (2) "which to use", which aims at identifying the appropriate API from a given library; and (3) "how to use", which aims at predicting the arguments for a given API. Considering these three scenarios allows for a comprehensive assessment of LCMs' capabilities in suggesting APIs for developers. During the evaluation, we choose nine popular LCMs with varying model sizes for the three scenarios. We also perform an in-depth analysis of the influence of context selection on model performance. Our experimental results reveal several key findings. For instance, LCMs perform best in the "how to use" scenario and worst in the "when to use" scenario: the average performance gap between the "when to use" and "how to use" scenarios reaches 34%, indicating that the "when to use" scenario is more challenging. Furthermore, enriching the context information substantially improves model performance. Specifically, by incorporating the contexts, smaller LCMs can outperform models twenty times larger that are not provided with the contexts. Based on these findings, we finally provide insights and implications for researchers and developers, which can lay the groundwork for future advancements in the API suggestion task.
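To make the three scenarios concrete, below is a minimal Java sketch (not taken from the paper's benchmark; the Gson library, the Conference class, and the JSON string are hypothetical illustrations) showing where each kind of suggestion would be needed.

    // Minimal illustration of the three API suggestion scenarios.
    // Assumption: Gson and the Conference class are hypothetical examples,
    // not drawn from the paper's benchmark.
    import com.google.gson.Gson;

    public class ApiSuggestionScenarios {

        // Simple data class used as the deserialization target.
        static class Conference {
            String name;
            int year;
        }

        public static void main(String[] args) {
            String json = "{\"name\": \"ASE\", \"year\": 2024}";

            // (1) "When to use": decide that the next statement is the right
            //     position and timing to introduce a third-party API call.
            // (2) "Which to use": given the Gson library, pick the appropriate
            //     API, e.g., Gson#fromJson rather than Gson#toJson.
            // (3) "How to use": complete the arguments of the chosen call,
            //     here the JSON string and the target class literal.
            Gson gson = new Gson();
            Conference conf = gson.fromJson(json, Conference.class);

            System.out.println(conf.name + " " + conf.year);
        }
    }

Roughly speaking, each scenario asks the model to recover a different aspect of such a call site: its position and timing, the API name, or its arguments.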
Wed 30 Oct (displayed time zone: Pacific Time, US & Canada)
13:30 - 15:00 | LLM for SE 2 | NIER Track / Research Papers / Industry Showcase / Tool Demonstrations | Room: Camellia | Chair(s): Wenxi Wang (University of Virginia)
13:30 (15m) Talk | A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How | Research Papers
Chaozheng Wang (The Chinese University of Hong Kong), Shuzheng Gao (Chinese University of Hong Kong), Cuiyun Gao (Harbin Institute of Technology), Wenxuan Wang (Chinese University of Hong Kong), Chun Yong Chong (Huawei), Shan Gao (Huawei), Michael Lyu (The Chinese University of Hong Kong)
13:45 (15m) Talk | AutoDW: Automatic Data Wrangling Leveraging Large Language Models | Industry Showcase
Lei Liu (Fujitsu Laboratories of America, Inc.), So Hasegawa (Fujitsu Research of America Inc.), Shailaja Keyur Sampat (Fujitsu Research of America Inc.), Maria Xenochristou (Fujitsu Research of America Inc.), Wei-Peng Chen (Fujitsu Research of America, Inc.), Takashi Kato (Fujitsu Research), Taisei Kakibuchi (Fujitsu Research), Tatsuya Asai (Fujitsu Research)
14:00 (15m) Talk | Instructive Code Retriever: Learn from Large Language Model's Feedback for Code Intelligence Tasks | Research Papers
Jiawei Lu (Zhejiang University), Haoye Wang (Hangzhou City University), Zhongxin Liu (Zhejiang University), Keyu Liang (Zhejiang University), Lingfeng Bao (Zhejiang University), Xiaohu Yang (Zhejiang University)
14:15 (15m) Talk | WaDec: Decompile WebAssembly Using Large Language Model | Research Papers
Xinyu She (Huazhong University of Science and Technology), Yanjie Zhao (Huazhong University of Science and Technology), Haoyu Wang (Huazhong University of Science and Technology)
14:30 (10m) Talk | LLM4Workflow: An LLM-based Automated Workflow Model Generation Tool | Tool Demonstrations
14:40 (10m) Talk | GPTZoo: A Large-scale Dataset of GPTs for the Research Community | NIER Track
Xinyi Hou (Huazhong University of Science and Technology), Yanjie Zhao (Huazhong University of Science and Technology), Shenao Wang (Huazhong University of Science and Technology), Haoyu Wang (Huazhong University of Science and Technology)
14:50 (10m) Talk | Emergence of A Novel Domain Expert: A Generative AI-based Framework for Software Function Point Analysis | NIER Track