Internetware 2024
Wed 24 - Fri 26 July 2024 Macau, China

In empirical software engineering (EMSE), many activities require human participation, such as data collection, processing, analysis, and comprehension. These processes are time-consuming and labor-intensive, and human participation may also introduce bias. With the rise of large language models (LLMs) such as ChatGPT, the potential for these models to enhance productivity has become apparent. However, the capabilities and effectiveness of LLMs as assistants in EMSE tasks have rarely been explored. To fill this gap, in this paper we evaluate the performance of LLMs on scenarios of human participation in EMSE tasks, which we refer to as EMSEbench. We conduct replication experiments using four LLMs (ChatGPT4.0, ERNIE Bot4.0, Gemini3.0, and ChatGLM4.0), evaluating the difference in performance across seven scenarios collected from papers published in top SE venues. In the experiments, we use three types of prompts: zero-shot, one-shot, and optimized one-shot. In addition, we leverage the concept of a multi-agent workflow to explore the performance improvement and limitations of LLMs. Our study summarizes six findings, which facilitate the understanding of the auxiliary role of LLMs in EMSE tasks.

Wed 24 Jul

Displayed time zone: Beijing, Chongqing, Hong Kong, Urumqi

11:20 - 12:35
Session 1: AI for Software Engineering
Research Track / Tool Demonstration Track / New Idea Track at Main Conference Room
Chair(s): Yongqiang Tian The Hong Kong University of Science and Technology
11:20
15m
Full-paper
An Empirical Study on Code Search Pre-trained Models: Academic Progresses vs. Industry Requirements
Research Track
Kuo Chi, Chuanyi Li Nanjing University, Jidong Ge Nanjing University, Bin Luo Nanjing University
11:35
15m
Full-paper
CRABS-former: Cross-Architecture Binary Code Similarity Detection based on Transformer
Research Track
Yuhong Feng Shenzhen University, Haoran Li Shenzhen University, Yixuan Cao Shenzhen University, Yufeng Wang Shenzhen University, Haiyue Feng College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
11:50
15m
Full-paper
On the Heterophily of Program Graphs: A Case Study of Graph-based Type Inference
Research Track
Senrong Xu, Jiamei Shen, Yunfang Li, Yuan Yao Nanjing University, Ping Yu, Feng Xu Nanjing University, Xiaoxing Ma Nanjing University
12:05
15m
Full-paper
An Exploratory Evaluation of Large Language Models Using Empirical Software Engineering Tasks
Research Track
Wenjun Liang Nanjing University of Aeronautics and Astronautics, China, Guanping Xiao Nanjing University of Aeronautics and Astronautics
12:20
15m
Full-paper
LLM-Enhanced Theorem Proving with Term Explanation and Tactic Parameter Repair
Research Track
Xingpeng Liu, Hengzhu Liu, Xiaodong Yi, Ji Wang School of Computer, National University of Defense Technology, China