ChatGPT-Based Test Generation for Refactoring Engines Enhanced by Feature Analysis on Examples
Software refactoring is widely employed to improve software quality. However, conducting refactorings manually is tedious, time-consuming, and error-prone. Consequently, automated and semi-automated tool support for refactoring is highly desirable in industry, and most mainstream IDEs provide powerful refactoring support. However, complex refactoring engines are prone to errors, which in turn may result in imperfect or incorrect refactorings. To this end, in this paper, we propose a ChatGPT-based approach to testing refactoring engines. We first manually analyze bug reports and test cases associated with refactoring engines, and construct a feature library of fine-grained features that may trigger defects in refactoring engines. The approach automatically generates prompts from predefined prompt templates and features randomly selected from the feature library, requesting ChatGPT to generate test programs with the requested features. Test programs generated by ChatGPT are then forwarded to multiple refactoring engines for differential testing. To the best of our knowledge, this is the first approach to testing refactoring engines that guides test program generation with features derived from existing bugs, and the first in this line of work to exploit LLMs for test program generation. Our initial evaluation on four mainstream refactoring engines suggests that the proposed approach is effective. It identified a total of 115 previously unknown bugs in addition to 28 inconsistent refactoring behaviors among different engines. Of the 115 bugs, 78 have been manually confirmed by the original developers of the tested engines, i.e., IntelliJ IDEA, Eclipse, VScode-Java, and NetBeans.
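The pipeline the abstract describes (sample features from a library, fill a prompt template, hand the LLM-generated program to several engines, and flag disagreements) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature strings, template wording, and the lambda "engines" standing in for IntelliJ IDEA, Eclipse, etc. are all hypothetical.

```python
import random

# Hypothetical feature library distilled from refactoring-engine bug reports.
FEATURE_LIBRARY = [
    "an anonymous inner class capturing a local variable",
    "a generic method with a bounded wildcard parameter",
    "a lambda expression shadowing an enclosing field",
    "a nested try-with-resources statement",
]

PROMPT_TEMPLATE = (
    "Generate a compilable Java program that contains {features}. "
    "The program will be used to test the '{refactoring}' refactoring."
)

def build_prompt(refactoring, n_features=2, rng=None):
    """Fill the prompt template with randomly selected features."""
    rng = rng or random.Random(0)
    chosen = rng.sample(FEATURE_LIBRARY, n_features)
    return PROMPT_TEMPLATE.format(
        features=" and ".join(chosen), refactoring=refactoring)

def differential_test(program, engines):
    """Apply every engine to the same test program and group engines
    by the refactored output they produce.  More than one group means
    the engines disagree, which warrants manual triage."""
    groups = {}
    for name, engine in engines.items():
        groups.setdefault(engine(program), []).append(name)
    return groups

# Stand-in engines: two agree, one deviates on the same rename.
engines = {
    "engine_a": lambda p: p.replace("foo", "bar"),
    "engine_b": lambda p: p.replace("foo", "bar"),
    "engine_c": lambda p: p,  # silently fails to rename
}
result = differential_test("int foo = 1;", engines)
```

Here `result` has two output groups, exposing `engine_c` as inconsistent with the other two; in the real approach the program under test comes from ChatGPT rather than a literal string, and the disagreeing outputs are inspected to decide which engine is buggy.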
Fri 2 May (displayed time zone: Eastern Time, US & Canada)
11:00 - 12:30 | AI for SE 3 (New Ideas and Emerging Results (NIER) / Journal-first Papers / Research Track / SE In Practice (SEIP)) at Canada Hall 1 and 2. Chair(s): Ying Zou (Queen's University, Kingston, Ontario)
11:00 | 15m Talk | A First Look at Conventional Commits Classification (Research Track). Qunhong Zeng, Yuxia Zhang, Zhiqing Qiu, Hui Liu (Beijing Institute of Technology)
11:15 | 15m Talk | ChatGPT-Based Test Generation for Refactoring Engines Enhanced by Feature Analysis on Examples (Research Track). Chunhao Dong (Beijing Institute of Technology), Yanjie Jiang (Peking University), Yuxia Zhang (Beijing Institute of Technology), Yang Zhang (Hebei University of Science and Technology), Hui Liu (Beijing Institute of Technology)
11:30 | 15m Talk | SECRET: Towards Scalable and Efficient Code Retrieval via Segmented Deep Hashing (Research Track). Wenchao Gu (The Chinese University of Hong Kong), Ensheng Shi (Xi'an Jiaotong University), Yanlin Wang (Sun Yat-sen University), Lun Du (Microsoft Research), Shi Han (Microsoft Research), Hongyu Zhang (Chongqing University), Dongmei Zhang (Microsoft Research), Michael Lyu (The Chinese University of Hong Kong)
11:45 | 15m Talk | UniGenCoder: Merging Seq2Seq and Seq2Tree Paradigms for Unified Code Generation (New Ideas and Emerging Results (NIER)). Liangying Shao (School of Informatics, Xiamen University, China), Yanfu Yan (William & Mary), Denys Poshyvanyk (William & Mary), Jinsong Su (School of Informatics, Xiamen University, China)
12:00 | 15m Talk | How is Google using AI for internal code migrations? (SE In Practice (SEIP)). Stoyan Nikolov, Daniele Codecasa, Anna Sjovall, Maxim Tabachnyk, Siddharth Taneja, Celal Ziftci, Satish Chandra (Google, Inc.)
12:15 | 7m Talk | LLM-Based Test-Driven Interactive Code Generation: User Study and Empirical Evaluation (Journal-first Papers). Sarah Fakhoury (Microsoft Research), Aaditya Naik (University of Pennsylvania), Georgios Sakkas (University of California at San Diego), Saikat Chakraborty (Microsoft Research), Shuvendu K. Lahiri (Microsoft Research)
12:22 | 7m Talk | The Impact of Concept Drift and Data Leakage on Log Level Prediction Models (Journal-first Papers). Youssef Esseddiq Ouatiti (Queen's University), Mohammed Sayagh (ETS Montreal, University of Quebec), Noureddine Kerzazi (Ensias-Rabat), Bram Adams (Queen's University), Ahmed E. Hassan (Queen's University)