Automatic Generation of Test Cases based on Bug Reports: a Feasibility Study with Large Language Models (ICSE 2024 - Posters)

Fri 12 - Sun 21 April 2024 Lisbon, Portugal

Who

Laura Plein, Wendkuuni Arzouma Marc Christian OUEDRAOGO, Jacques Klein, Tegawendé F. Bissyandé

Track

ICSE 2024 Posters

Time Zone

The program is currently displayed in (GMT+01:00) Lisbon.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 18 Apr 2024 15:30 - 16:00 at Open Space - Posters 4

Abstract

Software testing is a core discipline in software engineering where a large array of research results has been produced, notably in automatic test generation. However, the resulting test suites are often incomplete and can be qualified as simple (e.g., unit tests): they only cover parts of the project, or they are produced after the bug is fixed and therefore can only serve as regression tests. Yet, several research challenges, such as automatic program repair, build on the assumption that available test suites are sufficient. There is thus a need to break existing barriers in automatic test case generation. While prior work largely focused on random unit testing inputs, we propose to consider generating test cases that realistically represent complex user execution scenarios, which reveal buggy behaviour. Such scenarios are informally described in bug reports, which should therefore be considered as natural inputs for specifying bug-triggering test cases. In this work, we investigate the feasibility of performing this generation by leveraging large language models (LLMs) and using bug reports as inputs. Our experiments consider various settings, including the use of ChatGPT, as an online service for accessing an LLM, as well as CodeGPT, an existing code-related pre-trained LLM that was fine-tuned for our task. Our study is carried out on the Defects4J dataset. Overall, we experimentally show that bug reports associated with up to 50% of Defects4J bugs can prompt ChatGPT to generate an executable test case. We show that even new bug reports (i.e., previously unseen data, used to mitigate the data leakage threat to validity) can indeed be used as input for generating executable test cases.

Laura Plein

University of Luxembourg

Wendkuuni Arzouma Marc Christian OUEDRAOGO

University of Luxembourg

Luxembourg

Jacques Klein

University of Luxembourg

Luxembourg

Tegawendé F. Bissyandé

University of Luxembourg

Luxembourg

Session Program

Thu 18 Apr
Displayed time zone: Lisbon

15:30 - 16:00	Posters 4 (Posters) at Open Space

15:30 30m Poster		Towards Data Augmentation for Supervised Code Translation Posters Binger Chen Technische Universität Berlin, Jacek Golebiowski Amazon AWS, Ziawasch Abedjan Leibniz Universität Hannover
15:30 30m Poster		GDPR indications in commits messages in GitHub repositories Posters Georgia Kapitsaki University of Cyprus, Maria Papoutsoglou University of Cyprus
15:30 30m Poster		Automatic Generation of Test Cases based on Bug Reports: a Feasibility Study with Large Language Models Posters Laura Plein University of Luxembourg, Wendkuuni Arzouma Marc Christian OUEDRAOGO University of Luxembourg, Jacques Klein University of Luxembourg, Tegawendé F. Bissyandé University of Luxembourg
15:30 30m Poster		How Does Pre-trained Language Model Perform on Deep Learning Framework Bug Prediction? Posters Xiaoting Du Beijing University of Posts and Telecommunications, Chenglong Li Beihang University, Xiangyue Ma Beihang University, Zheng Zheng Beihang University
15:30 30m Poster		xNose: A Test Smell Detector for C# Posters Partha Protim Paul Shahjalal University of Science & Technology, Md Tonoy Akanda Shahjalal University of Science & Technology, Mohammed Raihan Ullah Shahjalal University of Science & Technology, Dipto Mondal Shahjalal University of Science & Technology, Nazia Sultana Chowdhury Shahjalal University of Science & Technology, Fazle Mohammed Tawsif University of Southern California
15:30 30m Poster		Data vs. Model Machine Learning Fairness Testing: An Empirical Study Posters Arumoy Shome Delft University of Technology, Luís Cruz Delft University of Technology, Arie van Deursen Delft University of Technology
15:30 30m Poster		On the Effects of Program Slicing for Vulnerability Detection during Code Inspection: Extended Abstract Posters Aurora Papotti Vrije Universiteit Amsterdam, Fabio Massacci University of Trento; Vrije Universiteit Amsterdam, Katja Tuma Vrije Universiteit Amsterdam
15:30 30m Poster		Multi-step Automated Generation of Parameter Docstrings in Python: An Exploratory Study Posters Vatsal Venkatkrishna Australian National University, Durga Shree Nagabushanam Australian National University, Emmanuel Iko-Ojo Simon Australian National University, Melina Vidoni Australian National University
15:30 30m Poster		Lightweight Semantic Conflict Detection with Static Analysis Posters Galileu Santos de Jesus Federal University of Pernambuco, Paulo Borba Federal University of Pernambuco, Rodrigo Bonifácio Computer Science Department - University of Brasília, Matheus Barbosa de Oliveira Federal University of Pernambuco
15:30 30m Poster		Energy Consumption of Automated Program Repair Posters Matias Martinez Universitat Politècnica de Catalunya (UPC), Silverio Martínez-Fernández UPC-BarcelonaTech, Xavier Franch Universitat Politècnica de Catalunya
15:30 30m Poster		ReviewRanker: A Semi-Supervised Learning Based Approach for Code Review Quality Estimation Posters Saifullah Mahbub United International University, Md. Easin Arafat Eötvös Loránd University, Chowdhury Rafeed Rahman National University of Singapore, Zannatul Ferdows United International University, Masum Hasan University of Rochester
15:30 30m Poster		LogPrompt: Prompt Engineering Towards Zero-Shot and Interpretable Log Analysis Posters Yilun Liu Huawei co. LTD, Shimin Tao University of Science and Technology of China; Huawei co. LTD, Weibin Meng Huawei co. LTD, Feiyu Yao Huawei co. LTD, Xiaofeng Zhao Huawei co. LTD, Hao Yang Huawei co. LTD
15:30 30m Poster		High-precision Online Log Parsing with Large Language Models Posters XiaoLei Chen Fudan University, Jie Shi Fudan University, ChenJ, Peng Wang Fudan University, Wei Wang Fudan University
15:30 30m Poster		Multi-requirement Parametric Falsification Posters Matteo Camilli Politecnico di Milano, Raffaela Mirandola Karlsruhe Institute of Technology (KIT)