Using GitHub Copilot for Test Generation in Python: An Empirical Study (AST 2024)

Who

Khalid El Haji, Carolin Brandt, Andy Zaidman

Track

AST 2024

When

Mon 15 Apr 2024 14:00 - 14:20 at Amália Rodrigues - Session 2: Test Generation Chair(s): Sarmad Bashir

Abstract

Writing unit tests is a crucial task in software development, but it is also recognized as a time-consuming and tedious task. As such, numerous test generation approaches have been proposed and investigated. However, most of these test generation tools produce tests that are typically difficult to understand. Recently, Large Language Models (LLMs) have shown promising results in generating source code and supporting software engineering tasks. As such, we investigate the usability of tests generated by GitHub Copilot, a proprietary closed-source code generation tool that uses an LLM. We evaluate GitHub Copilot’s test generation abilities both within and without an existing test suite, and we study the impact of different code commenting strategies on test generations. Our investigation evaluates the usability of 290 tests generated by GitHub Copilot for 53 sampled tests from open source projects. Our findings highlight that within an existing test suite, approximately 45.28% of the tests generated by Copilot are passing tests; 54.72% of generated tests are failing, broken, or empty tests. Furthermore, if we generate tests using Copilot without an existing test suite in place, we observe that 92.45% of the tests are failing, broken, or empty tests. Additionally, we study how test method comments influence the usability of test generations.

Link to Preprint

https://carolin-brandt.de/publications/elhaji-ast24.pdf

DOI

https://doi.org/10.1145/3644032.3644443

Khalid El Haji

Delft University of Technology

Netherlands

Carolin Brandt

Delft University of Technology

Netherlands

Andy Zaidman

Delft University of Technology

Netherlands

Session Program

Mon 15 Apr
Displayed time zone: Lisbon (GMT+01:00)

14:00 - 15:30	Session 2: Test Generation (AST 2024) at Amália Rodrigues. Chair(s): Sarmad Bashir (RISE Research Institutes of Sweden)

14:00	20m	Full-paper	Using GitHub Copilot for Test Generation in Python: An Empirical Study (AST 2024). Khalid El Haji (Delft University of Technology); Carolin Brandt (Delft University of Technology); Andy Zaidman (Delft University of Technology)
14:20	20m	Full-paper	Grammar-Based Action Selection Rules for Scriptless Testing (AST 2024). Lianne V. Hufkens (Open Universiteit); Fernando Pastor Ricós (Universitat Politècnica de València); Beatriz Marín (Universitat Politècnica de València); Tanja E. J. Vos (Universitat Politècnica de València and Open Universiteit)
14:40	20m	Full-paper	Fences: Systematic Sample Generation for JSON Schemas using Boolean Algebra and Flow Graphs (AST 2024). Björn Otto (Institute for Automation and Communication, OVGU Magdeburg); Tobias Kleinert (Chair of Information and Automation Systems for Process and Material Technology, RWTH Aachen)
15:00	10m	Poster	Generating Software Tests for Mobile Applications Using Fine-Tuned Large Language Models (AST 2024). Jacob Hoffmann (Institute AIFB, Karlsruhe Institute of Technology (KIT)); Demian Frister (Institute of Applied Informatics and Formal Description Methods (AIFB), Karlsruhe Institute of Technology (KIT))