Using GitHub Copilot for Test Generation in Python: An Empirical Study
Writing unit tests is a crucial task in software development, but it is also recognized as time-consuming and tedious. As such, numerous test generation approaches have been proposed and investigated. However, most of these test generation tools produce tests that are typically difficult to understand. Recently, Large Language Models (LLMs) have shown promising results in generating source code and supporting software engineering tasks. We therefore investigate the usability of tests generated by GitHub Copilot, a proprietary closed-source code generation tool that uses an LLM. We evaluate GitHub Copilot’s test generation abilities both within and without an existing test suite, and we study the impact of different code commenting strategies on test generation. Our investigation evaluates the usability of 290 tests generated by GitHub Copilot for 53 sampled tests from open source projects. Our findings highlight that, within an existing test suite, 45.28% of the tests generated by Copilot pass, while 54.72% are failing, broken, or empty. Furthermore, when we generate tests using Copilot without an existing test suite in place, 92.45% of the tests are failing, broken, or empty. Additionally, we study how test method comments influence the usability of the generated tests.
Mon 15 Apr | Displayed time zone: Lisbon
14:00 - 15:30 | Session 2: Test Generation | AST 2024 at Amália Rodrigues | Chair(s): Sarmad Bashir (RISE Research Institutes of Sweden)
14:00 | 20m | Full-paper | Using GitHub Copilot for Test Generation in Python: An Empirical Study | AST 2024 | Khalid El Haji (Delft University of Technology), Carolin Brandt (Delft University of Technology), Andy Zaidman (Delft University of Technology)
14:20 | 20m | Full-paper | Grammar-Based Action Selection Rules for Scriptless Testing | AST 2024 | Lianne V. Hufkens (Open Universiteit), Fernando Pastor Ricós (Universitat Politècnica de València), Beatriz Marín (Universitat Politècnica de València), Tanja E. J. Vos (Universitat Politècnica de València and Open Universiteit)
14:40 | 20m | Full-paper | Fences: Systematic Sample Generation for JSON Schemas using Boolean Algebra and Flow Graphs | AST 2024 | Björn Otto (Institute for Automation and Communication, OVGU Magdeburg), Tobias Kleinert (Chair of Information and Automation Systems for Process and Material Technology, RWTH Aachen)
15:00 | 10m | Poster | Generating Software Tests for Mobile Applications Using Fine-Tuned Large Language Models | AST 2024 | Jacob Hoffmann (Institute AIFB, Karlsruhe Institute of Technology (KIT)), Demian Frister (Institute of Applied Informatics and Formal Description Methods (AIFB), Karlsruhe Institute of Technology (KIT))