Don't Settle for the First! How Many GitHub Copilot Solutions Should You Check? (ICSME 2025 - Journal First Track) - ICSME 2025 - International Conference on Software Maintenance and Evolution

Who

Julian Oertel, Jil Klünder, Regina Hebig

Track

ICSME 2025 Journal First Track

Time Zone

The program is currently displayed in (GMT+12:00) Auckland, Wellington.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 11 Sep 2025 16:30 - 16:45 at Case Room 3 260-055 - Session 11 - Human Factors 1 Chair(s): Gregorio Robles, Alexander Serebrenik

Abstract

Context. With the integration of generative artificial intelligence (GenAI) tools such as GitHub Copilot into development processes, developers can be supported when writing code.

Objectives. Because GitHub Copilot can provide up to ten solutions at once, we explore how developers should approach those solutions, with the goal of providing recommendations that achieve suitable trade-offs between finding correct solutions and checking solutions.

Methods. In this study, we analyze a total of 2025 coding problems provided by LeetCode and 17,048 Python solutions to these problems generated by GitHub Copilot. We focus on three key issues: first, whether it is beneficial to consider multiple solutions; second, the impact of the position of a solution; and third, the number of solutions that a developer should check.

Results. Overall, our results point to the following observations: (1) solutions are not less likely to be correct if they appear at later positions; (2) when looking for a solution to a common problem, checking four to five solutions is generally enough; (3) novel or difficult problems are unlikely to be solved by GitHub Copilot; (4) skipping the first solution is advised when considering only one solution, as the first solution is less likely to be correct; and (5) checking all solutions is necessary to avoid missing correct solutions, but the effort is usually not justified.

Conclusion. Based on our study, we conclude that there is potential to better support developers. For instance, there are few cases where ten generated solutions provide more value than fewer solutions. Depending on the use scenario, it could be more useful if GitHub Copilot allowed developers to request a single, comprehensive solution.
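As an illustration only (this is not the authors' analysis code), the trade-off described in the abstract can be sketched in Python. The sketch assumes each problem's generated solutions are recorded as an ordered list of pass/fail flags; the function names `solved_within_k` and `correctness_by_position` and the toy data are hypothetical and carry no results from the study.

```python
from typing import List, Sequence


def solved_within_k(results: Sequence[Sequence[bool]], k: int) -> float:
    """Fraction of problems with at least one correct solution among the first k."""
    if not results:
        return 0.0
    hits = sum(1 for per_problem in results if any(per_problem[:k]))
    return hits / len(results)


def correctness_by_position(results: Sequence[Sequence[bool]]) -> List[float]:
    """Share of correct solutions at each position (0-based).

    Only problems that actually have a solution at a given position
    are counted for that position.
    """
    max_len = max((len(r) for r in results), default=0)
    rates = []
    for pos in range(max_len):
        at_pos = [r[pos] for r in results if len(r) > pos]
        rates.append(sum(at_pos) / len(at_pos))
    return rates


# Made-up toy data (NOT from the study): one inner list per problem,
# one flag per generated solution, in the order the tool listed them.
toy = [
    [False, True, True],
    [False, False, False, False],
    [True, True],
    [False, False, True, False, True],
]

for k in (1, 2, 3):
    print(f"checked first {k}: {solved_within_k(toy, k):.2f} of problems solved")
print("correct share by position:", correctness_by_position(toy))
```

Plotting `solved_within_k` against `k` for real data would show where the curve flattens, which is one way to read a recommendation such as "four to five solutions is generally enough".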

Julian Oertel

University of Rostock

Germany

Jil Klünder

University of Applied Sciences | FHDW Hannover

Germany

Regina Hebig

Universität Rostock, Rostock, Germany

Germany

Session Program

Thu 11 Sep
Displayed time zone: Auckland, Wellington

15:30 - 17:00	Session 11 - Human Factors 1 (Journal First Track / Research Papers Track) at Case Room 3 260-055. Chair(s): Gregorio Robles (Universidad Rey Juan Carlos), Alexander Serebrenik (Eindhoven University of Technology)

15:30 15m		Characterizing the System Evolution That is Proposed After a Software Incident (Research Papers Track). Matt Pope (Brigham Young University), Jonathan Sillito (Brigham Young University)
15:45 15m		Social Media Reactions to Open Source Promotions: AI-Powered GitHub Projects on Hacker News (Research Papers Track). Prachnachai Meakpaiboonwattana (Mahidol University), Warittha Tarntong (Mahidol University), Thai Mekratanavorakul (Mahidol University), Chaiyong Rakhitwetsagul (Mahidol University, Thailand), Pattaraporn Sangaroonsilp (Mahidol University), Raula Gaikovina Kula (The University of Osaka), Morakot Choetkiertikul (Mahidol University, Thailand), Kenichi Matsumoto (Nara Institute of Science and Technology), Thanwadee Sunetnanta (Mahidol University)
16:00 15m		Does Editing Improve Answer Quality on Stack Overflow? A Data-Driven Investigation (Research Papers Track). Saikat Mondal (University of Saskatchewan), Chanchal K. Roy (University of Saskatchewan)
16:15 15m		Accessibility Rank: A Machine Learning Approach for Prioritizing Accessibility User Feedback (Journal First Track). Xiaoqi Chai (Beihang University; work conducted at The University of Auckland), James Tizard (University of Auckland), Kelly Blincoe (University of Auckland)
16:30 15m		Don't Settle for the First! How Many GitHub Copilot Solutions Should You Check? (Journal First Track). Julian Oertel (University of Rostock), Jil Klünder (University of Applied Sciences | FHDW Hannover), Regina Hebig (Universität Rostock, Rostock, Germany)
16:45 15m		Adoption of Automated Software Engineering Tools and Techniques in Thailand (Journal First Track). Chaiyong Rakhitwetsagul (Mahidol University, Thailand), Jens Krinke (University College London), Morakot Choetkiertikul (Mahidol University, Thailand), Thanwadee Sunetnanta (Mahidol University), Federica Sarro (University College London)