ESEIW 2025
Sun 28 September - Fri 3 October 2025
Wed 1 Oct 2025 02:15 - 03:00 at Online - Session IV

Abstract—[Background] Applications of generative artificial intelligence are being proposed at a rapid pace to support various software engineering tasks. Although versatile, these tools' performance depends on multiple factors that are not always visible to users, and the tools tend to camouflage their failure points (i.e., to "hallucinate"). In programming, the existing literature suggests that such tools shift the effort from writing code to reading, comprehending, evaluating, and repairing generated code. The tools can also enable outsourcing these efforts, even when doing so might be unwise.

[Goal] The broad aim of this research is to investigate the interaction between programmers and code generation tools in order to understand how the tools support the needs of software practitioners. The specific approach is to examine how, when, to what extent, and to what effect programmers read, comprehend, evaluate, critically reflect on, and rely on AI-generated code, with the goal of theory building.

[Research Methods] The research is expected to consist of in-depth qualitative analyses of realistic contexts. The methods include a practitioner survey and interviews, a scoping review, and observational field studies.

[Expected Contributions] The expected contributions include empirical evidence about how programmers interact with generative AI and evaluate generated code in diverse and realistic scenarios. This evidence will serve as a basis for theory building, with the broader goal of increasing our knowledge of the nature of the support that generative AI tools provide.

Wed 1 Oct

Displayed time zone: Hawaii

02:15 - 03:00 (45m)
Talk: How do programmers evaluate AI-generated code?
IDoESE - Doctoral Symposium
Samuli Määttä, University of Oulu