Empirical Studies in AI-driven Software Engineering (AI4SE) (ESEIW 2024 - ISERN)

Who

Andreas Jedlitschka, Sira Vegas, Silverio Martínez-Fernández

Track

ESEIW 2024 ISERN

Time Zone

The program is currently displayed in (GMT+02:00) Brussels, Copenhagen, Madrid, Paris.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Tue 22 Oct 2024 16:00 - 17:30 at Agora (in front of Plaça Telecos) - Session 10

Abstract

Abstract: AI, and more recently LLMs in particular, have found their way into all phases of the software development process. But little is known about what it means for human and AI agents to “work” closely together on a development task. For example, when LLMs generate code, what does that mean for the quality assurance phase? The plan for continuing the work beyond ISERN includes developing recommendations, because we have already seen several (not to say many) papers with some kind of “evaluation”.

Therefore, our main question is: what does this mean for empirical software engineering? While we do not foresee a fundamental shift in the application or methodology of empirical methods (such as experiments), we recognize the need to consider new aspects when designing our studies. For example, should we consider AI agents simply as a type of subject in our studies? How do we compare their “performance” to human “performance”?

There are also many more questions, such as how to evaluate the interaction between humans and AI agents.

Session Goals: To generate a common opinion on how to approach studies in the context of AI-driven/-supported software engineering tasks.

Development of the Session:

We plan for a 90-minute session.
After a very brief introduction to the topic, we will work in groups, one per empirical method and, depending on the number of attendees, per phase of the development process (e.g., testing). The tasks to be conducted during the group work will be:
- Agree on one specific study (and the empirical method to be used).
- Sketch the design of such a study.
- Elaborate on the challenges identified during the design, mainly the questions the group discussed concerning the AI part.
- We (the organizers) will collect all challenges and cluster them.
- Each group gets a cluster to work out solution proposals.

Background and Recommended Reading: Knowledge about the design of empirical studies.

Expected Outcomes and Plan for Continuing the Work beyond ISERN:

The optimal outcome would be an action plan, i.e., after the workshop, we would have a common opinion on how to tackle studies in the context of AI-driven/-supported engineering tasks.
The plan for continuing the work beyond ISERN includes developing recommendations.

Andreas Jedlitschka (Co-chair)

Fraunhofer IESE

Germany

Sira Vegas (Co-chair)

Universidad Politécnica de Madrid

Spain

Silverio Martínez-Fernández (Co-chair)

UPC-BarcelonaTech