Abstract: AI, and more recently LLMs in particular, have found their way into all phases of the software development process. But little is known about what it means to have human and AI agents “working” closely together on a development task. For example, when LLMs generate code, what does that mean for the quality assurance phase? The plan for continuing the work beyond ISERN includes developing recommendations, because we have already seen several (not to say many) papers with some kind of “evaluation”.
Therefore, our main question is: what does this mean for empirical software engineering? While we do not foresee a fundamental shift in the application or methodology of empirical methods (such as experiments), we recognize the need to consider new aspects during the design of our studies. For example, should we consider the AI agents simply as a type of subject in our studies? How do we compare their “performance” to human “performance”?
There are also many more questions, such as how to evaluate the interaction between humans and AI agents.
Session Goals: To generate a common opinion on how to approach studies in the context of AI-driven/-supported software engineering tasks.
Development of the Session:
- We plan for a 90-minute session.
- After a very brief introduction to the topic, we will use group work for different empirical methods and, depending on the number of attendees, for different phases of the development process (e.g., testing). The tasks to be conducted during the group work will be:
  - Agree on one specific study (and the empirical method to be used).
  - Sketch the design of such a study.
  - Elaborate on the challenges identified during the design, mainly the questions the group discussed concerning the AI part.
- We (the organizers) will collect all challenges and cluster them.
- Each group gets a cluster for which to work out solution proposals.
Background and Recommended Reading: Knowledge about the design of empirical studies.
Expected Outcomes and Plan for Continuing the Work beyond ISERN:
- The optimal outcome would be an action plan, i.e., after the workshop, we have a common opinion on how we need to tackle studies in the context of AI-driven/-supported engineering tasks.
- The plan for continuing the work beyond ISERN includes developing recommendations.
Tue 22 Oct (displayed time zone: Brussels, Copenhagen, Madrid, Paris)
| 16:00 - 17:30 | 90m, Other | Empirical Studies in AI-driven Software Engineering (AI4SE), ISERN. C: Andreas Jedlitschka (Fraunhofer IESE), C: Sira Vegas (Universidad Politecnica de Madrid), C: Silverio Martínez-Fernández (UPC-BarcelonaTech) |


