ICSE 2025
Sat 26 April - Sun 4 May 2025, Ottawa, Ontario, Canada
Sun 27 Apr 2025 16:45 - 17:07 at 213 - Session 3: Evaluating and Improving Bot Impact Chair(s): Ahmad Abdellatif

As Large Language Models (LLMs) are increasingly adopted in software engineering, recently in the form of conversational assistants, ensuring these technologies align with developers’ needs is essential. The limitations of traditional human-centered methods for evaluating LLM-based tools at scale raise the need for automatic evaluation. In this paper, we advocate combining insights from human-computer interaction (HCI) and artificial intelligence (AI) research to enable human-centered automatic evaluation of LLM-based conversational SE assistants. We identify requirements for such evaluation and challenges down the road, working towards a framework that ensures these assistants are designed and deployed in line with user needs.

Sun 27 Apr

Displayed time zone: Eastern Time (US & Canada)

16:00 - 17:30
Session 3: Evaluating and Improving Bot Impact (BotSE) at 213
Chair(s): Ahmad Abdellatif University of Calgary

16:00
22m
Talk
Towards a Newcomers Dataset to Assess Conversational Agent’s Efficacy in Mentoring Newcomers
BotSE
Misan Etchie NAU RESHAPE LAB, Hunter Beach NAU RESHAPE LAB, Katia Romero Felizardo NAU RESHAPE LAB, Igor Steinmacher NAU RESHAPE LAB
16:22
22m
Talk
Bot-Driven Development: From Simple Automation to Autonomous Software Development Bots
BotSE
Christoph Treude Singapore Management University, Chris Poskitt Singapore Management University
Pre-print
16:45
22m
Talk
Bridging HCI and AI Research for the Evaluation of Conversational SE Assistants
BotSE
Jonan Richards Radboud University, Mairieli Wessel Radboud University
17:07
22m
Talk
Reducing Alert Fatigue via AI-Assisted Negotiation: A Case for Dependabot
BotSE
Raula Gaikovina Kula The University of Osaka