Thu 1 May 2025 15:00 - 15:15 at 215 - SE for AI 3 Chair(s): Lina Marsso

Large Language Models (LLMs) have shown stunning abilities to carry out tasks that were previously conducted by humans. The future role of humans and the responsibilities assigned to non-human LLMs affect society fundamentally. In that context, LLMs have often been compared to humans. However, it is surprisingly difficult to make a fair empirical comparison between humans and LLMs. To address those difficulties, we aim to establish a systematic approach that guides researchers in comparing LLMs with humans across various linguistic and cognitive tasks. We developed a reference model of the information flow in an exploratory research study. Through a literature review, we examined key differences and similarities among several existing studies. We propose a framework to support researchers in designing and executing studies, and in assessing LLMs with respect to humans. Future studies can use the reference model as guidance for designing and reporting their own study design by mapping key decisions to the decision points of the reference model. We want to support researchers and society in taking a maturation step in this emerging and constantly growing field.

Thu 1 May

Displayed time zone: Eastern Time (US & Canada)

14:00 - 15:30
SE for AI 3
Research Track / SE in Society (SEIS) / Journal-first Papers at 215
Chair(s): Lina Marsso École Polytechnique de Montréal
14:00
15m
Talk
Dissecting Global Search: A Simple yet Effective Method to Boost Individual Discrimination Testing and Repair
SE for AI
Research Track
Lili Quan Tianjin University, Li Tianlin NTU, Xiaofei Xie Singapore Management University, Zhenpeng Chen Nanyang Technological University, Sen Chen Nankai University, Lingxiao Jiang Singapore Management University, Xiaohong Li Tianjin University
14:15
15m
Talk
FixDrive: Automatically Repairing Autonomous Vehicle Driving Behaviour for $0.08 per Violation
SE for AI
Research Track
Yang Sun Singapore Management University, Chris Poskitt Singapore Management University, Kun Wang Zhejiang University, Jun Sun Singapore Management University
14:30
15m
Talk
MARQ: Engineering Mission-Critical AI-based Software with Automated Result Quality Adaptation
SE for AI / Artifact-Functional / Artifact-Available / Artifact-Reusable
Research Track
Uwe Gropengießer Technical University of Darmstadt, Elias Dietz Technical University of Darmstadt, Florian Brandherm Technical University of Darmstadt, Achref Doula Technical University of Darmstadt, Osama Abboud Munich Research Center, Huawei, Xun Xiao Munich Research Center, Huawei, Max Mühlhäuser Technical University of Darmstadt
14:45
15m
Talk
An Empirical Study of Challenges in Machine Learning Asset Management
SE for AI
Journal-first Papers
Zhimin Zhao Queen's University, Yihao Chen Queen's University, Abdul Ali Bangash Software Analysis and Intelligence Lab (SAIL), Queen's University, Canada, Bram Adams Queen's University, Ahmed E. Hassan Queen’s University
15:00
15m
Talk
A Reference Model for Empirically Comparing LLMs with Humans
SE for AI
SE in Society (SEIS)
Kurt Schneider Leibniz Universität Hannover, Software Engineering Group, Farnaz Fotrousi Chalmers University of Technology and University of Gothenburg, Rebekka Wohlrab Chalmers University of Technology
15:15
7m
Talk
Building Domain-Specific Machine Learning Workflows: A Conceptual Framework for the State-of-the-Practice
SE for AI
Journal-first Papers
Bentley Oakes Polytechnique Montréal, Michalis Famelis Université de Montréal, Houari Sahraoui DIRO, Université de Montréal