Can Large Language Models (LLMs) compete with Human Requirement Reviewers? - Replication of an Inspection Experiment on Requirements Documents
The applications of large language models (LLMs) in software engineering are growing, especially for code, typically for generating code or for detecting and fixing quality problems. As software requirements are commonly written in natural language, it seems promising to leverage the capabilities of LLMs for detecting requirement issues. We replicated an inspection experiment in which computer science students searched for defects in requirements documents using different reading techniques. For our replication, we used GPT-4-Turbo instead of human reviewers. Additionally, we considered GPT-3.5-Turbo, Nous-Hermes-2-Mixtral-8x7B-DPO, and Phi-3-medium-128k-instruct. We focus on single-prompt approaches and refrain from more complex approaches (e.g., stepwise or agent-based). We proceeded in two phases. First, we explored the general feasibility of using LLMs for requirements inspection on a practice document and examined different prompts. Second, we applied selected approaches to two requirements documents and compared the approaches to each other and to human reviewers. The approaches vary in reading technique (ad-hoc, perspective-based, checklist-based), LLM, and the instructions and material provided. We found that the LLMs (a) report only a limited number of deficits despite having sufficient output tokens available, which (b) vary little across the different prompts. They (c) seldom match the sample solution, and (d) provide useful insights only to a limited degree.
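To illustrate what a single-prompt, checklist-based inspection could look like, the following is a minimal sketch using the OpenAI Python client. The prompt wording, checklist items, model name, and parameters are illustrative assumptions and do not reproduce the exact material or settings used in the study.

```python
# Minimal sketch of a single-prompt, checklist-based requirements inspection.
# The checklist, prompt wording, and parameters are illustrative assumptions,
# not the exact material used in the study.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CHECKLIST = """\
- Are all requirements unambiguous?
- Are all requirements verifiable/testable?
- Are there contradictions between requirements?
- Is any required functionality missing or incomplete?
"""


def inspect_requirements(document_text: str, model: str = "gpt-4-turbo") -> str:
    """Ask the model, in a single prompt, to list the defects it finds."""
    prompt = (
        "You are reviewing a software requirements document. "
        "Using the checklist below, list every defect you find. "
        "For each defect, name the affected requirement and briefly explain the problem.\n\n"
        f"Checklist:\n{CHECKLIST}\n"
        f"Requirements document:\n{document_text}"
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # reduce variation between runs for comparability
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    with open("requirements.txt", encoding="utf-8") as f:
        print(inspect_requirements(f.read()))
```

The other reading-technique variants could be obtained by swapping the checklist for an ad-hoc instruction ("list all defects you find") or a perspective-based one (e.g., "review this document from the perspective of a tester").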