
Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots (like ChatGPT) can converse on any topic, task-oriented chatbots, the focus of this paper, are designed for specific tasks, like booking a flight, obtaining customer support, or scheduling an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots.
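To make the notion of a test scenario concrete, the following is a minimal, hypothetical sketch (not the paper's format) of a scenario as an ordered sequence of user turns paired with the bot replies the tester expects; all names are illustrative assumptions.

```java
import java.util.List;

/** Hypothetical illustration: a test scenario as an ordered list of
 *  user utterances and the bot replies expected for each of them. */
public class ScenarioExample {

    // One user utterance paired with the expected bot answer.
    record Step(String userUtterance, String expectedBotReply) {}

    public static void main(String[] args) {
        // A flight-booking scenario expressed as a sequence of interactions.
        List<Step> scenario = List.of(
            new Step("I want to book a flight", "Where would you like to fly to?"),
            new Step("To Salerno", "On which date?"),
            new Step("Next Friday", "Your flight to Salerno is booked.")
        );

        // A real harness would send each utterance to the chatbot under test
        // and compare its reply against the expectation; here we only print.
        scenario.forEach(step ->
            System.out.println("USER: " + step.userUtterance()
                + " | EXPECTED BOT: " + step.expectedBotReply()));
    }
}
```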

To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness, and efficiency of our approach on open-source chatbots, with promising results.
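As a rough illustration of the general idea (not the paper's actual operators or tooling), the sketch below shows one plausible mutation operator over a simplified chatbot design, deleting a training phrase from an intent, together with the standard mutation-score formula used to quantify test strength. The `Intent` record and operator name are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of one mutation operator and of the mutation score. */
public class MutationSketch {

    // A minimal stand-in for an intent in a chatbot design:
    // a name plus the training phrases that trigger it.
    record Intent(String name, List<String> trainingPhrases) {}

    /** Example operator: drop one training phrase from an intent, emulating
     *  a designer who overlooked a way users may express that intent. */
    static Intent deleteTrainingPhrase(Intent intent, int index) {
        List<String> mutated = new ArrayList<>(intent.trainingPhrases());
        mutated.remove(index);
        return new Intent(intent.name(), mutated);
    }

    /** Standard mutation score: killed mutants over non-equivalent mutants. */
    static double mutationScore(int killed, int total, int equivalent) {
        return (double) killed / (total - equivalent);
    }

    public static void main(String[] args) {
        Intent book = new Intent("BookFlight",
            List.of("book a flight", "I need a plane ticket", "fly me to Rome"));

        Intent mutant = deleteTrainingPhrase(book, 1);
        System.out.println("Mutant phrases: " + mutant.trainingPhrases());

        // E.g., 8 of 10 mutants killed and 1 judged equivalent -> score ~0.89.
        System.out.printf("Mutation score: %.2f%n", mutationScore(8, 10, 1));
    }
}
```

A test suite whose scenarios fail on such a mutant "kills" it; the proportion of killed, non-equivalent mutants gives a measure of how strong the scenarios are.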