Developing a Llama-Based Chatbot for CI/CD Question Answering: A Case Study at Ericsson (ICSME 2024 - Industry Track)

Who

Daksh Chaudhary, Sri Lakshmi Vadlamani, Dimple Thomas, Shiva Nejati, Mehrdad Sabetzadeh

Track

ICSME 2024 Industry Track

Time Zone

The program is currently displayed in (GMT-07:00) Arizona.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 10 Oct 2024 13:30 - 13:45 at Abineau - Session 9: Reflections and New Ideas Chair(s): Andrea Capiluppi

Abstract

This paper presents our experience developing a Llama-based chatbot for question answering about continuous integration and continuous delivery (CI/CD) at Ericsson, a multinational telecommunications company. Our chatbot is designed to handle the specificities of CI/CD documents at Ericsson, employing a retrieval-augmented generation (RAG) model to enhance accuracy and relevance. Our empirical evaluation of the chatbot on industrial CI/CD-related questions indicates that an ensemble retriever, combining BM25 and embedding retrievers, yields the best performance. When evaluated against a ground truth of 72 CI/CD questions and answers at Ericsson, our most accurate chatbot configuration provides fully correct answers for 61.11% of the questions, partially correct answers for 26.39%, and incorrect answers for 12.50%. Through an error analysis of the partially correct and incorrect answers, we discuss the underlying causes of inaccuracies and provide insights for further refinement. We also reflect on lessons learned and suggest future directions for further improving our chatbot’s accuracy.
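The abstract does not say how the BM25 and embedding rankings are combined in the ensemble retriever. As a minimal sketch, assuming reciprocal rank fusion (a common way to merge ranked lists, not confirmed here as the authors' method), the idea can be illustrated as follows. The function name, the document IDs, and the constant k=60 are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, weights=None, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document receives weight / (k + rank) from every list it appears in;
    documents ranked highly by several retrievers accumulate the largest scores.
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Sort by fused score (descending); break ties by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of a keyword (BM25) retriever and an embedding retriever.
bm25_ranking = ["doc_pipeline", "doc_jenkins", "doc_release"]
embedding_ranking = ["doc_jenkins", "doc_rollback", "doc_pipeline"]

fused = reciprocal_rank_fusion([bm25_ranking, embedding_ranking])
print(fused)
```

Documents that both retrievers rank near the top rise to the front of the fused list, while documents found by only one retriever are still kept as lower-ranked candidates.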

Link to Preprint

https://arxiv.org/pdf/2408.09277

Daksh Chaudhary

University of Ottawa

Canada

Sri Lakshmi Vadlamani

Ericsson

Canada

Dimple Thomas

Ericsson

Canada

Shiva Nejati

University of Ottawa

Canada

Mehrdad Sabetzadeh

University of Ottawa

Canada

Session Program

Thu 10 Oct
Displayed time zone: Arizona

13:30 - 15:00	Session 9: Reflections and New Ideas (Industry Track / Registered Reports Track / New Ideas and Emerging Results Track / Journal First Track) at Abineau. Chair(s): Andrea Capiluppi (University of Groningen)

13:30 15m	Developing a Llama-Based Chatbot for CI/CD Question Answering: A Case Study at Ericsson. Industry Track Paper. Daksh Chaudhary (University of Ottawa), Sri Lakshmi Vadlamani (Ericsson), Dimple Thomas (Ericsson), Shiva Nejati (University of Ottawa), Mehrdad Sabetzadeh (University of Ottawa)
13:45 10m	RevToken: A Token-Level Review Recommendation: How Far Are We? NIER Paper, New Ideas and Emerging Results Track. Yasuhito Morikawa (Nara Institute of Science and Technology), Yutaro Kashiwa (Nara Institute of Science and Technology), Kenji Fujiwara (Nara Women’s University), Hajimu Iida (Nara Institute of Science and Technology)
13:55 10m	Leveraging LSTM and Pre-trained Models for Effective Summarization of Stack Overflow Posts. NIER Paper, New Ideas and Emerging Results Track. Anh M. T. Bui (Hanoi University of Science and Technology), Nguyen Duc-Loc (Hanoi University of Science and Technology)
14:05 15m	Integrating Lean Processes and Engineering Discipline into Work Culture Over 20 Years: An Experience Report. Industry Track Paper. Doug Durham (Don't Panic Labs), Bonita Sharif (University of Nebraska-Lincoln, USA)
14:20 15m	A reflection on the impact of model mining from GitHub. J1C2 Paper, Journal First Track. Gregorio Robles (Universidad Rey Juan Carlos), Michel Chaudron (Eindhoven University of Technology, The Netherlands), Rodi Jolak (RISE Research Institutes of Sweden and Mid Sweden University), Regina Hebig (Universität Rostock, Rostock, Germany)
14:35 10m	Analyzing the Ripple Effects of Refactoring. A Registered Report. Registered Reports Paper, Registered Reports Track. Mikel Robredo (University of Oulu), Matteo Esposito (University of Oulu), Fabio Palomba (University of Salerno), Rafael Peñaloza (University of Milano-Bicocca), Valentina Lenarduzzi (University of Oulu)
14:45 10m	Learning Strategies using Boolean Program Metrics to Verify Industrial Code. Industry Track Paper. Bharti Chimdyalwar (Tata Consultancy Services), Priyanka Darke (Tata Consultancy Services), Manoj Alladawar (TCS Research), Sahil Sulakhe (TCS Research), R Venkatesh, Supratik Chakraborty (IIT Bombay)