"Is It Responsible?" Emerging Results on Comparing Guardrails for Harm Mitigation in LLM-enhanced Software Applications
The rapid adoption of Large Language Models (LLMs) in the engineering of software applications, such as customer service chatbots, has brought significant benefits but also poses substantial risks. The generation of biased, inappropriate, or harmful responses is among the problems that can arise when an LLM is used as a commercial off-the-shelf (COTS) component, with its chat output connected directly to the user interface of a software application. This paper presents emerging results of an exploratory study that compares commercial guardrail frameworks with respect to their ability to detect and block inappropriate content during a chat conversation. We empirically evaluate three guardrail frameworks (LLM Guard, Llama Guard, and OpenAI Moderation) against two datasets of toxic and offensive content. The results show that improvements are still needed: the assessed guardrail frameworks achieved high accuracy on one of the datasets (more than 90%) but underperformed on other metrics, indicating that toxic or dangerous content could still reach users if these guardrails were deployed in a chatbot, for instance. We hope these results assist researchers and practitioners in selecting appropriate guardrails to improve harm mitigation in LLM-based applications.
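To make the evaluated setup concrete, the following minimal Python sketch illustrates how a single chat message might be screened with one of the compared services, the OpenAI Moderation endpoint, before it is shown to a user. This is an illustrative assumption, not the paper's experimental pipeline; the model name and the simple flagged/not-flagged decision are choices made here for the example.

    # Illustrative sketch only: screening one message with the OpenAI
    # Moderation endpoint before delivering it in a chatbot reply.
    # Model name is an assumption; consult the current API documentation.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def is_flagged(message: str) -> bool:
        """Return True if the moderation endpoint flags the message as harmful."""
        response = client.moderations.create(
            model="omni-moderation-latest",
            input=message,
        )
        return response.results[0].flagged

    if __name__ == "__main__":
        # A guardrail-protected chatbot would suppress or rewrite flagged replies.
        print(is_flagged("You are completely useless and stupid."))

LLM Guard and Llama Guard expose analogous checks (scanner pipelines and a safety-classifier model, respectively), so the same pre-delivery filtering pattern applies to all three frameworks studied.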