An Adversarial-Attack Approach to Assess Bias in Text-to-Image Models
This study introduces a novel approach to uncovering and assessing biases in text-to-image models by employing adversarial prompts. Our strategy involves five steps: (1) selecting words indicative of bias from existing literature; (2) crafting a set of original and noisy prompts, where "noisy" refers to prompts modified with positive or negative connotations; (3) applying these prompts across various text-to-image models; (4) annotating the generated images for demographic attributes such as gender and skin tone; and (5) analyzing and comparing the representation of demographic groups in images produced from the original and the modified prompts. Our findings are derived from a dataset of 54 adversarial prompts tested on 6 text-to-image models, resulting in 324 generated images. The analysis revealed that even slight modifications to prompts could significantly influence the portrayal of demographic groups, often amplifying biases.
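As a rough illustration of step (5), the comparison between original and noisy prompts could be sketched as a per-group difference in share among the annotated images. This is only a minimal sketch under stated assumptions: the function names `group_shares` and `share_shift`, the label values, and the toy annotations are hypothetical and invented for illustration; the abstract does not specify which statistic the authors use.

```python
from collections import Counter


def group_shares(labels):
    """Return the share of each annotated group among the given labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def share_shift(original_labels, noisy_labels):
    """Per-group change in share (noisy minus original)."""
    orig = group_shares(original_labels)
    noisy = group_shares(noisy_labels)
    groups = sorted(set(orig) | set(noisy))
    return {g: noisy.get(g, 0.0) - orig.get(g, 0.0) for g in groups}


# Toy annotations, invented purely for illustration (NOT results from the study).
original = ["woman", "man", "man", "woman"]
noisy = ["man", "man", "man", "woman"]
shift = share_shift(original, noisy)

# Study size as reported in the abstract: 54 prompts applied to 6 models.
n_images = 54 * 6
```

A positive value in `shift` means the group appears more often under the noisy prompts than under the original ones; a negative value means it appears less often.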
Sun 12 Apr. Displayed time zone: Brasilia, Distrito Federal, Brazil
16:00 - 17:30
16:00 | 45m | Keynote | From Biased to Trustworthy AI: Responsible Agentic Software Engineering Beyond Code | RAIE | Rashina Hoda (Monash University)
16:45 | 15m | Talk | An Adversarial-Attack Approach to Assess Bias in Text-to-Image Models | RAIE | Keya Gangadharan (Penn State University), Nathalia Nascimento (Pennsylvania State University), Paulo Alencar (University of Waterloo), Myron David Peixoto (Federal University of Alagoas), Audrey Vasconcelos (Federal University of Alagoas (UFAL)), Davy Baia (Federal University of Alagoas), Baldoino Fonseca (Federal University of Alagoas (UFAL))
17:00 | 15m | Talk | Reliable SOC LLM Agent with Workflow Generation | RAIE | Shohei Mitani (Georgetown University), Siddharth Yadav (Virginia Tech), Shinichiro Matsuo (Georgetown University), Eric Burger (Virginia Tech)
17:15 | 15m | Talk | Anonymous-by-Construction: An LLM-Driven Framework for Privacy-Preserving Text | RAIE