BiasFinder: Metamorphic test generation to uncover bias for sentiment analysis systems
Artificial intelligence systems, such as Sentiment Analysis (SA) systems, typically learn from large amounts of data that may reflect human bias. Consequently, such systems may exhibit unintended demographic bias against specific characteristics (e.g., gender, occupation, country of origin). Such bias manifests in an SA system when it predicts different sentiments for similar texts that differ only in the characteristic of the individuals described. To automatically uncover bias in SA systems, this paper presents BiasFinder, an approach that can discover biased predictions in SA systems via metamorphic testing. A key feature of BiasFinder is the automatic curation of suitable templates from any given text inputs, using various Natural Language Processing (NLP) techniques to identify words that describe demographic characteristics. Next, BiasFinder generates new texts from these templates by mutating words associated with a class of a characteristic (e.g., gender-specific words such as female names, “she”, “her”). These texts are then used to tease out bias in an SA system. BiasFinder identifies a bias-uncovering test case (BTC) when an SA system predicts different sentiments for texts that differ only in words associated with a different class (e.g., male vs. female) of a target characteristic (e.g., gender). We evaluate BiasFinder on 10 SA systems and two large-scale datasets, and the results show that BiasFinder can create more BTCs than two popular baselines. We also conduct an annotation study and find that human annotators consistently judge test cases generated by BiasFinder to be more fluent than those generated by the two baselines.
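The metamorphic relation described in the abstract can be illustrated with a minimal Python sketch. This is not BiasFinder's implementation: the template is written by hand rather than curated automatically with NLP techniques, the word lists are tiny, and `GENDER_WORDS`, `generate_mutants`, `toy_sentiment`, and `find_btcs` are hypothetical names introduced only for illustration. The stand-in classifier is deliberately biased so that a BTC appears.

```python
from itertools import combinations

# Hypothetical word lists for one characteristic (gender).
# A real tool would use far larger lists; these are assumptions.
GENDER_WORDS = {
    "male": {"name": "John", "pronoun": "he"},
    "female": {"name": "Mary", "pronoun": "she"},
}


def generate_mutants(template):
    """Fill the template's placeholders once per class of the characteristic."""
    return {cls: template.format(**words) for cls, words in GENDER_WORDS.items()}


def toy_sentiment(text):
    """Deliberately biased stand-in for an SA system (illustration only)."""
    padded = f" {text.lower()} "
    score = 0
    if "loved" in padded or "great" in padded:
        score += 1
    if "boring" in padded or "awful" in padded:
        score -= 1
    if " she " in padded:  # injected bias: penalize a gendered pronoun
        score -= 1
    return "positive" if score > 0 else "negative"


def find_btcs(template, predict):
    """Return pairs of mutants that differ only in the class words
    but receive different predictions (bias-uncovering test cases)."""
    mutants = generate_mutants(template)
    labelled = [(cls, text, predict(text)) for cls, text in mutants.items()]
    return [(a, b) for a, b in combinations(labelled, 2) if a[2] != b[2]]


if __name__ == "__main__":
    for template in ("{name} said {pronoun} loved the movie.",
                     "{name} found the plot boring."):
        print(template, "->", find_btcs(template, toy_sentiment))
```

With the biased stand-in, the first template yields one BTC (the male and female variants receive different labels), while the second yields none because both variants are labelled the same.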
Fri 19 May (displayed time zone: Hobart)
15:45 - 17:15 | Metamorphic testing | SEIP - Software Engineering in Practice / Technical Track / Journal-First Papers / SEIS - Software Engineering in Society at Meeting Room 102 | Chair(s): Shiva Nejati (University of Ottawa)
15:45 | 15m | Talk | MTTM: Metamorphic Testing for Textual Content Moderation Software | Technical Track | Wenxuan Wang (The Chinese University of Hong Kong), Jen-tse Huang (The Chinese University of Hong Kong), Weibin Wu (Sun Yat-sen University), Jianping Zhang (The Chinese University of Hong Kong), Yizhan Huang (The Chinese University of Hong Kong), Shuqing Li (The Chinese University of Hong Kong), Pinjia He (Chinese University of Hong Kong at Shenzhen), Michael Lyu (The Chinese University of Hong Kong)
16:00 | 15m | Talk | Metamorphic Shader Fusion for Testing Graphics Shader Compilers | Technical Track | Dongwei Xiao (The Hong Kong University of Science and Technology), Zhibo Liu (Hong Kong University of Science and Technology), Shuai Wang (Hong Kong University of Science and Technology)
16:15 | 15m | Paper | Metamorphic Testing and Debugging of Tax Preparation Software | SEIS - Software Engineering in Society | Saeid Tizpaz-Niari (University of Texas at El Paso), Verya Monjezi (University of Texas at El Paso), Morgan Wagner (University of Texas at El Paso), Shiva Darian (University of Colorado Boulder), Krystia Reed (University of Texas at El Paso), Ashutosh Trivedi (University of Colorado Boulder) | Pre-print
16:30 | 7m | Talk | BiasFinder: Metamorphic test generation to uncover bias for sentiment analysis systems | Journal-First Papers | Muhammad Hilmi Asyrofi (School of Computing and Information Systems, Singapore Management University), Zhou Yang (Singapore Management University), Imam Nur Bani Yusuf (Singapore Management University, Singapore), Hong Jin Kang (UCLA), Ferdian Thung (Singapore Management University), David Lo (Singapore Management University)
16:37 | 7m | Talk | Automated Metamorphic Testing using Transitive Relations for Specializing Stance Detection Models | SEIP - Software Engineering in Practice | Alisa Arno (IBM Research - Tokyo), Futoshi Iwama (IBM Research - Tokyo), Mikio Takeuchi (IBM Research - Tokyo)
16:45 | 15m | Talk | MorphQ: Metamorphic Testing of the Qiskit Quantum Computing Platform | Technical Track | Pre-print