Machine translation is essential for international communication and is extensively used across numerous human-related applications. Despite substantial progress, fairness issues persist in current machine translation systems. This paper addresses the intersection of machine translation testing and fairness testing, two critical and widely explored domains in software engineering. We introduce FairMT, an automated fairness testing approach specifically designed for evaluating machine translation systems. FairMT operates on the assumption that translations of semantically similar sentences, containing protected attributes from distinct demographic groups, should maintain comparable meanings. It comprises three key steps: (1) test input generation, producing inputs covering various demographic groups based on metamorphic relations; (2) test oracle generation, identifying potential unfair translations based on semantic similarity measurements; and (3) regression, discerning genuine fairness issues from those caused by low-quality translation. Leveraging FairMT, we conduct an empirical study on three leading machine translation systems—Google Translate, T5, and Transformer. Our investigation uncovers up to 832, 1,984, and 2,627 unfair translations across the three systems, respectively. Intriguingly, we observe that fair translations tend to exhibit better translation performance, challenging the conventional wisdom of a fairness-performance trade-off prevalent in the fairness literature.
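The first two steps described in the abstract can be illustrated with a minimal sketch. It is not FairMT's implementation: the term list, the `mutate` and `find_suspicious_pairs` helpers, the word-overlap (Jaccard) similarity, and the 0.5 threshold are all assumptions made for illustration, and the third step (regression against translation quality) is not covered.

```python
import re
from itertools import combinations


def mutate(sentence, terms):
    """Metamorphic input generation (assumed form): return one variant of
    `sentence` per term, swapping in each protected-attribute term."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, terms)) + r")\b")
    if not pattern.search(sentence):
        return []  # sentence mentions no protected term, nothing to test
    return [pattern.sub(term, sentence) for term in terms]


def similarity(a, b):
    """Jaccard overlap of lowercase word sets. This is a crude stand-in for
    a real semantic similarity metric, chosen only to keep the sketch
    dependency-free."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def find_suspicious_pairs(sentence, terms, translate, threshold=0.5):
    """Test oracle (assumed form): translate every variant and flag pairs
    whose translations are less similar than `threshold`. The threshold has
    to tolerate the one term that legitimately differs between variants."""
    variants = mutate(sentence, terms)
    translations = [translate(v) for v in variants]
    flagged = []
    for i, j in combinations(range(len(variants)), 2):
        score = similarity(translations[i], translations[j])
        if score < threshold:
            flagged.append((variants[i], variants[j], score))
    return flagged


if __name__ == "__main__":
    # Hypothetical translator that degrades output for one group.
    def biased_translate(text):
        return "women work" if "women" in text.split() else text

    print(find_suspicious_pairs(
        "men are good engineers", ["men", "women"], biased_translate))
```

Here `translate` is any callable that maps a source sentence to its translation, so a real system under test could be plugged in behind the same interface. Flagged pairs are only candidates; per the abstract, a further step is needed to separate genuine fairness issues from generally low-quality translations.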
Tue 24 Jun
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
16:00 - 17:40: Fairness and Green (Journal First / Research Papers / Demonstrations) at Aurora A. Chair(s): Aldeida Aleti (Monash University)

| Time | Duration | Title | Track | Authors |
| --- | --- | --- | --- | --- |
| 16:00 | 10m Talk | MANILA: A Low-Code Application to Benchmark Machine Learning Models and Fairness-Enhancing Methods | Demonstrations | Giordano d'Aloisio (University of L'Aquila) |
| 16:10 | 20m Talk | Fairness Testing of Machine Translation Systems | Journal First | Zeyu Sun (Institute of Software, Chinese Academy of Sciences); Zhenpeng Chen (Nanyang Technological University); Jie M. Zhang (King's College London); Dan Hao (Peking University) |
| 16:30 | 20m Talk | Bias behind the Wheel: Fairness Testing of Autonomous Driving Systems | Journal First | Xinyue Li (Peking University); Zhenpeng Chen (Nanyang Technological University); Jie M. Zhang (King's College London); Federica Sarro (University College London); Ying Zhang (Peking University); Xuanzhe Liu (Peking University) |
| 16:50 | 10m Talk | FAMLEM, the FAst ModuLar Energy Meter at Code Level | Demonstrations | Max Weber (Leipzig University); Johannes Dorn (Leipzig University); Sven Apel (Saarland University); Norbert Siegmund (Leipzig University) |
| 17:00 | 20m Talk | NLP Libraries, Energy Consumption and Runtime - An Empirical Study | Research Papers | Rajrupa Chattaraj (Indian Institute of Technology Tirupati, India); Sridhar Chimalakonda (Indian Institute of Technology Tirupati) |
| 17:20 | 20m Talk | An adaptive language-agnostic pruning method for greener language models for code | Research Papers | Mootez Saad (Dalhousie University); José Antonio Hernández López (Linköping University); Boqi Chen (McGill University); Daniel Varro (Linköping University / McGill University); Tushar Sharma (Dalhousie University) |
Aurora A is the first room in the Aurora wing.
When facing the main Cosmos Hall, access to the Aurora wing is on the right, close to the side entrance of the hotel.



