Evaluating Terminology Translation in Machine Translation Systems via Metamorphic Testing
Machine translation has become an integral part of daily life, and terminology translation plays a crucial role in ensuring the accuracy of translation results. However, existing translation systems such as Google Translate have been shown to occasionally mistranslate terminology. Current metrics for assessing terminology translation rely on reference translations and bilingual dictionaries, which limits their usefulness for large-scale automated testing of MT systems.
To address this challenge, we propose a novel method, Metamorphic Testing for Terminology Translation (TermMT), which tests terminology translation in MT systems effectively and efficiently without relying on reference translations or bilingual terminology dictionaries. Our approach constructs metamorphic relations based on the characteristics of terms: (a) adding an appropriate reference for the term in the given context should \textit{not change} the translation of the term; (b) modifying part of a multi-word term should \textit{change} the translation of the revised word combination. To evaluate the effectiveness of TermMT, we tested the terminology translation capabilities of three machine translation systems (Google Translate, Bing Microsoft Translator, and mBART) on the English portion of the bilingual UM-Corpus dataset. The results show that TermMT detected a total of 3,765 translation errors in Google Translate, 2,351 in Bing Microsoft Translator, and 6,011 in mBART, with precisions of 82.33%, 83.00%, and 86.33%, respectively.
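The sketch below illustrates how the two metamorphic relations could be checked in practice; it is a minimal Python sketch under stated assumptions, not the published TermMT implementation. The names \textit{translate} (a call to the MT system under test) and \textit{align} (a helper that locates the target-language span corresponding to a source term) are hypothetical placeholders supplied by the caller.
\begin{verbatim}
from typing import Callable

Translator = Callable[[str], str]    # source sentence -> target sentence
Aligner = Callable[[str, str], str]  # (target sentence, source term) -> term translation


def check_mr1(translate: Translator, align: Aligner,
              sentence: str, term: str, reference_phrase: str) -> bool:
    """MR1: appending an appropriate reference for the term to the given
    context should NOT change the translation of the term itself."""
    original = align(translate(sentence), term)
    follow_up = align(translate(f"{sentence} {reference_phrase}"), term)
    return original == follow_up   # False => suspected terminology error


def check_mr2(translate: Translator, align: Aligner,
              sentence: str, term: str, modified_term: str) -> bool:
    """MR2: modifying part of a multi-word term SHOULD change the
    translation of the revised word combination."""
    original = align(translate(sentence), term)
    follow_up = align(translate(sentence.replace(term, modified_term)),
                      modified_term)
    return original != follow_up   # False => suspected terminology error
\end{verbatim}
In this formulation, a violated relation (either check returning False) flags the corresponding source sentence as a suspected terminology translation error, without consulting any reference translation or bilingual dictionary.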