Automated Testing for Machine Translation via Constituency Invariance
With the rapid development of deep neural networks, machine translation has made significant progress and has become part of people's daily lives, assisting in a variety of tasks. However, machine translators are essentially software and therefore also suffer from software defects. Translation errors can cause misunderstandings and may even lead to threats to personal safety, marketing blunders, and political crises. For this reason, almost all translation service providers offer feedback channels to collect incorrect translations, both to alleviate the scarcity of data resources and to improve product performance. Inspired by syntactic structure analysis, we introduce constituency invariance, which reflects the structural similarity between a simple sentence and the sentences derived from it, to assist in testing machine translation. We implement constituency invariance in an automated tool, CIT, which detects translation errors by checking the constituency invariance relation between translation results. CIT uses constituency parse trees to represent the syntactic structures of sentences and employs an efficient data augmentation method to derive multiple new sentences from a single sentence. To validate CIT, we experiment with three widely used machine translators: Bing Microsoft Translator, Google Translate, and Youdao Translator. With 600 seed sentences as input, CIT detects 2212, 1910, and 1590 translation errors with around 77% precision. We have submitted the detected errors to the development teams. As of the submission of this paper, Google, Bing, and Youdao have fixed 15.4%, 32.0%, and 14.3% of the reported errors, respectively.
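To make the idea of a constituency invariance check more concrete, the following is a minimal Python sketch. It is not CIT's actual algorithm, which the abstract does not specify. It assumes that parse trees are available as nested tuples of the form (label, child, ...), that the trees shown are hand-written stand-ins for parser output on translations, and that "invariance" means the seed tree's labelled structure can be embedded, in order, in the derived tree. The function names `constituents` and `embeds` are hypothetical.

```python
# Hypothetical sketch of a structural invariance check between two
# constituency parse trees. A tree is a nested tuple (label, child, ...);
# a child that is a plain string is a leaf word and is ignored.

def constituents(node):
    """Return the constituent (non-word) children of a tree node."""
    return [child for child in node[1:] if isinstance(child, tuple)]

def embeds(seed, derived):
    """True if the seed's labelled structure appears in the derived tree.

    The derived tree may contain extra constituents (for example, an
    inserted adjective), but every constituent of the seed must be matched,
    in order, by a constituent of the derived tree with the same label.
    """
    if seed[0] != derived[0]:
        return False
    remaining = iter(constituents(derived))
    # Greedy in-order matching: each seed child consumes derived children
    # until one of them embeds it.
    return all(any(embeds(s, d) for d in remaining) for s in constituents(seed))

# Assumed example trees (hand-written, not real parser output).
seed = ("S",
        ("NP", ("PRP", "I")),
        ("VP", ("VBP", "like"), ("NP", ("NNS", "apples"))))
derived = ("S",
           ("NP", ("PRP", "I")),
           ("VP", ("VBP", "like"), ("NP", ("JJ", "red"), ("NNS", "apples"))))
broken = ("S",
          ("NP", ("PRP", "I")),
          ("NP", ("JJ", "red"), ("NNS", "apples")))  # the verb phrase is lost

print(embeds(seed, derived))  # True: structure preserved
print(embeds(seed, broken))   # False: a candidate translation error
```

In a real pipeline the trees would come from a constituency parser applied to the translations of the seed sentence and of each derived sentence, and a failed check would be reported as a suspected translation error for manual inspection.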
Wed 17 Nov (displayed time zone: Hobart)
12:00 - 13:00 | Testing Applications | Industry Showcase / NIER track / Research Papers at Koala | Chair(s): Scott Barnett
12:00 | 20m | Talk | Automated Testing for Machine Translation via Constituency Invariance | Research Papers | Pin Ji (Nanjing University), Yang Feng (Nanjing University), Jia Liu (Nanjing University), Zhihong Zhao (Nanjing Tech University), Baowen Xu (Nanjing University)
12:20 | 10m | Talk | Systematic Testing of Autonomous Driving Systems Using Map Topology-Based Scenario Classification | NIER track | Yun Tang (Nanyang Technological University), Yuan Zhou (Nanyang Technological University), Tianwei Zhang (Nanyang Technological University), Fenghua Wu (Nanyang Technological University), Yang Liu (Nanyang Technological University), Gang Wang (Alibaba Group)
12:30 | 10m | Talk | Automatic HMI Structure Exploration Via Curiosity-Based Reinforcement Learning | Industry Showcase | Yushi Cao (Nanyang Technological University), YAN ZHENG (Nanyang Technological University), Shang-Wei Lin (Nanyang Technological University, Singapore), Yang Liu (Nanyang Technological University), Yon Shin Teo (Continental Automotive Singapore Pte. Ltd.), Yuxuan Toh (Continental Automotive Singapore Pte. Ltd.), Vinay Vishnumurthy Adiga (Continental Automotive Singapore Pte. Ltd.)