Natural Test Generation for Precise Testing of Question Answering Software (ASE 2022 - Research Papers)

Who

Qingchao Shen, Junjie Chen, Jie M. Zhang, Haoyu Wang, Shuang Liu, Menghan Tian

Track

ASE 2022 Research Papers

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 12 Oct 2022 17:00 - 17:20 at Banquet A - Technical Session 18 - Testing II Chair(s): Darko Marinov

Abstract

Question answering (QA) software uses information retrieval and natural language processing techniques to automatically answer questions posed by humans in a natural language. Like other AI-based software, QA software may contain bugs. To automatically test QA software without human labeling, previous work extracts facts from question-answer pairs and generates new questions to detect QA software bugs. Nevertheless, the generated questions can be ambiguous, confusing, or syntactically chaotic, which makes them unanswerable for QA software. As a result, a large proportion of the reported bugs are false positives. In this work, we propose QATest, a sentence-level mutation-based metamorphic testing tool for QA software. To eliminate false positives and achieve precise automatic testing, QATest leverages five Metamorphic Relations (MRs) as well as semantics-guided searching and enhanced test oracles. Our evaluation on three QA datasets demonstrates that QATest outperforms the state-of-the-art in both the quantity (8,133 vs. 6,601 bugs) and the quality (97.67% vs. 49% true positive rate) of the reported bugs. Moreover, the test inputs generated by QATest reduce the MR violation rate from 44.29% to 20.51% when used to fine-tune the QA software under test.
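
The idea of a sentence-level metamorphic test can be illustrated with a small sketch. It is not QATest's implementation, and the relation used here (appending an unrelated sentence to the context should leave the answer unchanged) is an assumed example that may differ from the paper's five MRs. The two toy QA functions, the test cases, and all function names are hypothetical and exist only to make the sketch runnable.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: a QA system maps (question, context) to an answer.
QAModel = Callable[[str, str], str]
# One test case: (question, context, unrelated sentence to append).
Case = Tuple[str, str, str]


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(answer.lower().split())


def append_sentence(context: str, extra: str) -> str:
    """Sentence-level mutation: add one sentence at the end of the context."""
    return context.rstrip() + " " + extra


def violates_mr(model: QAModel, question: str, context: str, extra: str) -> bool:
    """Assumed MR: the answer must not change after the mutation."""
    original = model(question, context)
    mutated = model(question, append_sentence(context, extra))
    return normalize(original) != normalize(mutated)


def violation_rate(model: QAModel, cases: List[Case]) -> float:
    """Fraction of test cases on which the model violates the MR."""
    if not cases:
        return 0.0
    violations = sum(violates_mr(model, q, c, e) for q, c, e in cases)
    return violations / len(cases)


def last_word_qa(question: str, context: str) -> str:
    """Deliberately brittle toy model: answers with the last word of the context."""
    return context.rstrip(".").split()[-1]


def first_sentence_qa(question: str, context: str) -> str:
    """Toy model that only reads the first sentence of the context."""
    return context.split(".")[0].split()[-1]


CASES: List[Case] = [
    ("Where is the library?", "The library is in Tianjin.", "The weather was mild."),
    ("What is the capital?", "The capital is Paris.", "Many tourists visit every year."),
]

if __name__ == "__main__":
    print(violation_rate(last_word_qa, CASES))
    print(violation_rate(first_sentence_qa, CASES))
```

No human-labeled answer is needed here: the test oracle is the consistency between the answers before and after the mutation. The brittle toy model changes its answer on every case, while the first-sentence model is unaffected by the appended sentence.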

Link to Preprint

https://drive.google.com/file/d/1Cw_h3maEIQGnywAGwj6NAoPuI5DaOweH/view?usp=sharing

Qingchao Shen

Tianjin University

China

Junjie Chen

Tianjin University

China

Jie M. Zhang

King's College London

United Kingdom

Haoyu Wang

College of Intelligence and Computing, Tianjin University

Shuang Liu

Tianjin University

China

Menghan Tian

College of Intelligence and Computing, Tianjin University

Session Program

Wed 12 Oct
Displayed time zone: Eastern Time (US & Canada)

16:00 - 18:00	Technical Session 18 - Testing II (Research Papers / Tool Demonstrations / Journal-first Papers) at Banquet A. Chair(s): Darko Marinov (University of Illinois at Urbana-Champaign)

16:00 10m Demonstration		Shibboleth: Hybrid Patch Correctness Assessment in Automated Program Repair. Tool Demonstrations. Ali Ghanbari (Iowa State University), Andrian Marcus (University of Texas at Dallas)
16:10 20m Research paper		Auto Off-Target: Enabling Thorough and Scalable Testing for Complex Software Systems. Research Papers. Tomasz Kuchta (Samsung Electronics), Bartosz Zator (Samsung Electronics)
16:30 10m Demonstration		Maktub: Lightweight Robot System Test Creation and Automation. Tool Demonstrations. Amr Moussa (North Carolina State University), John-Paul Ore (North Carolina State University)
16:40 20m Paper		Cerebro: Static Subsuming Mutant Selection. Journal-first Papers. Aayush Garg (University of Luxembourg), Milos Ojdanic (University of Luxembourg), Renzo Degiovanni (SnT, University of Luxembourg), Thierry Titcheu Chekam (SES S.A. & University of Luxembourg (SnT)), Mike Papadakis (University of Luxembourg, Luxembourg), Yves Le Traon (University of Luxembourg, Luxembourg)
17:00 20m Research paper		Natural Test Generation for Precise Testing of Question Answering Software. Research Papers, Virtual. Qingchao Shen (Tianjin University), Junjie Chen (Tianjin University), Jie M. Zhang (King's College London), Haoyu Wang (College of Intelligence and Computing, Tianjin University), Shuang Liu (Tianjin University), Menghan Tian (College of Intelligence and Computing, Tianjin University)
17:20 20m Paper		GloBug: Using global data in Fault Localization. Journal-first Papers, Virtual. Nima Miryeganeh (University of Calgary), Sepehr Hashtroudi (University of Calgary), Hadi Hemmati (University of Calgary)
17:40 20m Research paper		Selectively Combining Multiple Coverage Goals in Search-Based Unit Test Generation. Research Papers, Virtual. Zhichao Zhou (School of Information Science and Technology, ShanghaiTech University), Yuming Zhou (Nanjing University), Chunrong Fang (Nanjing University), Zhenyu Chen (Nanjing University), Yutian Tang (ShanghaiTech University)