Property-based Test for Part-of-Speech Tagging Tool
Part-of-Speech (POS) tagging of sentences is a basic and widely used Natural Language Processing (NLP) technique. Its predicted POS tags serve as the basis for many advanced NLP tasks, such as sentiment analysis, word sense disambiguation, and information retrieval. However, POS tagging tools can make wrong predictions, which propagate errors into these downstream tasks and may even pose serious threats in critical application domains. In this paper, we propose testing POS tagging tools with Metamorphic Testing against properties that they should satisfy. A preliminary exploration with two groups of Metamorphic Relations shows that our method can effectively reveal defects in three common POS tagging tools (spaCy, NLTK, and Flair) when they handle fairly simple intra- and inter-sentence transformations involving adverbial clauses and sentence appending. This demonstrates the potential of our method to test POS tagging tools systematically and reveal previously unnoticed issues, which may benefit their validation, repair, and improvement.
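To make the idea of a sentence-appending Metamorphic Relation concrete, here is a minimal sketch. It is not the authors' implementation, and the abstract does not spell out their exact relations. The sketch assumes one plausible relation: appending a second sentence should leave the tags of the original sentence's tokens unchanged. The names `toy_tagger`, `LEXICON`, and `check_appending_mr` are hypothetical, and the toy tagger contains a deliberate flaw so that a violation can be observed without installing any real tagging tool.

```python
import re
from typing import Callable, List, Tuple

TaggedTokens = List[Tuple[str, str]]
Tagger = Callable[[str], TaggedTokens]

# Hypothetical mini lexicon for the toy tagger (illustration only).
LEXICON = {
    "i": "PRON", "read": "VERB", "the": "DET", "a": "DET",
    "book": "NOUN", "table": "NOUN", "please": "INTJ", ".": "PUNCT",
}


def toy_tagger(text: str) -> TaggedTokens:
    """Stand-in for a real POS tagger, with a deliberate context bug."""
    tokens = re.findall(r"\w+|[^\w\s]", text)
    # Deliberate flaw: a "please" anywhere in the input flips every
    # "book" to VERB, even in a neighbouring sentence.
    has_please = any(tok.lower() == "please" for tok in tokens)
    tagged = []
    for tok in tokens:
        tag = LEXICON.get(tok.lower(), "X")
        if tok.lower() == "book" and has_please:
            tag = "VERB"
        tagged.append((tok, tag))
    return tagged


def check_appending_mr(tagger: Tagger, source: str, appended: str):
    """Assumed relation: tags of `source` must not change after appending.

    Returns a list of (token, tag_alone, tag_after_appending) violations.
    """
    source_tags = tagger(source)
    followup_tags = tagger(source + " " + appended)
    # Compare only the tokens that belong to the original sentence.
    prefix = followup_tags[: len(source_tags)]
    return [
        (tok, tag, new_tag)
        for (tok, tag), (_, new_tag) in zip(source_tags, prefix)
        if tag != new_tag
    ]


violations = check_appending_mr(
    toy_tagger, "I read the book.", "Please book a table."
)
print(violations)
```

No hand-labelled gold tags are needed: the check compares the tagger's output on the source input with its output on the follow-up input. To test a real tool, one would wrap it in a function with the same `Tagger` signature and pass that in place of `toy_tagger`.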
Tue 16 Nov (displayed time zone: Hobart)
18:00 - 19:00 | Testing I (Research Papers / NIER track / Industry Showcase) at Kangaroo | Chair(s): Xiaoyin Wang, University of Texas at San Antonio
18:00 | 20m | Talk | Testing Your Question Answering Software via Asking Recursively | Research Papers | Songqiang Chen (School of Computer Science, Wuhan University), Shuo Jin (School of Computer Science, Wuhan University), Xiaoyuan Xie (School of Computer Science, Wuhan University, China)
18:20 | 20m | Talk | Improving Test Case Generation for REST APIs Through Hierarchical Clustering | Research Papers | Dimitri Stallenberg (Delft University of Technology), Mitchell Olsthoorn (Delft University of Technology), Annibale Panichella (Delft University of Technology)
18:40 | 10m | Talk | Access Control Tree for Testing and Learning | Industry Showcase | Davrondzhon Gafurov (Norsk Helsenett SF), Margrete Sunde Grovan (Norsk Helsenett SF)
18:50 | 10m | Talk | Property-based Test for Part-of-Speech Tagging Tool | NIER track | Shuo Jin (School of Computer Science, Wuhan University), Songqiang Chen (School of Computer Science, Wuhan University), Xiaoyuan Xie (School of Computer Science, Wuhan University, China)