Finding XPath Bugs in XML Document Processors via Differential Testing
Extensible Markup Language (XML) is a widely used file format for data storage and transmission. Many XML processors support XPath, a query language that enables the extraction of elements from XML documents. These systems can be affected by logic bugs, which are bugs that cause the processor to return incorrect results. To tackle such bugs, we propose a new approach, which we realized as a system called XPress. As a test oracle, XPress relies on differential testing, which compares the results of multiple systems on the same test input and identifies bugs through discrepancies in their outputs. As test inputs, XPress generates both XML documents and XPath queries. Aiming to generate meaningful queries that compute non-empty results, XPress selects a so-called targeted node to guide the XPath expression generation process. Using the targeted node, XPress generates XPath expressions that reference existing context related to the targeted node, such as its tag name and attributes, while also guaranteeing that a predicate evaluates to true before further expanding the query. We tested our approach on six mature XML processors: BaseX, eXist-DB, Saxon, PostgreSQL, libXML2, and a commercial database system. In total, we have found 27 unique bugs in these systems, of which 25 have been verified by the developers and 20 have been fixed. XPress is efficient, as it finds 12 unique bugs in BaseX in 24 hours, which is 2x as fast as naive random generation. We expect that the effectiveness and simplicity of our approach will help to improve the robustness of many XML processors.
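The two ideas in the abstract, a differential-testing oracle and query generation guided by a targeted node, can be illustrated with a small Python sketch. This is not the XPress implementation. It assumes two standard-library backends (`xml.etree.ElementTree` and `xml.dom.minidom`) as stand-ins for the real processors, a hypothetical fixed document in which every element carries a unique `id` attribute, and a single query shape, `.//tag[@name='value']`. All function names are made up for illustration.

```python
import random
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical test document; each element has a unique "id" so results can be compared.
XML_DOC = "<root><a id='1'><b id='2'/></a><a id='3'/><c id='4'><b id='5'/></c></root>"

def generate_query(xml_text, rng):
    """Pick a targeted node and build a query from its tag and one of its
    attributes, so the predicate is true for at least that node."""
    root = ET.fromstring(xml_text)
    candidates = [e for e in root.iter() if e is not root and e.attrib]
    target = rng.choice(candidates)
    name, value = rng.choice(sorted(target.attrib.items()))
    return target.tag, name, value

def run_elementtree(xml_text, tag, name, value):
    """Backend 1: evaluate .//tag[@name='value'] with ElementTree's XPath subset."""
    root = ET.fromstring(xml_text)
    hits = root.findall(f".//{tag}[@{name}='{value}']")
    return sorted(e.get("id") for e in hits)

def run_minidom(xml_text, tag, name, value):
    """Backend 2: compute the same result by filtering DOM elements by hand."""
    doc = minidom.parseString(xml_text)
    hits = [e for e in doc.getElementsByTagName(tag) if e.getAttribute(name) == value]
    return sorted(e.getAttribute("id") for e in hits)

def differential_test(xml_text, processors, rounds=20, seed=0):
    """Run generated queries on every backend and collect any disagreements."""
    rng = random.Random(seed)
    discrepancies = []
    for _ in range(rounds):
        query = generate_query(xml_text, rng)
        results = {label: run(xml_text, *query) for label, run in processors.items()}
        if len({tuple(r) for r in results.values()}) > 1:
            discrepancies.append((query, results))
    return discrepancies

if __name__ == "__main__":
    backends = {"elementtree": run_elementtree, "minidom": run_minidom}
    print(differential_test(XML_DOC, backends))
```

Because each query is derived from a node that exists in the document, every backend should return at least that node, so an empty or differing result points to a candidate bug rather than to a vacuous query. A discrepancy only shows that the backends disagree; deciding which one is wrong still needs manual inspection.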
Fri 19 Apr (displayed time zone: Lisbon)
14:00 - 15:30 | Testing: various bug types 3 | Research Track / Demonstrations / Software Engineering Education and Training | at Fernando Pessoa | Chair(s): Fernando Brito e Abreu (ISCTE-IUL)
14:00 | 15m | Talk | Testing Graph Database Systems via Equivalent Query Rewriting | Research Track | Qiuyang Mang (The Chinese University of Hong Kong, Shenzhen), Aoyang Fang (Chinese University of Hong Kong, Shenzhen), BoXi Yu (The Chinese University of Hong Kong, Shenzhen), Hanfei Chen (The Chinese University of Hong Kong, Shenzhen), Pinjia He (Chinese University of Hong Kong, Shenzhen)
14:15 | 15m | Talk | ROSInfer: Statically Inferring Behavioral Component Models for ROS-based Robotics Systems | Research Track | Tobias Dürschmid (Carnegie Mellon University, USA), Christopher Steven Timperley (Carnegie Mellon University), David Garlan (Carnegie Mellon University), Claire Le Goues (Carnegie Mellon University)
14:30 | 15m | Talk | Finding XPath Bugs in XML Document Processors via Differential Testing | Research Track | Shuxin Li (Southern University of Science and Technology), Manuel Rigger (National University of Singapore)
14:45 | 15m | Talk | Sedar: Obtaining High-Quality Seeds for DBMS Fuzzing via Cross-DBMS SQL Transfer | Research Track | Jingzhou Fu (School of Software, Tsinghua University), Jie Liang, Zhiyong Wu (Tsinghua University, China), Yu Jiang (Tsinghua University)
15:00 | 15m | Talk | Automatically Detecting Reflow Accessibility Issues in Responsive Web Pages | Research Track | Paul T. Chiou (University of Southern California), Robert Winn (University of Southern California), Ali S. Alotaibi (University of Southern California), William G.J. Halfond (University of Southern California)
15:15 | 7m | Talk | Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist | Demonstrations | Sajad Khatiri (USI Lugano & Zurich University of Applied Sciences), Sebastiano Panichella (Zurich University of Applied Sciences), Paolo Tonella (USI Lugano)
15:22 | 7m | Talk | eFish'nSea: Unity Game Set for Learning Software Performance Issues Root Causes and Resolutions | Software Engineering Education and Training | Andrew Quinlan (Stevens Institute of Technology), Ryan Mercadante (Stevens Institute of Technology), Vincent Tufo (Stevens Institute of Technology), Jonathan Morrone (Stevens Institute of Technology), Lu Xiao (Stevens Institute of Technology)