Taming Timeout Flakiness: An Empirical Study of SAP HANA
Context: Regression testing aims to prevent code changes from breaking existing features. Flaky tests negatively affect regression testing because they result in test failures that are not caused by code changes, thus providing an ambiguous signal. Test timeouts are one potential root cause for such flaky test failures.
Objective: With the goal of reducing test flakiness in a large-scale industrial database management system, we empirically study the impact of test timeouts on flakiness in SAP HANA’s system tests and evaluate different approaches to automatically adjust timeout values, assessing their suitability for reducing execution time costs and improving build turnaround times.
Method: We collect metadata on SAP HANA’s test executions by repeatedly executing tests on the same code revision over a period of six months. We evaluate the level of test flakiness and its main root cause, investigate the evolution of test timeout values, and evaluate different approaches for optimizing timeout values.
Results: The test flakiness rate ranges from 49% to 70%, depending on the number of repeated test executions. Test timeouts account for 70% of flaky test failures. Developers typically react to flaky timeouts by manually increasing timeout values or splitting long-running test suites. However, adjusting timeout values manually is a tedious and ineffective task for developers. Our approach for timeout optimization can reduce related flaky failures by 80% while even reducing the median timeout value by 25%.
Conclusion: Test timeouts are a major cause of system test flakiness in SAP HANA, and it is challenging for developers to effectively mitigate this problem manually. Our automatic technique to optimize timeout values reduces flaky failures while minimizing test costs. Practitioners working on large-scale industrial software systems can use our findings to increase the effectiveness of their system tests while reducing the burden on developers to manually maintain appropriate timeout values.
Wed 17 Apr (displayed time zone: Lisbon)
| 14:00 - 15:30 | Testing 2. Research Track / Software Engineering Education and Training / Software Engineering in Practice / Demonstrations / Journal-first Papers, at Eugénio de Andrade. Chair(s): Jonathan Bell (Northeastern University) |
| 14:00, 15m, Talk | Ripples of a Mutation — An Empirical Study of Propagation Effects in Mutation Testing. Research Track. Hang Du (University of California at Irvine), Vijay Krishna Palepu (Microsoft), James Jones (University of California at Irvine) |
| 14:15, 15m, Talk | Fast Deterministic Black-box Context-free Grammar Inference. Research Track. Mohammad Rifat Arefin (The University of Texas at Arlington), Suraj Shetiya (University of Texas at Arlington), Zili Wang (Iowa State University), Christoph Csallner (University of Texas at Arlington) |
| 14:30, 15m, Talk | Bridging Theory to Practice in Software Testing Teaching through Team-based Learning (TBL) and Open Source Software (OSS) Contribution. Software Engineering Education and Training |
| 14:45, 15m, Talk | Productive Coverage: Improving the Actionability of Code Coverage. Software Engineering in Practice. Marko Ivanković (Google; Universität Passau), Goran Petrović (Google Inc), Yana Kulizhskaya (Google Inc), Mateusz Lewko (Google Inc), Luka Kalinovčić (no affiliation), René Just (University of Washington), Gordon Fraser (University of Passau) |
| 15:00, 15m, Talk | Taming Timeout Flakiness: An Empirical Study of SAP HANA. Software Engineering in Practice |
| 15:15, 7m, Talk | Testing Abstractions for Cyber-Physical Control Systems. Journal-first Papers. Claudio Mandrioli (University of Luxembourg), Max Nyberg Carlsson (Lund University), Martina Maggio (Saarland University, Germany / Lund University, Sweden) |
| 15:22, 7m, Talk | FaultFuzz: A Coverage Guided Fault Injection Tool for Distributed Systems. Demonstrations. Wenhan Feng (Institute of Software, Chinese Academy of Sciences), Qiugen Pei (Joint Laboratory on Cyberspace Security China Southern Power Grid), Yu Gao (Institute of Software, Chinese Academy of Sciences, China), Dong Wang (Institute of Software, Chinese Academy of Sciences), Wensheng Dou (Institute of Software Chinese Academy of Sciences), Jun Wei (Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences; University of Chinese Academy of Sciences Chongqing School), Zheheng Liang (Joint Laboratory on Cyberspace Security of China Southern Power Grid), Zhenyue Long (Joint Laboratory on Cyberspace Security China Southern Power Grid) |
