Automated Generation and Evaluation of JMH Microbenchmark Suites From Unit Tests
Performance is a crucial non-functional requirement of many software systems. Despite the widespread use of performance testing, developers still struggle to construct performance tests and to evaluate their quality. To address these two major challenges, we implement a framework, dubbed ju2jmh, that automatically generates performance microbenchmarks from JUnit tests, and we use mutation testing to study the quality of the generated microbenchmarks. Specifically, we compare our ju2jmh-generated benchmarks against manually written JMH benchmarks, against JMH benchmarks automatically generated by the AutoJMH framework, and against directly measuring system performance with JUnit tests. For this purpose, we conducted a study on three subjects (RxJava, Eclipse Collections, and Zipkin) comprising ~454K source lines of code (SLOC), 2,417 JMH benchmarks (both manually written and AutoJMH-generated), and 35,084 JUnit tests. Our results show that ju2jmh-generated benchmarks consistently outperform both the use of JUnit test execution time and throughput as a proxy for performance and the benchmarks generated by AutoJMH, while being comparable to manually written JMH benchmarks in terms of stability and the ability to detect performance bugs. Moreover, during benchmark execution, ju2jmh benchmarks cover more of the subject applications than manually written JMH benchmarks. Furthermore, ju2jmh benchmarks are generated automatically, whereas manually writing JMH benchmarks requires many hours of effort and attention; our approach can therefore reduce developers' effort to construct microbenchmarks. Finally, we identify three factors (too low a test workload, unstable tests, and limited mutant coverage) that limit a benchmark's ability to detect performance bugs.
To the best of our knowledge, this is the first study aimed at assisting developers both in fully automating microbenchmark creation and in assessing microbenchmark quality for performance testing.
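To make the idea concrete, the sketch below illustrates, in plain dependency-free Java, the warmup-then-measure loop that a generated microbenchmark relies on. This is only a conceptual illustration, not actual ju2jmh output: real generated benchmarks are JMH classes using @Benchmark/@State annotations, and the class, method names, and workload here are hypothetical.

```java
// Conceptual sketch only (assumption, not ju2jmh's actual output):
// ju2jmh wraps a JUnit test body in a JMH @Benchmark method; JMH then
// runs warmup iterations before timed measurement iterations. This
// dependency-free harness mimics that loop for illustration.
public class MicrobenchmarkSketch {

    // Hypothetical "unit test" body whose steady-state runtime we measure.
    static long unitTestBody() {
        long acc = 0;
        for (int i = 0; i < 10_000; i++) acc += i;
        return acc; // sum of 0..9999 = 49,995,000
    }

    // Warmup iterations are discarded (they let the JIT stabilize);
    // measurement iterations are timed. Returns average ns per invocation.
    static double measureAverageNanos(int warmupIters, int measuredIters) {
        for (int i = 0; i < warmupIters; i++) unitTestBody();
        long sink = 0; // consume results so the loop is not dead code
        long start = System.nanoTime();
        for (int i = 0; i < measuredIters; i++) sink += unitTestBody();
        long elapsed = System.nanoTime() - start;
        if (sink == 42) System.out.println(sink); // keep 'sink' observably live
        return (double) elapsed / measuredIters;
    }

    public static void main(String[] args) {
        System.out.println("avg ns/op: " + measureAverageNanos(1_000, 10_000));
    }
}
```

In real JMH benchmarks, the harness additionally forks the JVM, runs multiple iterations per fork, and uses a Blackhole to prevent dead-code elimination; the `sink` variable above is a crude stand-in for that mechanism.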
Wed 17 May (displayed time zone: Hobart)
15:45 - 17:15 | Test generation | SEIP - Software Engineering in Practice / DEMO - Demonstrations / Technical Track / NIER - New Ideas and Emerging Results / Journal-First Papers | Meeting Room 102 | Chair(s): Chunyang Chen (Monash University)
15:45 (7m) Talk | SoapOperaTG: A Tool for System Knowledge Graph Based Soap Opera Test Generation | DEMO - Demonstrations | Yanqi Su (Australian National University), Zheming Han, Zhenchang Xing (CSIRO’s Data61; Australian National University), Xiwei (Sherry) Xu (CSIRO’s Data61), Liming Zhu (CSIRO’s Data61), Qinghua Lu (CSIRO’s Data61)
15:52 (7m) Talk | GUI Testing to the Power of Parallel Q-Learning | DEMO - Demonstrations | Marco Mobilio (University of Milano-Bicocca), Diego Clerissi (University of Milano-Bicocca), Giovanni Denaro (University of Milano-Bicocca, Italy), Leonardo Mariani (University of Milano-Bicocca)
16:00 (15m) Talk | BADGE: Prioritizing UI Events with Hierarchical Multi-Armed Bandits for Automated UI Testing | Technical Track | Dezhi Ran (Peking University), Hao Wang (Peking University, China), Wenyu Wang (University of Illinois Urbana-Champaign), Tao Xie (Peking University)
16:15 (15m) Talk | Efficiency Matters: Speeding Up Automated Testing with GUI Rendering Inference | Technical Track | Sidong Feng (Monash University), Mulong Xie (Australian National University), Chunyang Chen (Monash University) | Pre-print
16:30 (15m) Talk | CodaMOSA: Escaping Coverage Plateaus in Test Generation with Pre-trained Large Language Models | Technical Track | Caroline Lemieux (University of British Columbia), Jeevana Priya Inala (Microsoft Research), Shuvendu K. Lahiri (Microsoft Research), Siddhartha Sen (Microsoft Research)
16:45 (15m) Talk | Simulation-Driven Automated End-to-End Test and Oracle Inference | SEIP - Software Engineering in Practice | Shreshth Tuli (Meta Platforms Inc. and Imperial College), Kinga Bojarczuk (Facebook), Natalija Gucevska (Facebook), Mark Harman (University College London), Xiaoyu Wang (Meta Platforms Inc.), Graham Wright (Meta Platforms Inc.)
17:00 (7m) Talk | Reasoning-Based Software Testing | NIER - New Ideas and Emerging Results | Luca Giamattei (Università di Napoli Federico II), Roberto Pietrantuono (Università di Napoli Federico II), Stefano Russo (Università di Napoli Federico II) | Pre-print
17:07 (7m) Talk | Automated Generation and Evaluation of JMH Microbenchmark Suites From Unit Tests | Journal-First Papers | Mostafa Jangali (Concordia University), Yiming Tang (Concordia University), Niclas Alexandersson (Chalmers University of Technology), Philipp Leitner (Chalmers University of Technology, Sweden / University of Gothenburg, Sweden), Jinqiu Yang (Concordia University), Weiyi Shang (University of Waterloo)