Reproducing Performance Bug Reports in Server Applications: The Researchers' Experiences (ASE 2020 - Journal-first Papers)

Who

Xue Han, Daniel Carroll, Tingting Yu

Track

ASE 2020 Journal-first Papers

Time Zone

The program is currently displayed in (UTC) Coordinated Universal Time.

When

Wed 23 Sep 2020 17:30 - 17:50 at Koala - Empirical Software Engineering (1) Chair(s): Jinqiu Yang

Abstract

Software performance is critical to the quality of the software system. Performance bugs can cause significant performance degradation such as long response time and low system throughput that ultimately lead to poor user experiences. Many modern software projects use bug tracking systems that allow developers and users to report issues they have identified in the software. While bug reports are intended to help developers to understand and fix bugs, they are also extensively used by researchers for finding benchmarks to evaluate their testing and debugging approaches. Researchers often rely on the description of a confirmed performance bug report to reproduce the performance bug to be used in their evaluation. Although researchers spend a considerable amount of time and effort in finding usable performance bugs from bug repositories, they often get only a few usable performance bugs. Reproducing performance bugs is a difficult task even for domain experts such as developers. Compared to functional bugs, performance bugs are substantially more complicated to reproduce because they often manifest through large inputs and specific execution conditions. The amount of information disclosed in a bug report may not always be sufficient to reproduce the performance bug for researchers, and thus hinders the usability of bug repository as the resource for finding benchmarks. Our study targets reproducing performance bugs from the perspectives of non-domain experts such as software engineering researchers. One big difference compared to the prior work is that we specifically target confirmed performance bugs to report why software engineering researchers may not succeed in reproducing such bugs rather than understanding and characterizing non-reproducible bugs from the viewpoints of developers.
Therefore, a failed-to-reproduce performance bug in this work is defined as a developer-confirmed reproducible performance bug that cannot be reproduced by researchers due to the lack of domain knowledge or environment limitations. The goal of this study is to share our experience as software engineering researchers in reproducing performance bugs through investigating the impact of different factors identified in confirmed performance bug reports in open-source projects. We studied the characteristics of confirmed performance bugs by reproducing them using only information available from the bug report to examine the challenges of performance bug reproduction. We spent more than 800 hours over the course of six months to study and reproduce 93 confirmed performance bugs, which are randomly sampled from two large-scale open-source server applications. We 1) studied the characteristics of the reproduced performance bug reports; 2) summarized the causes of failed-to-reproduce confirmed performance bug reports; 3) shared our experience on suggesting workarounds to improve the bug reproduction success rate; 4) delivered a virtual machine image that contains a set of 17 ready-to-execute performance bug benchmarks. The findings of our study provide guidance and a set of suggestions to help researchers to understand, evaluate, and successfully reproduce performance bugs. We also provided a set of implications for both researchers and practitioners on developing techniques for testing and diagnosing performance bugs, improving the quality of bug reports, and detecting failed-to-reproduce bug reports.

Link to Publication

https://www.sciencedirect.com/science/article/pii/S0164121219301438

DOI

https://doi.org/10.1016/j.jss.2019.06.100

Xue Han

University of Kentucky

Daniel Carroll

University of Kentucky

Tingting Yu

University of Kentucky

United States

Session Program

Wed 23 Sep
Displayed time zone: (UTC) Coordinated Universal Time

17:10 - 18:10	Empirical Software Engineering (1) Research Papers / Journal-first Papers at Koala Chair(s): Jinqiu Yang Concordia University, Montreal, Canada

17:10 20m Talk		Code to Comment "Translation": Data, Metrics, Baselining & Evaluation Research Papers David Gros University of California, Davis, Hariharan Sezhiyan University of California, Davis, Prem Devanbu University of California, Zhou Yu University of California, Davis
17:30 20m Talk		Reproducing Performance Bug Reports in Server Applications: The Researchers' Experiences Journal-first Papers Xue Han University of Kentucky, Daniel Carroll University of Kentucky, Tingting Yu University of Kentucky
17:50 20m Talk		Exploring the Architectural Impact of Possible Dependencies in Python software Research Papers Wuxia Jin Xi'an Jiaotong University, Yuanfang Cai Drexel University, Rick Kazman University of Hawai‘i at Mānoa, Gang Zhang Emergent Design Inc, Qinghua Zheng Xi'an Jiaotong University, Ting Liu Xi'an Jiaotong University