MSR 2023
Dates to be announced Melbourne, Australia
co-located with ICSE 2023
Mon 15 May 2023 14:32 - 14:44 at Meeting Room 110 - Understanding Defects Chair(s): Matteo Paltenghi

Performance efficiency and scalability are the major design goals for high performance computing (HPC) applications. However, it is challenging to achieve high efficiency and scalability for such applications due to complex underlying hardware architecture, inefficient algorithm implementation, suboptimal code generation by the compilers, inefficient parallelization, and so on. As a result, the HPC community spends significant effort detecting and fixing the performance bugs that frequently appear in scientific applications. It is therefore important to accumulate this experience to guide the scientific software engineering community in writing performance-efficient code.

In this paper, we investigate open-source HPC applications to categorize the performance bugs and their fixes and measure the programmer’s effort and experience to fix them. For this purpose, we first perform a large-scale empirical analysis on 1729 HPC performance commits collected from 23 real-world projects. Through our manual analysis, we identify 186 performance issues from these projects. Furthermore, we study the root cause of these performance issues and generate a performance bug taxonomy for HPC applications. Our analysis identifies that inefficient algorithm implementation (39.3%), inefficient code for target micro-architecture (31.2%), and missing parallelism and inefficient parallelization (14.5%) are the top three most prevalent categories of performance issues for HPC applications. Additionally, to understand how the performance bugs are fixed, we analyze the performance fix commits and categorize them into eight performance fix types. We further measure the time it takes to discover a performance bug and the developer effort and expertise required to fix it. The analysis shows that performance inefficiencies are difficult to localize and that, once localized, the fixes are complicated, with a median patch size of 35 lines of code, and are mostly made by experienced developers.

Mon 15 May

Displayed time zone: Hobart

14:20 - 15:15
Understanding Defects
Registered Reports / Data and Tool Showcase Track / Technical Papers at Meeting Room 110
Chair(s): Matteo Paltenghi University of Stuttgart, Germany
14:20
12m
Talk
What Happens When We Fuzz? Investigating OSS-Fuzz Bug History
Technical Papers
Brandon Keller Rochester Institute of Technology, Benjamin S. Meyers Rochester Institute of Technology, Andrew Meneely Rochester Institute of Technology
14:32
12m
Talk
An Empirical Study of High Performance Computing (HPC) Performance Bugs
Technical Papers
Md Abul Kalam Azad University of Michigan - Dearborn, Nafees Iqbal University of Michigan - Dearborn, Foyzul Hassan University of Michigan - Dearborn, Probir Roy University of Michigan - Dearborn
Pre-print
14:44
6m
Talk
Semantically-enriched Jira Issue Tracking Data
Data and Tool Showcase Track
Themistoklis Diamantopoulos Electrical and Computer Engineering Dept, Aristotle University of Thessaloniki, Dimitrios-Nikitas Nastos Electrical and Computer Engineering Dept., Aristotle University of Thessaloniki, Andreas Symeonidis Electrical and Computer Engineering Dept., Aristotle University of Thessaloniki
Pre-print
14:50
6m
Talk
An exploratory study of bug introducing changes: what happens when bugs are introduced in open source software?
Registered Reports
Lukas Schulte University of Passau, Anamaria Mojica-Hanke University of Passau and Universidad de los Andes, Mario Linares-Vasquez Universidad de los Andes, Steffen Herbold University of Passau
14:56
6m
Talk
HasBugs - Handpicked Haskell Bugs
Data and Tool Showcase Track
Leonhard Applis Delft University of Technology, Annibale Panichella Delft University of Technology
15:02
6m
Talk
An Empirical Study on the Performance of Individual Issue Label Prediction
Technical Papers
Jueun Heo, Seonah Lee Gyeongsang National University