Beyond Duplicates: Towards Understanding and Predicting Link Types in Issue Tracking Systems (MSR 2022 - Technical Papers)

Who

Clara Marie Lüders, Abir Bouraffa, Walid Maalej

Track

MSR 2022 Technical Papers

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 18 May 2022 03:11 - 03:18 at MSR Main room - odd hours - Session 2: Maintenance (Issues & Smells) Chair(s): Alessio Ferrari
Tue 24 May 2022 09:15 - 09:30 at Room 315+316 - Blended Technical Session 3 (Smells and Maintenance) Chair(s): Andy Zaidman

Abstract

Software projects use Issue Tracking Systems (ITS) like JIRA to track issues and organize the workflows around them. Issues are often interconnected via different links, such as the default JIRA link types Duplicate, Relate, Block, and Subtask. While previous research has focused on analyzing and predicting duplication links, this work aims to understand the various other link types, their prevalence, and their characteristics, as a step towards more reliable link type prediction.

For this, we studied 607,208 links connecting 698,790 issues in 15 public JIRA repositories. Besides the default types, the custom types Depend, Incorporate, Split, and Cause were also common. We manually grouped all 75 link types used in the repositories into five general categories: General Relation, Duplication, Composition, Temporal / Causal, and Workflow. Comparing the structures of the corresponding graphs, we observed several trends. For instance, as expected, Duplication links tend to form simpler issue graphs, often with two components, and Composition links show the highest share of hierarchical tree structures (97.7%). Surprisingly, General Relation links have a significantly higher transitivity score than Duplication and Temporal / Causal links.

Motivated by the differences between the types and by their popularity, we evaluated the robustness of two state-of-the-art duplicate detection approaches from the literature on our JIRA dataset. We found that current deep-learning approaches confuse Duplication links with other link types in almost all repositories. On average, the classification accuracy dropped by 6% for one approach and 12% for the other. Extending the training sets with other link types seems to partly solve this issue. We discuss our findings and their implications for research and practice.
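
The structural comparison described in the abstract (grouping link types into categories, then comparing transitivity and tree structures per category) can be illustrated with a minimal Python sketch. This is not the authors' code. The `CATEGORY_OF` mapping is a hypothetical guess based on the type names only, the issue keys are made up, and the tree measure used here (share of connected components with exactly nodes - 1 edges) is an assumption that may differ from the paper's definition of hierarchical tree structures.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical assignment of four link types to categories from the abstract.
# The paper's actual mapping of all 75 types is not reproduced here.
CATEGORY_OF = {
    "Relate": "General Relation",
    "Duplicate": "Duplication",
    "Subtask": "Composition",
    "Cause": "Temporal / Causal",
}


def build_graphs(links):
    """Group undirected issue links into one adjacency map per category."""
    graphs = defaultdict(lambda: defaultdict(set))
    for source, target, link_type in links:
        category = CATEGORY_OF.get(link_type)
        if category is None or source == target:
            continue  # skip unmapped link types and self-links
        graphs[category][source].add(target)
        graphs[category][target].add(source)
    return graphs


def transitivity(adj):
    """Global transitivity: closed triples divided by all connected triples."""
    closed = triples = 0
    for neighbours in adj.values():
        # Every pair of neighbours of a node forms a triple centred on it.
        for a, b in combinations(neighbours, 2):
            triples += 1
            if b in adj[a]:
                closed += 1
    return closed / triples if triples else 0.0


def tree_share(adj):
    """Share of connected components that are trees (edges == nodes - 1)."""
    seen = set()
    trees = components = 0
    for start in adj:
        if start in seen:
            continue
        stack, nodes = [start], {start}
        while stack:  # depth-first search over one component
            for nxt in adj[stack.pop()]:
                if nxt not in nodes:
                    nodes.add(nxt)
                    stack.append(nxt)
        seen |= nodes
        edges = sum(len(adj[n]) for n in nodes) // 2
        components += 1
        trees += edges == len(nodes) - 1
    return trees / components if components else 0.0


# Made-up links: a Relate triangle, one Duplicate pair, one Subtask star.
links = [
    ("A-1", "A-2", "Relate"), ("A-2", "A-3", "Relate"), ("A-1", "A-3", "Relate"),
    ("A-4", "A-5", "Duplicate"),
    ("A-6", "A-7", "Subtask"), ("A-6", "A-8", "Subtask"),
]

for category, adj in build_graphs(links).items():
    print(category, transitivity(adj), tree_share(adj))
```

In this toy input the Relate triangle is fully transitive and not a tree, while the Subtask star is a tree with no closed triples, which mirrors the kind of contrast the abstract reports between categories.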

Link to Preprint

https://arxiv.org/pdf/2204.12893.pdf

DOI

https://doi.org/10.1145/3524842.3528457

Clara Marie Lüders, University of Hamburg, Germany
Abir Bouraffa, University of Hamburg
Walid Maalej, University of Hamburg, Germany

Session Program

Wed 18 May
Displayed time zone: Eastern Time (US & Canada)

03:00 - 03:50	Session 2: Maintenance (Issues & Smells)	Technical Papers / Registered Reports / Data and Tool Showcase Track / Industry Track	at MSR Main room - odd hours	Chair(s): Alessio Ferrari (CNR-ISTI)

03:00 4m Talk	An Alternative Issue Tracking Dataset of Public Jira Repositories	Data and Tool Showcase Track	Lloyd Montgomery (Universität Hamburg), Clara Marie Lüders (University of Hamburg), Walid Maalej (University of Hamburg)
03:04 7m Talk	Smelly Variables in Ansible Infrastructure Code: Detection, Prevalence, and Lifetime	Technical Papers	Ruben Opdebeeck (Vrije Universiteit Brussel), Ahmed Zerouali (Vrije Universiteit Brussel), Coen De Roover (Vrije Universiteit Brussel)
03:11 7m Talk	Beyond Duplicates: Towards Understanding and Predicting Link Types in Issue Tracking Systems	Technical Papers	Clara Marie Lüders (University of Hamburg), Abir Bouraffa (University of Hamburg), Walid Maalej (University of Hamburg)
03:18 7m Talk	Real-World Clone-Detection in Go	Industry Track	Qinyun Wu (Bytedance Ltd.), Huan Song (Bytedance Ltd.), Ping Yang (Bytedance Network Technology)
03:25 4m Talk	Towards Using Gameplay Videos for Detecting Issues in Video Games	Registered Reports	Emanuela Guglielmi (University of Molise), Simone Scalabrino (University of Molise), Gabriele Bavota (Software Institute, USI Università della Svizzera italiana), Rocco Oliveto (University of Molise)
03:29 4m Talk	Is Surprisal in Issue Trackers Actionable?	Registered Reports	James Caddy (University of Adelaide), Markus Wagner (University of Adelaide, Australia), Christoph Treude (University of Melbourne), Earl T. Barr (University College London, UK), Miltiadis Allamanis (Microsoft Research)
03:33 17m Live Q&A	Discussions and Q&A	Technical Papers

Tue 24 May
Displayed time zone: Eastern Time (US & Canada)

09:00 - 10:30	Blended Technical Session 3 (Smells and Maintenance)	Technical Papers / Mining Challenge / Registered Reports / Data and Tool Showcase Track	at Room 315+316	Chair(s): Andy Zaidman (Delft University of Technology)

09:00 15m Talk	Smelly Variables in Ansible Infrastructure Code: Detection, Prevalence, and Lifetime	Technical Papers	Ruben Opdebeeck (Vrije Universiteit Brussel), Ahmed Zerouali (Vrije Universiteit Brussel), Coen De Roover (Vrije Universiteit Brussel)
09:15 15m Talk	Beyond Duplicates: Towards Understanding and Predicting Link Types in Issue Tracking Systems	Technical Papers	Clara Marie Lüders (University of Hamburg), Abir Bouraffa (University of Hamburg), Walid Maalej (University of Hamburg)
09:30 15m Talk	How to Improve Deep Learning for Software Analytics (a case study with code smell detection)	Technical Papers	Rahul Yedida, Tim Menzies (North Carolina State University)
09:45 8m Talk	npm-filter: Automating the mining of dynamic information from npm packages	Data and Tool Showcase Track	Ellen Arteca (Northeastern University), Alexi Turcotte (Northeastern University)
09:53 8m Talk	Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship Between Technical Debt and Refactoring (Best Mining Challenge Paper Award)	Mining Challenge	Anthony Peruma (Rochester Institute of Technology), Eman Abdullah AlOmar (Stevens Institute of Technology), Christian D. Newman (Rochester Institute of Technology), Mohamed Wiem Mkaouer (Rochester Institute of Technology), Ali Ouni (ETS Montreal, University of Quebec)
10:01 8m Talk	CamBench - Cryptographic API Misuse Detection Tool Benchmark Suite	Registered Reports	Michael Schlichtig (Heinz Nixdorf Institute at Paderborn University), Anna-Katharina Wickert (TU Darmstadt, Germany), Stefan Krüger (Independent Researcher), Eric Bodden (University of Paderborn; Fraunhofer IEM), Mira Mezini (TU Darmstadt)
10:09 21m Live Q&A	Discussions and Q&A	Technical Papers
