MSR 2023
Dates to be announced Melbourne, Australia
co-located with ICSE 2023
Tue 16 May 2023 11:30 - 11:36 at Meeting Room 110 - Code Smells Chair(s): Md Tajmilur Rahman

Researchers apply machine-learning techniques for code smell detection to counter the subjectivity of many code smells. Such approaches need a large, manually annotated dataset for training and benchmarking. Existing literature offers a few datasets; however, they are small in size and, more importantly, do not focus on the subjective code snippets. In this paper, we present DACOS, a manually annotated dataset containing 10,267 annotations for 5,192 code snippets. The dataset targets three kinds of code smells at different granularity-multifaceted abstraction, complex method, and long parameter list. The dataset is created in two phases. The first phase helps us identify the code snippets that are potentially subjective by determining the thresholds of metrics used to detect a smell. The second phase collects annotations for potentially subjective snippets. We also offer an extended dataset DACOSX that includes definitely benign and definitely smelly snippets by using the thresholds identified in the first phase. We have developed Tagman, a web application to help annotators view and mark the snippets one-by-one and record the provided annotations. We make the datasets and the web application accessible publicly. This dataset will help researchers working on smell detection techniques to build relevant and context-aware machine-learning models.

DACOS Presentation (DACOS.pdf)5.37MiB

Tue 16 May

Displayed time zone: Hobart change

11:00 - 11:45
11:00
12m
Talk
Don't Forget the Exception! Considering Robustness Changes to Identify Design Problems
Technical Papers
Anderson Oliveira PUC-Rio, João Lucas Correia Federal University of Alagoas, Leonardo Da Silva Sousa Carnegie Mellon University, USA, Wesley Assunção Johannes Kepler University Linz, Austria & Pontifical Catholic University of Rio de Janeiro, Brazil, Daniel Coutinho PUC-Rio, Alessandro Garcia PUC-Rio, Willian Oizumi GoTo, Caio Barbosa UFAL, Anderson Uchôa Federal University of Ceará, Juliana Alves Pereira PUC-Rio
Pre-print
11:12
12m
Talk
Pre-trained Model Based Feature Envy Detection
Technical Papers
mawenhao Wuhan University, Yaoxiang Yu Wuhan University, Xiaoming Ruan Wuhan University, Bo Cai Wuhan University
11:24
6m
Talk
CLEAN++: Code Smells Extraction for C++
Data and Tool Showcase Track
Tom Mashiach Ben Gurion University of the Negev, Israel, Bruno Sotto-Mayor Ben Gurion University of the Negev, Israel, Gal Kaminka Bar Ilan University, Israel, Meir Kalech Ben Gurion University of the Negev, Israel
11:30
6m
Talk
DACOS-A Manually Annotated Dataset of Code Smells
Data and Tool Showcase Track
Himesh Nandani Dalhousie University, Mootez Saad Dalhousie University, Tushar Sharma Dalhousie University
Pre-print File Attached
11:36
6m
Talk
What Warnings Do Engineers Really Fix? The Compiler That Cried Wolf
Industry Track
Gunnar Kudrjavets University of Groningen, Aditya Kumar Snap, Inc., Ayushi Rastogi University of Groningen, The Netherlands
Pre-print