ICSE 2023
Sun 14 - Sat 20 May 2023 Melbourne, Australia
Fri 19 May 2023 16:52 - 17:00 at Level G - Plenary Room 1 - Software quality Chair(s): Valentina Lenarduzzi

When developers rush out code, that code often contains technical debt (TD), i.e., decisions that must later be repaid with further work. Keeping track of and managing Self-Admitted Technical Debts (SATDs) is important for maintaining a healthy software project. The current active-learning SATD recognition tool requires manual inspection of 24% of the test comments on average to reach 90% recall. Among all the test comments, only about 5% are SATDs. Human experts must therefore read almost five times as many comments as there are SATDs (24% inspected versus 5% SATD), which indicates the inefficiency of the tool. Moreover, human experts are still prone to error: 95% of the false-positive labels from previous work were actually true positives.
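
The "almost quintuple" figure follows directly from the two percentages quoted above. A minimal sketch of that arithmetic is shown here; the function name and the 10,000-comment test set are illustrative assumptions, not values from the paper.

```python
def reading_overhead(inspected_fraction: float, satd_fraction: float) -> float:
    """Comments a human reads for every SATD comment present in the test set."""
    return inspected_fraction / satd_fraction


# Figures quoted in the abstract: 24% of comments inspected, about 5% are SATDs.
ratio = reading_overhead(0.24, 0.05)

# Hypothetical test set size, chosen only to make the numbers concrete.
total_comments = 10_000
comments_read = round(total_comments * 0.24)
satd_comments = round(total_comments * 0.05)

print(f"overhead: {ratio:.1f}x ({comments_read} read for {satd_comments} SATDs)")
```

In other words, under these figures a reviewer reads several non-SATD comments for every SATD comment in the data, which is the labeling cost the talk aims to reduce.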

To solve the above problems, we propose DebtFree, a two-mode framework based on unsupervised learning for identifying SATDs. In mode1, when the existing training data is unlabeled, DebtFree starts with an unsupervised learner to automatically pseudo-label the programming comments in the training data. In contrast, in mode2, where labels are available with the corresponding training data, DebtFree starts with a pre-processor that identifies the highly prone SATDs from the test dataset. Then, our machine learning model is employed to assist human experts in manually identifying the remaining SATDs. Our experiments on 10 software projects show that both models yield a statistically significant improvement in effectiveness over the state-of-the-art automated and semi-automated models.
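
The control flow of the two modes described above can be sketched roughly as follows. This is not the DebtFree implementation: a simple keyword heuristic stands in for both the unsupervised learner and the pre-processor, a token-overlap score stands in for the machine learning model, and every name here (`MARKERS`, `pseudo_label`, `prefilter`, `make_scorer`, `review_queue`) is a hypothetical placeholder.

```python
import re

# Hypothetical marker words; the abstract does not say which patterns DebtFree uses.
MARKERS = ("todo", "fixme", "hack", "workaround")
_MARKER_RE = re.compile(r"\b(" + "|".join(MARKERS) + r")\b", re.IGNORECASE)


def tokenize(comment):
    return re.findall(r"[a-z]+", comment.lower())


def looks_like_satd(comment):
    return _MARKER_RE.search(comment) is not None


def pseudo_label(train_comments):
    """mode1 stand-in: attach heuristic labels to unlabeled training comments."""
    return [(c, looks_like_satd(c)) for c in train_comments]


def prefilter(test_comments):
    """mode2 stand-in: split test comments into auto-flagged and remaining ones."""
    flagged, remaining = [], []
    for c in test_comments:
        (flagged if looks_like_satd(c) else remaining).append(c)
    return flagged, remaining


def make_scorer(labeled_comments):
    """Toy model: score a comment by the share of its tokens seen in SATD-labeled comments."""
    satd_vocab = {tok for c, is_satd in labeled_comments if is_satd for tok in tokenize(c)}

    def score(comment):
        tokens = tokenize(comment)
        return sum(t in satd_vocab for t in tokens) / len(tokens) if tokens else 0.0

    return score


def review_queue(remaining, score):
    """Order the remaining comments so a human expert reads likely SATDs first."""
    return sorted(remaining, key=score, reverse=True)


train = ["TODO: remove this hack", "returns the user id", "FIXME handle null"]
test = ["fixme: broken on windows", "this should handle null properly", "opens the file"]

labeled = pseudo_label(train)
flagged, remaining = prefilter(test)
queue = review_queue(remaining, make_scorer(labeled))
```

The point of the sketch is the division of labor: comments that can be flagged automatically never reach the human, and the rest are ordered so that the expert's reading effort is spent on the most likely SATDs first.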

The main reasons to endorse DebtFree include:
- Across our ten datasets, DebtFree can reduce the labeling effort by 99% in mode1 (unlabeled training data) and by up to 63% in mode2 (labeled training data), while improving the current active learner's F1 by almost 100% in relative terms.
- This work is the first to assess the usage of unsupervised learning to reduce the cost of labeling in identifying SATDs. Nearly all the prior unsupervised learning work focuses on defect prediction. Our success here suggests that many more domains in SE could benefit from unsupervised learning.


Fri 19 May

Displayed time zone: Hobart

15:45 - 17:15
15:45
15m
Talk
DuetCS: Code Style Transfer through Generation and Retrieval
Technical Track
Binger Chen Technische Universität Berlin, Ziawasch Abedjan Leibniz Universität Hannover
16:00
15m
Talk
Understanding Why and Predicting When Developers Adhere to Code-Quality Standards
SEIP - Software Engineering in Practice
Manish Motwani Georgia Institute of Technology, Yuriy Brun University of Massachusetts
Pre-print
16:15
15m
Talk
Code Compliance Assessment as a Learning Problem
SEIP - Software Engineering in Practice
16:30
15m
Talk
An Empirical Study on Quality Issues of Deep Learning Platform
SEIP - Software Engineering in Practice
Yanjie Gao Microsoft Research, Xiaoxiang Shi, Haoxiang Lin Microsoft Research, Hongyu Zhang The University of Newcastle, Hao Wu, Rui Li, Mao Yang Microsoft Research
Pre-print
16:45
7m
Talk
Can static analysis tools find more defects? A qualitative study of design rule violations found by code review
Journal-First Papers
Sahar Mehrpour George Mason University, USA, Thomas LaToza George Mason University
16:52
7m
Talk
DebtFree: minimizing labeling cost in self-admitted technical debt identification using semi-supervised learning
Journal-First Papers
Huy Tu North Carolina State University, USA, Tim Menzies North Carolina State University
Link to publication Pre-print
17:00
7m
Talk
FIXME: synchronize with database! An empirical study of data access self-admitted technical debt
Journal-First Papers
Biruk Asmare Muse Polytechnique Montréal, Csaba Nagy Software Institute - USI, Lugano, Anthony Cleve University of Namur, Foutse Khomh Polytechnique Montréal, Giuliano Antoniol Polytechnique Montréal
17:07
7m
Talk
How does quality deviate in stable releases by backporting?
NIER - New Ideas and Emerging Results
Jarin Tasnim University of Saskatchewan, Debasish Chakroborti University of Saskatchewan, Chanchal K. Roy University of Saskatchewan, Kevin Schneider University of Saskatchewan
Link to publication Pre-print