Pre-trained Model Based Feature Envy Detection
Code smells slow down the development of software systems and make them harder to maintain. Existing research aims to develop automatic detection algorithms that reduce the labor and time costs of the detection process. Deep learning techniques have recently been shown to recognize code smells better than metric-based heuristic detection algorithms. As large-scale pre-trained models for Programming Languages (PL), such as CodeT5, have lately achieved top results on a variety of downstream tasks, some researchers have begun to explore the use of pre-trained models to extract the contextual semantics of code for code smell detection. However, little research has employed the contextual semantic relationships between code snippets obtained by pre-trained models to identify code smells. In this paper, we investigate the use of the pre-trained model CodeT5 to extract semantic relationships between code snippets in order to detect feature envy, one of the most common code smells. In addition, to investigate how semantic relationships extracted by pre-trained models of different architectures perform on feature envy detection, we compare CodeT5 with two other pre-trained models, CodeBERT and CodeGPT. We evaluated our approach on ten open-source projects; it improves F-measure by 29.32% over the state of the art on feature envy detection and accuracy by 16.57% on moving destination recommendation. Moreover, detecting feature envy with semantic relations extracted by several different pre-trained models outperforms the state of the art, which indicates that using this semantic relation to detect feature envy is promising. To enable future research on feature envy detection, we have made all the code and datasets used in this article open source.
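The abstract does not describe the implementation, so the following is only an illustrative sketch of the general idea: obtaining contextual embeddings of two code snippets from CodeT5's encoder and comparing them with cosine similarity as a proxy for their semantic relationship (e.g., between a method and a candidate target class). The checkpoint name "Salesforce/codet5-base", the mean-pooling step, and the similarity measure are assumptions for illustration, not the authors' actual pipeline.

# Illustrative sketch only (not the authors' method): embed a method and a
# candidate class with CodeT5's encoder and compare them by cosine similarity.
import torch
from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5-base")  # assumed checkpoint
model = T5EncoderModel.from_pretrained("Salesforce/codet5-base")
model.eval()

def embed(code: str) -> torch.Tensor:
    """Mean-pool the encoder's last hidden states over non-padding tokens."""
    inputs = tokenizer(code, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state      # (1, seq_len, dim)
    mask = inputs["attention_mask"].unsqueeze(-1)       # (1, seq_len, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1) # (1, dim)

# Hypothetical snippets for illustration.
method_snippet = "public double total() { return cart.sum() * tax.rate(); }"
candidate_class_snippet = "class Tax { double rate() { return 0.2; } }"

similarity = torch.nn.functional.cosine_similarity(
    embed(method_snippet), embed(candidate_class_snippet)
).item()
print(f"semantic similarity: {similarity:.3f}")  # higher may suggest the method envies that class

A pre-trained encoder of a different architecture (e.g., CodeBERT) could be swapped in at the embedding step, which is the kind of comparison across model architectures the paper reports.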
Tue 16 May · Displayed time zone: Hobart
11:00 - 11:45 | Code Smells · Technical Papers / Industry Track / Data and Tool Showcase Track · Meeting Room 110 · Chair(s): Md Tajmilur Rahman (Gannon University)
11:00 · 12m · Talk | Don't Forget the Exception! Considering Robustness Changes to Identify Design Problems · Technical Papers · Anderson Oliveira (PUC-Rio), João Lucas Correia (Federal University of Alagoas), Leonardo Da Silva Sousa (Carnegie Mellon University, USA), Wesley Assunção (Johannes Kepler University Linz, Austria & Pontifical Catholic University of Rio de Janeiro, Brazil), Daniel Coutinho (PUC-Rio), Alessandro Garcia (PUC-Rio), Willian Oizumi (GoTo), Caio Barbosa (UFAL), Anderson Uchôa (Federal University of Ceará), Juliana Alves Pereira (PUC-Rio) · Pre-print
11:12 · 12m · Talk | Pre-trained Model Based Feature Envy Detection · Technical Papers · mawenhao (Wuhan University), Yaoxiang Yu (Wuhan University), Xiaoming Ruan (Wuhan University), Bo Cai (Wuhan University)
11:24 · 6m · Talk | CLEAN++: Code Smells Extraction for C++ · Data and Tool Showcase Track · Tom Mashiach (Ben Gurion University of the Negev, Israel), Bruno Sotto-Mayor (Ben Gurion University of the Negev, Israel), Gal Kaminka (Bar Ilan University, Israel), Meir Kalech (Ben Gurion University of the Negev, Israel)
11:30 · 6m · Talk | DACOS-A Manually Annotated Dataset of Code Smells · Data and Tool Showcase Track · Himesh Nandani (Dalhousie University), Mootez Saad (Dalhousie University), Tushar Sharma (Dalhousie University) · Pre-print · File Attached
11:36 · 6m · Talk | What Warnings Do Engineers Really Fix? The Compiler That Cried Wolf · Industry Track · Gunnar Kudrjavets (University of Groningen), Aditya Kumar (Snap, Inc.), Ayushi Rastogi (University of Groningen, The Netherlands) · Pre-print