Fusion of deep convolutional and LSTM recurrent neural networks for automated detection of code smells (EASE 2023 - Short Papers and Posters)

Who

Anh Ho, Anh M. T. Bui, Phuong T. Nguyen, Amleto Di Salle

Track

EASE 2023 Short Papers and Posters

Time Zone

Times are shown in the conference time zone, (GMT+03:00) Athens.

When

Wed 14 Jun 2023 11:20 - 11:30 at Aurora Hall - AI and Software Engineering. Chair(s): Valentina Lenarduzzi

Abstract

Code smells are patterns or structures in software code that may indicate a design or architecture problem, leading to maintainability or other software quality issues. Detecting code smells early in the software development process helps prevent these problems and improves overall software quality. Existing research concentrates on collecting and handling datasets and then exploring the potential of deep learning models to detect smells, while ignoring extensive feature engineering. Though these approaches have obtained promising results, two issues remain to be tackled: (i) extracting both structural and semantic features from the software units; (ii) mitigating the effects of imbalanced data distribution on the performance of learning models. In this paper, we propose DeepSmells, a novel approach to code smell detection. To learn the complex hierarchical representations of the code fragment, we apply a deep convolutional neural network (CNN). Then, to improve the quality of the context encoding and preserve semantic information, a long short-term memory (LSTM) network is placed immediately after the CNN. The final classification is performed by a deep neural network with a weighted loss function to reduce the impact of skewed data distribution. We performed an empirical study on existing code smell benchmark datasets to assess the performance of our proposed approach and compare it with state-of-the-art baselines. The results demonstrate the effectiveness of our proposed method for all kinds of code smells, outperforming the baselines in terms of F1 score and MCC.
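
The abstract mentions a weighted loss function to counter skewed class distributions and reports results in terms of F1 score and MCC. The following is a minimal, dependency-free sketch of those two ideas for a binary smelly / not-smelly setting. The inverse-frequency weighting scheme and the function names (class_weights, weighted_bce, f1_and_mcc) are assumptions chosen for illustration. The abstract does not specify how DeepSmells computes its weights, so this should not be read as the authors' implementation.

```python
import math


def class_weights(labels):
    """Inverse-frequency weights, n / (2 * class_count).

    Assumption for illustration only: the paper's actual weighting
    scheme is not given in the abstract. Requires both classes present.
    """
    n = len(labels)
    pos = sum(labels)
    neg = n - pos
    return {0: n / (2 * neg), 1: n / (2 * pos)}


def weighted_bce(probs, labels, weights, eps=1e-12):
    """Mean binary cross-entropy where each sample is scaled by its class weight."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -weights[y] * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


def f1_and_mcc(preds, labels):
    """F1 score and Matthews correlation coefficient from binary predictions."""
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    tn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 0)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0
    return f1, mcc


# Toy imbalanced data: 2 smelly samples (label 1) out of 10.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
preds = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
weights = class_weights(labels)
f1, mcc = f1_and_mcc(preds, labels)
```

With these weights, an equally confident mistake on a minority-class (smelly) sample contributes more to the loss than the same mistake on a majority-class sample, which is the general intent of weighting under class imbalance. MCC is often reported alongside F1 on imbalanced data because it also accounts for true negatives.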

Authorizer Link

https://dl.acm.org/doi/abs/10.1145/3593434.3593476

DOI

https://doi.org/10.1145/3593434.3593476

File attachments

Slides (EASE2023-DeepSmells.pdf)	1.52MiB

Anh Ho

Hanoi University of Science and Technology

Vietnam

Anh M. T. Bui

Hanoi University of Science and Technology

Vietnam

Phuong T. Nguyen

University of L’Aquila

Italy

Amleto Di Salle

European University of Rome