An empirical evaluation of the effectiveness of various ML, DL, and CodeBERT models to enhance the quality of software with the application of AST and Embedding techniques (ISEC 2025 - Doctoral Symposium)

Track

ISEC 2025 Doctoral Symposium

When

Sat 22 Feb 2025 14:15 - 14:30 at Senate Hall - Doctoral Symposium

Abstract

The competitive market, dynamic business requirements, and rapid growth in technologies pose many challenges to delivering quality software. Applying Data Mining (DM), Machine Learning (ML), and Deep Learning (DL) to software engineering activities such as software fault prediction, software maintainability prediction, fault localization, and code refactoring and cloning improves software quality, expedites development, and enhances developer productivity. On top of the existing static features and classical learning models, the recent release of CodeBERT and state-of-the-art deep learning models offers a promising way to handle a wide range of software engineering tasks efficiently and improve software quality. The proposed research focuses on two software engineering tasks, Software Fault Prediction (SFP) and fault localization, to further enhance the quality of software systems through the application of DM, ML, and DL techniques. The performance of an SFP model is hampered by feature redundancy, correlation, and irrelevance. Applying a prediction model to imbalanced class data or to such source code metrics yields incorrect prediction results. In addition, the performance of an SFP model differs depending on which ML methods and approaches were used to train it. Hence, an extensive study of learning models, with possible changes in the associated techniques and with detailed empirical results, is necessary to find the most effective and highest-performing SFP model. To establish the cost-effectiveness of SFP models, a cost-benefit analysis of the applied ensemble approaches is also required.
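As an illustration of the feature correlation problem mentioned above, the following is a minimal sketch of a correlation filter over source code metrics. It is not the method evaluated in this work: the metric names, the sample values, the 0.9 threshold, and the helper names `pearson` and `drop_correlated` are all assumptions chosen for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def drop_correlated(metrics, threshold=0.9):
    """Greedily keep a metric only if it is not strongly correlated
    with any metric that has already been kept."""
    kept = []
    for name in metrics:
        if all(abs(pearson(metrics[name], metrics[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical per-module metric values (one entry per module).
metrics = {
    "loc": [10, 20, 30, 40, 50],
    "statements": [9, 21, 29, 41, 52],  # nearly redundant with "loc"
    "cyclomatic": [1, 5, 2, 8, 3],
}
selected = drop_correlated(metrics)
print(selected)
```

In this toy data, `statements` tracks `loc` almost exactly, so the filter keeps only one of the two. Which metric of a correlated pair is kept depends on the iteration order, which is a design choice of this sketch.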
By analysing a variety of dynamic execution information (such as bug reports, test results, and failed/passed tests), fault localization enables developers to locate potentially faulty code files and, where possible, to narrow the fault down to code segments or methods. The fault localization task has been studied in the past using a variety of approaches, including those based on information retrieval (IR), spectral analysis, and learning. The challenge for fault localization techniques is that test cases for spectra-based methods are rarely available and, in most cases, IR-based approaches work only at the file or method level and not at the line level. Fault prediction and fault localization have traditionally been studied separately, with only sporadic instances of overlap and joint investigation, even though both aim to support quality assurance activities at different times. Identifying and capitalising on synergies between the two areas could lead to more insightful and actionable outcomes. As a result, a significant amount of experimentation is required to find the most effective and highest-performing fault localization and prediction models.
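To make the spectra-based family of approaches concrete, the sketch below ranks lines by the Ochiai suspiciousness score, one commonly used formula in that family: the number of failing tests that execute a line, divided by the square root of (total failing tests times the number of tests that execute the line). The abstract does not say which formula the work uses, so Ochiai is an assumption here, as are the test names, the coverage data, and the function name `ochiai_ranking`.

```python
from math import sqrt


def ochiai_ranking(coverage, outcomes):
    """Rank lines by Ochiai suspiciousness.

    coverage: test name -> set of line numbers the test executes
    outcomes: test name -> True if the test passed, False if it failed
    """
    total_failed = sum(1 for passed in outcomes.values() if not passed)
    lines = set().union(*coverage.values())
    scores = {}
    for line in lines:
        # ef / ep: failing / passing tests that execute this line
        ef = sum(1 for t, cov in coverage.items() if line in cov and not outcomes[t])
        ep = sum(1 for t, cov in coverage.items() if line in cov and outcomes[t])
        denom = sqrt(total_failed * (ef + ep))
        scores[line] = ef / denom if denom else 0.0
    # Most suspicious first; ties broken by line number.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical coverage spectrum: three tests over three lines.
coverage = {"t1": {1, 2, 3}, "t2": {1, 3}, "t3": {1, 2}}
outcomes = {"t1": False, "t2": True, "t3": False}
ranking = ochiai_ranking(coverage, outcomes)
for line, score in ranking:
    print(line, round(score, 3))
```

Line 2 is executed only by failing tests, so it ranks first. The example also shows the limitation noted above: without passing and failing test cases there is no spectrum to compute scores from.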

Bio

Dr. Lov Kumar is currently working as an Assistant Professor in the Department of Computer Engineering, NIT Kurukshetra. He received his Ph.D. in Computer Science and Engineering from NIT Rourkela, under the supervision of Prof. S. K. Rath. His thesis is titled “Predicting Software Quality Parameters using Artificial Intelligence Techniques and Source Code Metrics”. His current research interests are in the areas of Mining Software Repositories, Software Analytics, and Social Media Analytics. He was a faculty member at Thapar University from Aug 2017 to Dec 2017 and at BITS Pilani from Jan 2018 to Jan 2023. He has delivered over 60 invited talks and has over 100 refereed publications in international conferences and journals and four published book chapters to his credit. He has won several awards, including the Young Scientist Award, Best Researcher Award, and Best Paper Award. He has a broad range of interests and hobbies: he loves to play cricket, read books, play chess, and solve Sudoku puzzles.

Session Program

Sat 22 Feb
Displayed time zone: (GMT+05:30) Chennai, Kolkata, Mumbai, New Delhi

14:15 - 15:30	Doctoral Symposium at Senate Hall

14:15	15m	Doctoral symposium paper	An empirical evaluation of the effectiveness of various ML, DL, and CodeBERT models to enhance the quality of software with the application of AST and Embedding techniques	Lov Kumar, National Institute of Technology Kurukshetra
14:30	15m	Doctoral symposium paper	Enhanced Software Fault Prediction by Analyzing Code Comments using CodeBERT	Lov Kumar, National Institute of Technology Kurukshetra
15:00	15m	Doctoral symposium paper	Leveraging Ensemble Learning for Software Engineering Jobs	Swati Garg
15:15	15m	Doctoral symposium paper	Privacy or Performance? Towards Secure and Scalable Medical Image Storage and Retrieval Management in the Cloud	Arun Amaithi Rajan, College of Engineering Guindy, Anna University, Chennai

Lov Kumar

National Institute of Technology Kurukshetra