An empirical evaluation of the effectiveness of various ML, DL, and CodeBERT models to enhance the quality of software with the application of AST and Embedding techniques
The competitive market, dynamic business requirements, and rapid growth in technologies pose many challenges to delivering quality software. Applying Data Mining (DM), Machine Learning (ML), and Deep Learning (DL) to software engineering activities such as software fault prediction, software maintainability prediction, fault localization, code refactoring, and code cloning improves the quality of software, expedites its development, and enhances developer productivity. On top of the existing static features and classical learning models, the recent release of CodeBERT and state-of-the-art deep learning models offers a promising way to handle a wide range of software engineering operations efficiently and improve software quality. The proposed research focuses on two software engineering tasks, Software Fault Prediction (SFP) and fault localization, to further enhance the quality of software systems through the application of DM, ML, and DL techniques. The performance of an SFP model is hampered by feature redundancy, correlation, and irrelevance. Applying a prediction model to data with imbalanced classes or to such redundant source code metrics yields incorrect prediction results. In addition, the performance of an SFP model differs depending on which ML methods and approaches were used to train it. Hence, an extensive study of learning models, with possible changes in the associated techniques and with detailed empirical results, is necessary to find the most effective and highest-performing SFP model. To establish the cost-effectiveness of SFP models, a cost-benefit analysis of the applied ensemble approaches is also required.
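As a rough illustration of the redundancy and imbalance issues mentioned above, the following minimal sketch filters out highly correlated source code metrics and reports the class imbalance ratio. The metric names, the sample values, the 0.9 threshold, and the helper names (`pearson`, `drop_correlated`, `imbalance_ratio`) are all assumptions made for illustration; they are not taken from the proposed study.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: correlation is undefined, treated as 0 here
    return cov / sqrt(vx * vy)

def drop_correlated(metrics, threshold=0.9):
    """Keep a metric only if it is not strongly correlated with one already kept."""
    kept = []
    for name, values in metrics.items():
        if all(abs(pearson(values, metrics[k])) < threshold for k in kept):
            kept.append(name)
    return kept

def imbalance_ratio(labels):
    """Ratio of majority to minority class size (1 = faulty, 0 = clean)."""
    faulty = sum(1 for y in labels if y == 1)
    clean = len(labels) - faulty
    return max(faulty, clean) / max(1, min(faulty, clean))

# Hypothetical per-module metrics and fault labels (illustrative values only).
metrics = {
    "loc": [10, 20, 30, 40, 50],
    "statements": [9, 19, 31, 41, 49],   # nearly duplicates "loc"
    "cyclomatic": [5, 1, 4, 2, 3],
}
labels = [0, 0, 0, 1, 0]

selected = drop_correlated(metrics)
ratio = imbalance_ratio(labels)
print(selected, ratio)
```

In this toy data, `statements` is dropped because it nearly duplicates `loc`, and the labels show four clean modules for every faulty one. A real study would likely combine such filtering with dedicated feature selection and class-balancing techniques.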
By analysing a variety of dynamic execution information (such as bug reports, test results, and failed/passed tests), fault localization enables developers to locate potentially faulty code files and, where possible, narrow them down to specific code segments or methods. Fault localization has been studied using a variety of approaches, including information retrieval (IR)-based, spectra-based, and learning-based techniques. The main challenges are that the test cases needed by spectra-based methods are rarely available and that, in most cases, IR-based localization works only at the file or method level, not at the line level. Fault prediction and fault localization have traditionally been studied separately, with only sporadic instances of overlap and joint investigation, even though both aim to support quality assurance activities at different times. Identifying and capitalising on synergies between the two areas could lead to more insightful and actionable outcomes. As a result, a significant amount of experimentation is required to find the most effective and highest-performing fault localization and prediction models.
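To make the spectra-based idea concrete, the sketch below ranks source lines by suspiciousness from per-test coverage and pass/fail outcomes. It uses the Ochiai formula, which is one common choice and is an assumption here, since the abstract does not name a specific formula. The test names, coverage sets, and the function name `ochiai_scores` are hypothetical.

```python
from math import sqrt

def ochiai_scores(coverage, outcomes):
    """Ochiai suspiciousness per line.

    coverage: test name -> set of executed line numbers
    outcomes: test name -> True if the test passed, False if it failed
    """
    total_failed = sum(1 for passed in outcomes.values() if not passed)
    lines = set().union(*coverage.values())
    scores = {}
    for line in lines:
        # ef / ep: failed / passed tests that executed this line
        ef = sum(1 for t, cov in coverage.items() if line in cov and not outcomes[t])
        ep = sum(1 for t, cov in coverage.items() if line in cov and outcomes[t])
        denom = sqrt(total_failed * (ef + ep))
        scores[line] = ef / denom if denom else 0.0
    return scores

# Hypothetical spectrum: three tests, one of which fails.
coverage = {
    "t1": {1, 2, 3},
    "t2": {1},
    "t3": {1, 2},
}
outcomes = {"t1": False, "t2": True, "t3": True}

scores = ochiai_scores(coverage, outcomes)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Line 3 is executed only by the failing test, so it ranks first. The example also shows why the approach depends on test availability: without passing and failing runs there is no spectrum to compute scores from.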
Dr. Lov Kumar is currently an Assistant Professor in the Department of Computer Engineering, NIT Kurukshetra. He received his Ph.D. in Computer Science and Engineering from NIT Rourkela under the supervision of Prof. S. K. Rath. His current research interests are in the areas of Mining Software Repositories, Software Analytics, and Social Media Analytics. His thesis is titled “Predicting Software Quality Parameters using Artificial Intelligence Techniques and Source Code Metrics”. He was a faculty member at Thapar University from Aug 2017 to Dec 2017 and at BITS Pilani from Jan 2018 to Jan 2023. He has delivered over 60 invited talks and has over 100 refereed publications in international conferences and journals, as well as four published book chapters, to his credit. He has won several awards, including the Young Scientist Award, Best Researcher Award, and Best Paper Award. He has a broad range of interests and hobbies: he loves to play cricket, read books, play chess, and solve Sudoku puzzles.
Sat 22 Feb (displayed time zone: Chennai, Kolkata, Mumbai, New Delhi)
14:15 - 15:30
14:15 | 15m | Doctoral symposium paper | An empirical evaluation of the effectiveness of various ML, DL, and CodeBERT models to enhance the quality of software with the application of AST and Embedding techniques | Doctoral Symposium | Lov Kumar, National Institute of Technology Kurukshetra
14:30 | 15m | Doctoral symposium paper | Enhanced Software Fault Prediction by Analyzing Code Comments using CodeBERT | Doctoral Symposium | Lov Kumar, National Institute of Technology Kurukshetra
15:00 | 15m | Doctoral symposium paper | Leveraging Ensemble Learning for Software Engineering Jobs | Doctoral Symposium |
15:15 | 15m | Doctoral symposium paper | Privacy or Performance? Towards Secure and Scalable Medical Image Storage and Retrieval Management in the Cloud | Doctoral Symposium | Arun Amaithi Rajan, College of Engineering Guindy, Anna University, Chennai