ICSME 2024
Sun 6 - Fri 11 October 2024
Wed 9 Oct 2024 16:10 - 16:25 at Fremont - Session 6: Maintenance of AI-based Systems Chair(s): Sujoy Roychowdhury

As generative AI is expected to increase global code volumes, the importance of maintainability from a human perspective will become even greater. Various methods have been developed to identify the most important maintainability issues, including aggregated metrics and advanced Machine Learning (ML) models. This study benchmarks several maintainability prediction approaches, including State-of-the-Art (SotA) ML, SonarQube’s Maintainability Rating, CodeScene’s Code Health, and Microsoft’s Maintainability Index. Our results indicate that CodeScene matches the accuracy of SotA ML and outperforms the average human expert. Importantly, unlike SotA ML, CodeScene also provides end users with actionable code smell details to remedy identified issues. Finally, caution is advised with SonarQube due to its tendency to generate many false positives. Unfortunately, our findings call into question the validity of previous studies that solely relied on SonarQube output for establishing ground truth labels. To improve reliability in future maintainability and technical debt studies, we recommend employing more accurate metrics. Moreover, reevaluating previous findings with Code Health would mitigate this revealed validity threat.

Wed 9 Oct

Displayed time zone: Arizona

15:30 - 17:00
Session 6: Maintenance of AI-based Systems
Research Track / Industry Track / New Ideas and Emerging Results Track at Fremont
Chair(s): Sujoy Roychowdhury (Ericsson R&D)
15:30
15m
A Taxonomy of Self-Admitted Technical Debt in Deep Learning Systems
Research Track Paper
Federica Pepe, Fiorella Zampetti (University of Sannio, Italy), Antonio Mastropaolo (William and Mary, USA), Gabriele Bavota (Software Institute @ Università della Svizzera Italiana), Massimiliano Di Penta (University of Sannio, Italy)
Pre-print
15:45
10m
Property-based Testing within ML Projects: an Empirical Study
New Ideas and Emerging Results (NIER) Track Paper
Cindy Wauters (Vrije Universiteit Brussel), Coen De Roover (Vrije Universiteit Brussel)
Pre-print
15:55
15m
Toward Debugging Deep Reinforcement Learning Programs with RLExplorer
Research Track Paper
Rached Bouchoucha (Polytechnique Montréal), Ahmed Haj Yahmed (École Polytechnique de Montréal), Darshan Patil, Janarthanan Rajendran, Amin Nikanjam (École Polytechnique de Montréal), Sarath Chandar (Polytechnique Montréal), Foutse Khomh (Polytechnique Montréal)
16:10
15m
Ghost Echoes Revealed: Benchmarking Maintainability Metrics and Machine Learning Predictions Against Human Assessments
Industry Track Paper
Markus Borg (CodeScene), Marwa Ezzouhri (University of Clermont Auvergne), Adam Tornhill (Codescene AB)
Pre-print
16:25
10m
RetypeR: Integrated Retrieval-based Automatic Program Repair for Python Type Errors
Video presentation
Research Track Paper
Sichong Hao (Faculty of Computing, Harbin Institute of Technology), Xianjun Shi, Hongwei Liu (Faculty of Computing, Harbin Institute of Technology)
16:35
10m
OPass: Orchestrating TVM's Passes for Lowering Memory Footprints of Computation Graphs
Video presentation
Research Track Paper
Pengbo Nie (Shanghai Jiao Tong University), Zihan Wang (Shanghai Jiao Tong University), Chengcheng Wan (East China Normal University), Ziyi Lin (Alibaba Group), He Jiang (Dalian University of Technology), Jianjun Zhao (Kyushu University), Yuting Chen (Shanghai Jiao Tong University)