MSR 2023
Melbourne, Australia
co-located with ICSE 2023
Tue 16 May 2023 12:20 - 12:32 at Meeting Room 109 - Development Tools & Practices II Chair(s): Banani Roy

Just-in-time (JIT) defect prediction classifies code changes as defect-inducing or clean, and many approaches have been proposed that rely on programming language-independent change-level features. However, programming languages differ in their characteristics, and these differences may affect the quality of software projects. Meanwhile, C, one of the most popular programming languages, is widely used in IT companies to develop foundational applications (e.g., operating systems, databases, and compilers), yet the effect of its change-level characteristics on project quality has not been fully investigated. Additionally, whether open-source C projects and commercial C projects share similar important features has received little study.

To address these limitations, we investigate the impact of programming language-specific features on the state-of-the-art JIT defect identification approach in an industrial setting. We collect and label the 10 most-starred C projects on GitHub (329,021 commits) and 8 C projects in an ICT company (12,983 commits). We also propose nine C-specific change-level features and study both the open-source projects and the commercial projects from three aspects: (1) the effectiveness of C-specific change-level features in improving the identification of defect-inducing changes, (2) how the importance of features in identifying defect-inducing changes compares between open-source C projects and commercial C projects, and (3) the effectiveness of combining language-independent features and C-specific features in a real-life setting at the ICT company.

Tue 16 May

Displayed time zone: Hobart

11:50 - 12:35
Development Tools & Practices II
Data and Tool Showcase Track / Industry Track / Technical Papers / Registered Reports at Meeting Room 109
Chair(s): Banani Roy University of Saskatchewan
11:50
12m
Talk
Automating Arduino Programming: From Hardware Setups to Sample Source Code Generation
Technical Papers
Imam Nur Bani Yusuf Singapore Management University, Singapore, Diyanah Binte Abdul Jamal Singapore Management University, Lingxiao Jiang Singapore Management University
12:02
6m
Talk
A Dataset of Bot and Human Activities in GitHub
Data and Tool Showcase Track
Natarajan Chidambaram University of Mons, Alexandre Decan University of Mons; F.R.S.-FNRS, Tom Mens University of Mons
12:08
6m
Talk
Mining the Characteristics of Jupyter Notebooks in Data Science Projects
Registered Reports
Morakot Choetkiertikul Mahidol University, Thailand, Apirak Hoonlor Mahidol University, Chaiyong Ragkhitwetsagul Mahidol University, Thailand, Siripen Pongpaichet Mahidol University, Thanwadee Sunetnanta Mahidol University, Tasha Settewong Mahidol University, Raula Gaikovina Kula Nara Institute of Science and Technology
12:14
6m
Talk
Optimizing Duplicate Size Thresholds in IDEs
Industry Track
Konstantin Grotov JetBrains Research, Constructor University, Sergey Titov JetBrains Research, Alexandr Suhinin JetBrains, Yaroslav Golubev JetBrains Research, Timofey Bryksin JetBrains Research
12:20
12m
Talk
Boosting Just-in-Time Defect Prediction with Specific Features of C Programming Languages in Code Changes
Technical Papers
Chao Ni Zhejiang University, Xiaodan Xu College of Computer Science and Technology, Zhejiang University, Kaiwen Yang Zhejiang University, David Lo Singapore Management University