ICSE 2022 (series) / MSR 2022 (series) / Data and Tool Showcase Track /
ApacheJIT: A Large Dataset for Just-In-Time Defect Prediction
Wed 18 May 2022 13:28 - 13:32 at MSR Main room - odd hours - Session 4: Software Quality (Bugs & Smells) Chair(s): Maxime Lamothe, Mahmoud Alfadel
In this paper, we present ApacheJIT, a large dataset for Just-In-Time defect prediction. ApacheJIT consists of clean and bug-inducing software changes in popular Apache projects. ApacheJIT has a total of 106,674 commits (28,239 bug-inducing and 78,435 clean commits). Having a large number of commits makes ApacheJIT a suitable dataset for machine learning models, especially deep learning models that require large training sets to effectively generalize the patterns present in the historical data to future data.
Wed 18 MayDisplayed time zone: Eastern Time (US & Canada) change
Wed 18 May
Displayed time zone: Eastern Time (US & Canada) change
13:00 - 13:50 | Session 4: Software Quality (Bugs & Smells)Data and Tool Showcase Track / Technical Papers at MSR Main room - odd hours Chair(s): Maxime Lamothe Polytechnique Montreal, Montreal, Canada, Mahmoud Alfadel University of Waterloo | ||
13:00 7mTalk | Dazzle: Using Optimized Generative Adversarial Networks to Address Security Data Class Imbalance Issue Technical Papers Rui Shu North Carolina State University, Tianpei Xia North Carolina State University, Laurie Williams North Carolina State University, Tim Menzies North Carolina State University | ||
13:07 7mTalk | To What Extent do Deep Learning-based Code Recommenders Generate Predictions by Cloning Code from the Training Set? Technical Papers Matteo Ciniselli Università della Svizzera Italiana, Luca Pascarella Università della Svizzera italiana (USI), Gabriele Bavota Software Institute, USI Università della Svizzera italiana Pre-print | ||
13:14 7mTalk | How to Improve Deep Learning for Software Analytics (a case study with code smell detection) Technical Papers Pre-print | ||
13:21 7mTalk | Using Active Learning to Find High-Fidelity Builds Technical Papers Harshitha Menon Lawrence Livermore National Lab, Konstantinos Parasyris Lawrence Livermore National Laboratory, Todd Gamblin Lawrence Livermore National Laboratory, Tom Scogland Lawrence Livermore National Laboratory Pre-print | ||
13:28 4mTalk | ApacheJIT: A Large Dataset for Just-In-Time Defect Prediction Data and Tool Showcase Track Hossein Keshavarz David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada, Mei Nagappan University of Waterloo Pre-print | ||
13:32 4mTalk | ReCover: a Curated Dataset for Regression Testing Research Data and Tool Showcase Track Francesco Altiero Università degli Studi di Napoli Federico II, Anna Corazza Università degli Studi di Napoli Federico II, Sergio Di Martino Università degli Studi di Napoli Federico II, Adriano Peron Università degli Studi di Napoli Federico II, Luigi Libero Lucio Starace Università degli Studi di Napoli Federico II | ||
13:36 14mLive Q&A | Discussions and Q&A Technical Papers |
Information for Participants
Wed 18 May 2022 13:00 - 13:50 at MSR Main room - odd hours - Session 4: Software Quality (Bugs & Smells) Chair(s): Maxime Lamothe, Mahmoud Alfadel
Info for room MSR Main room - odd hours: