Predicting the Objective and Priority of Issue Reports in Software Repositories
Wed 11 May 2022 04:00 - 04:05 at ICSE room 3-even hours - Recommender Systems 1 Chair(s): Alessio Ferrari
Fri 27 May 2022 09:00 - 09:05 at Room 306+307 - Papers 18: Recommender Systems, tools and environments Chair(s): Christian Bird
Software repositories such as GitHub host a large number of software entities. Developers collaboratively discuss, implement, use, and share these entities. Proper documentation plays an important role in successful software management and maintenance. Users exploit Issue Tracking Systems, a facility of software repositories, to keep track of issue reports, to manage the workload and processes, and finally, to document the highlights of their team's efforts. An issue report is a rich source of collaboratively curated software knowledge, and can contain a reported problem, a request for new features, or merely a question about the software product. As the number of these issues increases, it becomes harder to manage them manually. GitHub provides labels for tagging issues as a means of issue management. However, about half of the issues in GitHub's top 1000 repositories do not have any labels. In this work, we aim to automate the process of managing issue reports for software teams. We propose a two-stage approach to predict both the objective behind opening an issue and its priority level using feature engineering methods and state-of-the-art text classifiers. To the best of our knowledge, we are the first to fine-tune a Transformer for issue classification. We train and evaluate our models in both project-based and cross-project settings. The latter approach provides a generic prediction model applicable to any unseen software project or projects with little historical data. Our proposed approach can successfully predict the objective and priority level of issue reports with 82% (fine-tuned RoBERTa) and 75% (Random Forest) accuracy, respectively. Moreover, we conduct human labeling and evaluation on unlabeled issues from six unseen GitHub projects to assess the performance of the cross-project model on new data. The model achieves 90% accuracy on the sample set. We measure inter-rater reliability and obtain an average Percent Agreement of 85.3% and Randolph's free-marginal Kappa of 0.71, which translate to substantial agreement among labelers.
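The authors' replication package is not reproduced on this page; as a rough illustration of the headline technique, the following is a minimal sketch, assuming the Hugging Face transformers and PyTorch libraries, of fine-tuning RoBERTa to classify an issue's objective (bug, enhancement, or question). The label set, toy training example, and hyperparameters are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch (not the authors' code): fine-tune RoBERTa to predict the
# objective of a GitHub issue: bug report, enhancement request, or question.
import torch
from torch.utils.data import Dataset
from transformers import (RobertaTokenizerFast, RobertaForSequenceClassification,
                          Trainer, TrainingArguments)

LABELS = ["bug", "enhancement", "question"]  # issue objectives named in the abstract

class IssueDataset(Dataset):
    """Wraps issue texts (title + body) and integer labels for the Trainer."""
    def __init__(self, texts, labels, tokenizer, max_len=256):
        self.enc = tokenizer(texts, truncation=True, padding="max_length",
                             max_length=max_len)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained("roberta-base",
                                                         num_labels=len(LABELS))

# Hypothetical training data; in practice these come from labeled GitHub issues.
train_texts = ["App crashes on startup when the config file is missing"]
train_labels = [LABELS.index("bug")]
train_ds = IssueDataset(train_texts, train_labels, tokenizer)

args = TrainingArguments(output_dir="issue-objective-roberta",
                         num_train_epochs=3,
                         per_device_train_batch_size=16,
                         learning_rate=2e-5)
Trainer(model=model, args=args, train_dataset=train_ds).train()
```

The inter-rater reliability figures quoted in the abstract (85.3% Percent Agreement, Randolph's free-marginal Kappa of 0.71) follow the standard definitions; below is a small sketch of how they can be computed when multiple raters assign each issue to one of k categories. The rating matrix in the example is made up.

```python
# Sketch of Percent Agreement and Randolph's free-marginal kappa.
# ratings[i][j] = number of raters who assigned issue i to category j.
def percent_agreement(ratings):
    n = sum(ratings[0])  # raters per issue (assumed constant across issues)
    per_issue = [sum(c * (c - 1) for c in row) / (n * (n - 1)) for row in ratings]
    return sum(per_issue) / len(per_issue)

def randolph_kappa(ratings):
    k = len(ratings[0])               # number of categories
    p_o = percent_agreement(ratings)  # observed agreement
    p_e = 1.0 / k                     # chance agreement with free marginals
    return (p_o - p_e) / (1.0 - p_e)

# Made-up example: 3 raters, 2 categories; two unanimous issues, one 2-vs-1 split.
print(percent_agreement([[3, 0], [0, 3], [2, 1]]))  # ~0.78
print(randolph_kappa([[3, 0], [0, 3], [2, 1]]))     # ~0.56
```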
Wed 11 May (displayed time zone: Eastern Time, US & Canada)
04:00 - 05:00 | Recommender Systems 1 (SEIP - Software Engineering in Practice / Technical Track / Journal-First Papers) at ICSE room 3 - even hours | Chair(s): Alessio Ferrari (CNR-ISTI)

04:00 | 5m Talk | Predicting the Objective and Priority of Issue Reports in Software Repositories | Journal-First Papers | Maliheh Izadi (Sharif University of Technology), Kiana Akbari (Sharif University of Technology), Abbas Heydarnoori (Sharif University of Technology)
04:05 | 5m Talk | Code Reviewer Recommendation in Tencent: Practice, Challenge, and Direction | SEIP - Software Engineering in Practice | Qiuyuan Chen (Zhejiang University), Dezhen Kong (Zhejiang University), Lingfeng Bao (Zhejiang University), Chenxing Sun (Tencent), Xin Xia (Huawei Software Engineering Application Technology Lab), Shanping Li (Zhejiang University)
04:10 | 5m Talk | Using Deep Learning to Generate Complete Log Statements | Technical Track | Antonio Mastropaolo (Università della Svizzera italiana), Luca Pascarella (Università della Svizzera italiana), Gabriele Bavota (Software Institute, USI Università della Svizzera italiana)
04:15 | 5m Talk | Modeling Review History for Reviewer Recommendation: A Hypergraph Approach | Technical Track | Guoping Rong (Nanjing University), YiFan Zhang (Nanjing University), Lanxin Yang (Nanjing University), Fuli Zhang (Nanjing University), Hongyu Kuang (Nanjing University), He Zhang (Nanjing University)
04:20 | 5m Talk | ShellFusion: Answer Generation for Shell Programming Tasks via Knowledge Fusion | Technical Track | Neng Zhang (School of Software Engineering, Sun Yat-sen University), Chao Liu (Chongqing University), Xin Xia (Huawei Software Engineering Application Technology Lab), Christoph Treude (University of Melbourne), Ying Zou (Queen's University, Kingston, Ontario), David Lo (Singapore Management University), Zibin Zheng (School of Data and Computer Science, Sun Yat-sen University)
04:25 | 5m Talk | CLEAR: Contrastive Learning for API Recommendation | Technical Track | Moshi Wei (York University), Nima Shiri Harzevili (York University), Yuchao Huang (Institute of Software, Chinese Academy of Sciences), Junjie Wang (Institute of Software, Chinese Academy of Sciences), Song Wang (York University)
Fri 27 May (displayed time zone: Eastern Time, US & Canada)
09:00 - 10:30 | Papers 18: Recommender Systems, tools and environments (Technical Track / Journal-First Papers / NIER - New Ideas and Emerging Results / SEIP - Software Engineering in Practice) at Room 306+307 | Chair(s): Christian Bird (Microsoft Research)

09:00 | 5m Talk | Predicting the Objective and Priority of Issue Reports in Software Repositories | Journal-First Papers | Maliheh Izadi (Sharif University of Technology), Kiana Akbari (Sharif University of Technology), Abbas Heydarnoori (Sharif University of Technology)
09:05 | 5m Talk | Using Deep Learning to Generate Complete Log Statements | Technical Track | Antonio Mastropaolo (Università della Svizzera italiana), Luca Pascarella (Università della Svizzera italiana), Gabriele Bavota (Software Institute, USI Università della Svizzera italiana)
09:10 | 5m Talk | Better Modeling the Programming World with Code Concept Graphs-augmented Multi-modal Learning | NIER - New Ideas and Emerging Results | Martin Weyssow (DIRO, Université de Montréal), Houari Sahraoui (Université de Montréal), Bang Liu (DIRO & Mila, Université de Montréal)
09:15 | 5m Talk | "Project smells" — Experiences in Analysing the Software Quality of ML Projects with mllint | SEIP - Software Engineering in Practice | Bart van Oort (Delft University of Technology), Luís Cruz (Delft University of Technology), Babak Loni (ING Bank N.V.), Arie van Deursen (Delft University of Technology, Netherlands)
09:20 | 5m Talk | Discovering Repetitive Code Changes in Python ML Systems | Technical Track | Malinda Dilhara (University of Colorado Boulder, USA), Ameya Ketkar (Oregon State University, USA), Nikhith Sannidhi (University of Colorado Boulder), Danny Dig (University of Colorado Boulder, USA)
09:25 | 5m Talk | FlakiMe: Laboratory-Controlled Test Flakiness Impact Assessment | Technical Track | Maxime Cordy (University of Luxembourg, Luxembourg), Renaud Rwemalika (University of Luxembourg), Adriano Franci (University of Luxembourg), Mike Papadakis (University of Luxembourg, Luxembourg), Mark Harman (University College London)
09:30 | 5m Talk | Semantic Image Fuzzing of AI Perception Systems | Technical Track | Trey Woodlief (University of Virginia), Sebastian Elbaum (University of Virginia), Kevin Sullivan (University of Virginia)
09:35 | 5m Talk | Understanding and improving artifact sharing in software engineering research | Journal-First Papers | Christopher Steven Timperley (Carnegie Mellon University), Lauren Herckis (Carnegie Mellon University), Claire Le Goues (Carnegie Mellon University), Michael Hilton (Carnegie Mellon University, USA)
09:40 | 5m Talk | ARCLIN: Automated API Mention Resolution for Unformatted Texts | Technical Track | Yintong Huo (The Chinese University of Hong Kong), Yuxin Su (Sun Yat-sen University), Hongming Zhang (The Hong Kong University of Science and Technology), Michael Lyu (The Chinese University of Hong Kong)