ASE 2023
Mon 11 - Fri 15 September 2023 Kirchberg, Luxembourg
Wed 13 Sep 2023 14:08 - 14:21 at Room D - Open Source and Software Ecosystems 2 Chair(s): Paul Grünbacher

Software developed on public platforms is a source of data that can be used to make predictions about those projects. While individual development activity may be random and hard to predict, development behavior at the project level can be predicted with good accuracy when large groups of developers work together on software projects.

To demonstrate this, we use 64,181 months of data from 1,159 GitHub projects to make various predictions about the recent status of those projects (as of April 2020). We find that traditional estimation algorithms make many mistakes: k-nearest neighbors (KNN), support vector regression (SVR), random forest (RFT), linear regression (LNR), and regression trees (CART) all have high error rates. That error rate, however, can be greatly reduced using hyperparameter optimization.

To the best of our knowledge, this is the largest study yet conducted that uses recent data to predict multiple health indicators of open-source projects.
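As a rough illustration of what hyperparameter optimization means for one of the learners named in the abstract, the sketch below tunes the neighbor count k of a small k-nearest-neighbors regressor. It is not the authors' method: the abstract does not say which optimizer they use, so a plain exhaustive search stands in for it here. The data is synthetic, and all names (`knn_predict`, `tune_k`, the candidate list, the default of k=5) are assumptions made for the example.

```python
import random


def knn_predict(train, x, k):
    """Predict y for x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mean_abs_error(train, valid, k):
    """Mean absolute error of the k-NN regressor on the validation points."""
    return sum(abs(knn_predict(train, x, k) - y) for x, y in valid) / len(valid)


def tune_k(train, valid, candidates):
    """Exhaustive search: return the candidate k with the lowest validation error."""
    return min(candidates, key=lambda k: mean_abs_error(train, valid, k))


# Synthetic data (hypothetical): a noisy linear trend standing in for a
# monthly project indicator. It is NOT the GitHub data from the study.
rng = random.Random(0)
points = [(month, 2.0 * month + rng.gauss(0, 5)) for month in range(60)]
rng.shuffle(points)
train, valid = points[:40], points[40:]

default_k = 5                      # assumed "off-the-shelf" setting
candidates = [1, 3, 5, 7, 9, 15]   # assumed search space
best_k = tune_k(train, valid, candidates)

default_error = mean_abs_error(train, valid, default_k)
tuned_error = mean_abs_error(train, valid, best_k)
print(f"default k={default_k}: MAE={default_error:.2f}")
print(f"tuned   k={best_k}: MAE={tuned_error:.2f}")
```

Because the default value is among the candidates, the tuned error on the validation split can never be worse than the default error. A fair comparison would report both on a separate held-out test split, which this sketch omits for brevity.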

Wed 13 Sep

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

13:30 - 15:00
Open Source and Software Ecosystems 2
Research Papers / Journal-first Papers / Industry Showcase (Papers) at Room D
Chair(s): Paul Grünbacher Johannes Kepler University Linz, Austria
13:30
12m
Talk
Personalized First Issue Recommender for Newcomers in Open Source Projects
Research Papers
Wenxin Xiao School of Computer Science, Peking University, Jingyue Li Norwegian University of Science and Technology, Hao He Carnegie Mellon University, Ruiqiao Qiu Beijing Institute of Technology, Minghui Zhou Peking University
Pre-print
13:42
12m
Talk
Understanding and Enhancing Issue Prioritization in GitHub
Research Papers
Yingying He Nanjing University of Aeronautics and Astronautics, Wenhua Yang Nanjing University of Aeronautics and Astronautics, Minxue Pan Nanjing University, Yasir Hussain Nanjing University of Aeronautics and Astronautics, Yu Zhou Nanjing University of Aeronautics and Astronautics
13:55
12m
Research paper
Who is the Real Hero? Measuring Developer Contribution via Multi-dimensional Data Integration
Research Papers
Yuqiang Sun Nanyang Technological University, Zhengzi Xu Nanyang Technological University, Chengwei Liu Nanyang Technological University, Yiran Zhang Nanyang Technological University, Yang Liu Nanyang Technological University
Pre-print
14:08
12m
Talk
Predicting Health Indicators for Open Source Projects (using Hyperparameter Optimization)
Journal-first Papers
Tianpei Xia North Carolina State University, Wei Fu North Carolina State University, Rui Shu North Carolina State University, Rishabh Agrawal North Carolina State University, Tim Menzies North Carolina State University
Link to publication DOI Pre-print
14:21
12m
Talk
To Share, or Not to Share: Exploring Test-Case Reusability in Fork Ecosystems
Research Papers
Mukelabai Mukelabai The University of Zambia, Zambia, Christoph Derks Ruhr-University Bochum, Germany, Jacob Krüger Eindhoven University of Technology, Thorsten Berger Ruhr University Bochum
File Attached
14:34
12m
Talk
LiSum: Open Source Software License Summarization with Multi-Task Learning
Recorded talk
Research Papers
Linyu Li, Sihan Xu Nankai University, Yang Liu Nanyang Technological University, Ya Gao Nankai University, Xiangrui Cai Nankai University, Jiarun Wu Nankai University, Wenli Song Civil Aviation University of China, Zheli Liu Nankai University
Pre-print Media Attached
14:47
12m
Talk
Open Source Software Tools for Data Management and Deep Model Training Automation
Industry Showcase (Papers)
Umut Tıraşoğlu ORDULU Corp., Abdussamet Türker ORDULU Corp., Adnan Ekici ORDULU Corp., Hayri Yiğit ORDULU Corp., Yusuf Enes Bölükbaşı ORDULU Corp., Toygar Akgun TOBB ETU