FRELinker: A Novel Issue-Commit Link Recovery Model Based on Feature Refinement and Expansion with Multi-Classifier Fusion
In software traceability (ST), machine learning (ML) has become a common and effective approach to automated issue-commit link recovery. The features extracted from issue and commit artifacts comprise markedly different types of data, such as issue summaries, code diffs, and hashes. Such complex and diverse data poses a challenge to conventional ML methods. To overcome this challenge, we propose a novel model named FRELinker, which trains an independent classifier for each type of issue-commit training data, fully leveraging the effectiveness of ML on single-type data, and then fuses the classifiers. Specifically, we categorize the features into four types: textual features, code features, non-textual features, and similarity features, and we extend the text similarity features by adding hybrid textual similarity measures. We then use a ranking method to select the optimal classifier for each of the four feature types: Gradient Boosting (GB) for textual features, Logistic Regression (LR) for code features, and Random Forest (RF) for both non-textual features and similarity features. Finally, we use a Bayesian optimization model to fuse these four classifiers. Experimental results show that our method outperforms the competing methods Hybrid-Linker and DeepLink in terms of Precision, Recall, and F-measure on six real-world open-source software (OSS) datasets, demonstrating significant performance advantages on complex and diverse data.
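The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each of the four per-feature-type classifiers has already produced a link probability for every issue-commit pair, fuses them by weighted averaging, and picks the weights with a plain grid search that maximizes F-measure. The grid search stands in for the Bayesian optimization mentioned in the abstract. The function names, the probability values, and the 0.5 decision threshold are all hypothetical.

```python
from itertools import product


def fuse(probs, weights):
    """Weighted average of per-classifier link probabilities for one pair."""
    return sum(w * p for w, p in zip(weights, probs)) / sum(weights)


def f_measure(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = true link)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def search_weights(prob_rows, y_true, grid=(0.0, 0.5, 1.0), threshold=0.5):
    """Grid search over fusion weights (a simple stand-in for Bayesian optimization)."""
    best_weights, best_score = None, -1.0
    for weights in product(grid, repeat=len(prob_rows[0])):
        if sum(weights) == 0:
            continue  # an all-zero weight vector cannot be normalized
        y_pred = [1 if fuse(row, weights) >= threshold else 0 for row in prob_rows]
        score = f_measure(y_true, y_pred)
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score


# Hypothetical probabilities per issue-commit pair, in the order:
# (textual, code, non-textual, similarity) classifier.
prob_rows = [
    (0.9, 0.4, 0.8, 0.7),  # true link
    (0.2, 0.6, 0.1, 0.3),  # not a link
    (0.6, 0.3, 0.7, 0.9),  # true link
    (0.4, 0.7, 0.2, 0.1),  # not a link
]
y_true = [1, 0, 1, 0]

best_weights, best_score = search_weights(prob_rows, y_true)
```

In practice the weights would be tuned on a held-out validation split rather than on the data used for the final evaluation, and a Bayesian optimizer would replace the exhaustive grid when the search space is larger.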
Wed 4 Dec (displayed time zone: Beijing, Chongqing, Hong Kong, Urumqi)
14:00 - 15:30 | Session (4): Technical Track / ERA - Early Research Achievements at Room 4 (Xianglin Ballroom) | Chair(s): Lina Gong (Nanjing University of Aeronautics and Astronautics)
14:00 | 30m Talk | An Empirical Study of Cross-Project Pull Request Recommendation in GitHub | Technical Track | Wenyu Xu (National University of Defense Technology), Yao Lu (National University of Defense Technology), Xunhui Zhang (National University of Defense Technology, China), Tanghaoran Zhang (National University of Defense Technology), Xinjun Mao (National University of Defense Technology), Bo Lin (National University of Defense Technology)
14:30 | 30m Talk | FRELinker: A Novel Issue-Commit Link Recovery Model Based on Feature Refinement and Expansion with Multi-Classifier Fusion | Technical Track | Bangchao Wang (Wuhan Textile University), Xinyu He (School of Computer Science and Artificial Intelligence, Wuhan Textile University), Hongyan Wan (Wuhan Textile University), Xiaoxiao Li (School of Computer Science and Artificial Intelligence, Wuhan Textile University), Jiaxu Zhu (School of Computer Science and Artificial Intelligence, Wuhan Textile University), Yukun Cao (School of Computer Science and Artificial Intelligence, Wuhan Textile University)
15:00 | 20m Talk | Towards Filtering Out Deficient Pull Requests Collected through the GitHub API | ERA - Early Research Achievements | Bowen Tang (Ritsumeikan University), Xiqin Lu (Ritsumeikan University), Katsuhisa Maruyama (Ritsumeikan University)