A Machine Learning Based Ensemble Method for Automatic Multiclass Classification of Decisions
Stakeholders make various types of decisions about requirements, design, management, and so on during the software development life cycle. However, these decisions are often poorly documented and classified because of limited human resources, time, and budget. Automatic approaches offer a promising way to address this problem. In this paper, we aim to automatically classify decisions into five types to help stakeholders better document and understand them. First, we collected a dataset from the Hibernate developer mailing list. We then evaluated 270 configurations of feature selection, feature extraction techniques, and machine learning classifiers to find the best configuration for classifying decisions. In particular, we applied an ensemble learning method and constructed ensemble classifiers to compare their performance with that of base classifiers. Our experimental results show that (1) feature selection can improve the classification results to a reasonable degree; (2) ensemble classifiers can outperform base classifiers, provided that the ensemble classifiers are well constructed; (3) Bag-of-Words (BoW) with 50% of the features retained by feature selection, used with an ensemble classifier that combines Naive Bayes (NB), Logistic Regression (LR), and Support Vector Machine (SVM), achieves the best classification result among all the configurations (a weighted precision of 0.750, a weighted recall of 0.739, and a weighted F1-score of 0.727). Our work can benefit various types of stakeholders in software development by providing an automatic approach for effectively classifying decisions into specific types that are relevant to their interests.
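The reported weighted F1-score (0.727) is lower than both the weighted precision (0.750) and the weighted recall (0.739). The abstract does not say how the weighted scores were computed. The sketch below assumes the common convention in which each per-class score is averaged with weights proportional to class support, which would make such a result possible. The function name `weighted_scores` and the toy labels are hypothetical and are not taken from the Hibernate dataset.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return support-weighted precision, recall, and F1 (assumed convention)."""
    support = Counter(y_true)          # number of true instances per class
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label in sorted(support):
        # True positives and number of predictions for this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / support[label]
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        # Each class contributes in proportion to its share of the true labels.
        weight = support[label] / total
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1


# Toy data with two hypothetical decision types (not from the paper).
y_true = ["type_a", "type_a", "type_a", "type_a", "type_b", "type_b"]
y_pred = ["type_a", "type_a", "type_a", "type_a", "type_a", "type_b"]

precision, recall, f1 = weighted_scores(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

On this toy input the weighted F1 falls below both the weighted precision and the weighted recall, because it averages per-class F1 values and is not the harmonic mean of the two weighted scores.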
Tue 22 Jun (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
11:00 - 12:00 | Decision Making | EASE 2021 at Zoom | Chair(s): Pingfan Kong (Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg)
11:00 | 20m | Full-paper | Analytics Mistakes that Derail Software Startups | EASE 2021 | Usman Rafiq (Free University of Bolzano), Jorge Melegati (Free University of Bozen-Bolzano), Dron Khanna (Free University of Bozen-Bolzano), Eduardo Guerra (Free University of Bozen-Bolzano), Xiaofeng Wang (Free University of Bozen-Bolzano)
11:20 | 20m | Full-paper | Influence of Roles in Decision-Making during OSS Development - A Study of Python | EASE 2021 | Pankajeshwara Sharma (University of Otago, Dunedin), Bastin Tony Roy Savarimuthu (University of Otago, Dunedin, New Zealand), Nigel Stanger (University of Otago, Dunedin)
11:40 | 20m | Full-paper | A Machine Learning Based Ensemble Method for Automatic Multiclass Classification of Decisions | EASE 2021 | Liming Fu (Wuhan University), Peng Liang (Wuhan University), Xueying Li (Wuhan University), Chen Yang (IBO Technology Co., Ltd)