Using Voting and Stacking Ensemble Techniques to Optimize Software Requirements Classification (ESEIW 2025 - ESEM - Technical Track)

Sun 28 September - Fri 3 October 2025

Who

Maria Isabel Limaylla Lunarejo, Condori-Fernandez Nelly, Miguel Rodríguez Luaces

Track

ESEIW 2025 ESEM - Technical Track

Abstract

Background: Ensemble models integrate multiple classifiers and play an important role in a wide range of applications, such as medical diagnosis, sentiment analysis, and financial market trends. In Requirements Engineering (RE), these models can improve automatic requirements classification. Aims: This paper analyses the performance of voting and stacking ensemble models for requirements classification. It also performs a cross-dataset validation of the meta-models generated with the stacking ensemble method. Methods: Previously trained base models and two datasets of software requirements written in Spanish (the translated PROMISE exp dataset and the ReSpa dataset) were used to build the ensemble models. Results: On the translated PROMISE exp dataset, the stacking model achieved a weighted F1-score of 0.828 using Support Vector Machine (SVM) and Multi-layer Perceptron (MLP). On the ReSpa dataset, the stacking model achieved a weighted F1-score of 0.890 using Logistic Regression (LR). Conclusion: This study confirms that stacking ensemble methods slightly improve the performance of binary requirements classification over voting and most individual base models. Moreover, combining all models outperforms combinations that include only shallow ML or DL models.
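For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of hard (majority) voting and of the weighted F1-score, written in plain Python. It is not the authors' implementation: the three base models, their predictions, and the 0/1 labels are hypothetical values chosen only for illustration. Stacking differs from voting in that a meta-model is trained on the base models' outputs instead of counting votes, which is not shown here.

```python
from collections import Counter

def hard_vote(predictions):
    """Majority vote across base models.

    `predictions` is a list with one list of labels per base model.
    """
    voted = []
    for labels in zip(*predictions):
        # Pick the most frequent label for this sample. With an odd number
        # of voters and two classes, a tie cannot occur.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted

def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)

# Hypothetical data: six requirements with binary labels (assumption).
y_true  = [1, 1, 1, 0, 0, 0]
model_a = [1, 1, 0, 0, 0, 1]  # hypothetical base model outputs
model_b = [1, 0, 1, 0, 1, 0]
model_c = [1, 1, 1, 1, 0, 0]

ensemble = hard_vote([model_a, model_b, model_c])
print("ensemble:", ensemble)
print("weighted F1 (model_a): ", round(weighted_f1(y_true, model_a), 3))
print("weighted F1 (ensemble):", round(weighted_f1(y_true, ensemble), 3))
```

In this toy example each base model makes errors on different samples, so the majority vote corrects them. Real base models often make correlated errors, and the gain from an ensemble is then smaller, which fits the "slight improvement" reported in the abstract.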

Maria Isabel Limaylla Lunarejo

Universidade da Coruña

Spain

Condori-Fernandez Nelly

Universidad de Santiago de Compostela

Spain

Miguel Rodríguez Luaces

Universidade da Coruña, CITIC, Database Lab

Spain
