Using Bandit Algorithms for Selecting Feature Reduction Techniques in Software Defect Prediction
Background: Selecting a suitable feature reduction technique, when building a defect prediction model, can be challenging. Different techniques can result in the selection of different independent variables which have an impact on the overall performance of the prediction model. To help in the selection, previous studies have assessed the impact of each feature reduction technique using different datasets. However, there are many reduction techniques, and therefore some of the well-known techniques have not been assessed by those studies. Aim: The goal of the study is to select a high-accuracy reduction technique from several candidates without preliminary assessments. Method: We utilized bandit algorithm (BA) to help with the selection of best features reduction technique for a list of candidates. To select the best feature reduction technique, BA evaluates the prediction accuracy of the candidates, comparing testing results of different modules with their prediction results. By substituting the reduction technique for the prediction method, BA can then be used to select the best reduction technique. Results: In the experiment, we evaluated the performance of BA to select suitable reduction technique. We performed cross version defect prediction using 14 datasets. As feature reduction techniques, we used two assessed and two non-assessed techniques. Using BA, the prediction accuracy was higher or equivalent than existing approaches on average, compared with techniques selected based on an assessment. Conclusions: BA can have larger impact on improving prediction models by helping not only on selecting suitable models, but also in selecting suitable feature reduction techniques.