Sat 24 Oct 2020 Porto, Portugal
co-located with ICST 2020
Sat 24 Oct 2020 14:20 - 14:35 at Miragaia - Session II

Testing large software or other systems is difficult when test execution is time consuming or requires substantial computational resources, and it calls for the selection of appropriate test cases. In the context of testing, appropriate means tests that are most likely to reveal faults or that, when they pass, at least indicate that the important functionality of the system works. In practice, we often have to reduce the available test suites in order to finish testing within a given time and without exceeding other resource limits. In this paper, we introduce a machine-learning-based algorithm for test suite reduction that combines k-means clustering with binary search. The idea behind the algorithm is to cluster test cases that are close together and to select a representative test case from each cluster for the new, reduced test suite. We use binary search to find the number of clusters that allows us to reduce the test suite without deviating substantially from the coverage or mutation score obtained with the initial test suite. Besides discussing the algorithm, we present experimental results using small to larger Java programs with different types of inputs and outputs. For all example cases, we were able to reduce the number of test cases considerably while requiring only a short reduction time, especially compared with other test suite reduction approaches.
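
To make the idea concrete, the following is a minimal sketch of clustering plus binary search, not the authors' implementation. It rests on several assumptions that the abstract does not state: each test case is encoded as a binary coverage vector, the adequacy measure is the fraction of covered items, the representative of a cluster is the member closest to its centroid, and the adequacy measure grows roughly with the number of clusters so that binary search is meaningful. All function names and the `tolerance` parameter are hypothetical.

```python
import random


def coverage(vectors):
    """Fraction of items covered by at least one vector (assumed adequacy measure)."""
    if not vectors:
        return 0.0
    n_items = len(vectors[0])
    return sum(any(v[i] for v in vectors) for i in range(n_items)) / n_items


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (cluster label per point, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def representatives(points, k):
    """Index of the member closest to the centroid, for each non-empty cluster."""
    labels, centroids = kmeans(points, k)
    reps = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            reps.append(min(members, key=lambda i: sq_dist(points[i], centroids[c])))
    return sorted(reps)


def reduce_suite(points, tolerance=0.0):
    """Binary search for a small k whose representatives stay within
    `tolerance` (hypothetical parameter) of the full suite's coverage."""
    target = coverage(points) - tolerance
    lo, hi = 1, len(points)
    best = list(range(len(points)))  # fall back to the full suite
    while lo <= hi:
        mid = (lo + hi) // 2
        reps = representatives(points, mid)
        if coverage([points[i] for i in reps]) >= target - 1e-9:
            best, hi = reps, mid - 1  # good enough: try fewer clusters
        else:
            lo = mid + 1              # too much loss: try more clusters
    return best


# Hypothetical suite: six tests, but only two distinct coverage profiles.
suite = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1],
         [0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
reduced = reduce_suite(suite)
print(reduced, coverage([suite[i] for i in reduced]))
```

Calling `reduce_suite(suite, tolerance=0.05)` would accept a reduced suite that loses up to five percentage points of the assumed coverage measure. A mutation score could be substituted for `coverage` in the same loop, at the cost of a more expensive evaluation per candidate number of clusters.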