SiMut: Exploring Program Similarity to Support the Cost Reduction of Mutation Testing
Researchers have proposed many cost reduction techniques for mutation testing, and most of them reduce cost with only minor losses of effectiveness. However, many of these techniques are difficult to generalize, difficult to scale, or both, and published results are usually limited to a modest collection of programs. An open question, therefore, is whether the results a given cost reduction technique achieves on the programs studied in a paper will hold for other programs. This paper introduces a conceptual framework, named SiMut, to support the cost reduction of mutation testing based on historical data and program similarity. Given a new, untested program u, the central idea is to apply to u the same cost reduction strategy applied to a group G of programs that are similar to u and have already been tested with mutation, and then to check whether the results are consistent in terms of reduced costs and quality of test sets. SiMut includes activities to compute program abstractions and similarity. Based on this information, it supports the application of mutation cost reduction techniques to both G and u. This paper presents the concepts behind SiMut, a proof-of-concept implementation, and results from a pilot study. Finally, we discuss some issues related to the use of SiMut, focusing on the composition of a representative dataset to properly explore the potential of our framework.
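
The central idea can be illustrated with a minimal sketch. It is not the SiMut implementation: it assumes, purely for illustration, that a program abstraction is a small numeric feature vector, that similarity is cosine similarity, and that G is formed from the k most similar previously tested programs. The names `history`, `most_similar_group`, and `is_consistent`, the tolerance value, and all numbers are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_group(u_abstraction, history, k=2):
    """Return the names of the k programs in history most similar to u."""
    ranked = sorted(
        history,
        key=lambda name: cosine_similarity(u_abstraction, history[name]["abstraction"]),
        reverse=True,
    )
    return ranked[:k]


def is_consistent(observed, expected, tolerance=0.05):
    """Check whether a result observed on u is within tolerance of G's mean."""
    return abs(observed - expected) <= tolerance


# Hypothetical historical data: abstraction vectors and the mutation score
# achieved by test sets built under one cost reduction strategy.
history = {
    "p1": {"abstraction": [10, 2, 1], "score": 0.95},
    "p2": {"abstraction": [9, 2, 1], "score": 0.93},
    "p3": {"abstraction": [1, 8, 7], "score": 0.70},
}

u_abstraction = [10, 2, 2]  # hypothetical abstraction of the untested program u
group = most_similar_group(u_abstraction, history, k=2)
expected_score = sum(history[name]["score"] for name in group) / len(group)

# After applying the same strategy to u, compare its (hypothetical) observed
# score against what the similar programs in G achieved.
observed_score_u = 0.92
consistent = is_consistent(observed_score_u, expected_score)
```

In this sketch, a consistent result would suggest that the strategy's behavior on G carries over to u; how abstractions and similarity are actually computed is left to the framework's concrete instantiation.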