Classification-based Static Collection Selection for Java: Effectiveness and Adaptability
Carefully selecting the right collection data structure can significantly improve the performance of a Java program. Unfortunately, the performance impact of a particular collection selection can be hard to estimate. To assist developers, there are tools that recommend which collections to use based on static and/or dynamic information about a program. The majority of existing collection selection tools for Java (e.g., CoCo, CollectionSwitch) make their selections dynamically, which means that they must trade off sophistication in their selection algorithm against its run-time overhead. For static collection selection, the Brainy tool has demonstrated that complex, machine-dependent models can produce substantial performance improvements, albeit only for C++ so far.
In this paper, we port Brainy from C++ to Java and evaluate its effectiveness on five benchmarks from the DaCapo benchmark suite. We compare it against the original program and against a variant of a brute-force approach to collection selection, which serves as our ground truth for optimal performance. Our results show that in four of the five benchmarks, the ground truth and the original program perform similarly. In the remaining benchmark, the ground truth shows that an optimization yielding a 15% speedup was available, but our port did not find it. We find that the port is more efficient but less effective than the ground truth, can easily adapt to new hardware architectures, and can incorporate new data structures with at most a few hours of human effort. We detail the challenges that we encountered in porting the Brainy approach to Java, and list a number of insights and directions for future research.