Studying and Understanding the Tradeoffs Between Generality and Reduction in Software Debloating
Existing approaches for program debloating often use a usage profile, typically provided as a set of inputs, to identify the features of a program that must be preserved. Specifically, given a program and a set of inputs, these techniques produce a reduced program that behaves correctly for these inputs. Focusing only on \textit{reduction}, however, typically results in programs that are overfitted to the inputs used for debloating. For this reason, another important factor to consider in the context of debloating is \textit{generality}, which measures the extent to which a debloated program also behaves correctly for inputs that were not in the initial usage profile. Unfortunately, most evaluations of existing debloating approaches consider only reduction, and thus provide only partial information on the effectiveness of these approaches. To address this limitation, we perform an empirical evaluation of the reduction and generality of four debloating techniques (three state-of-the-art techniques and a baseline) on a set of 25 programs and different sets of inputs for these programs. Our results show that these approaches can indeed produce programs that are overfitted to the inputs used and have low generality. Based on these results, we also propose two new augmentation approaches and evaluate their effectiveness. The results of this additional evaluation show that these two approaches can help improve program generality without significantly affecting size reduction. Finally, because different approaches have different strengths and weaknesses, we also provide guidelines to help users choose the most suitable approach based on their specific needs and context.
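To make the two notions concrete, the following minimal sketch shows one way \textit{reduction} and \textit{generality} could be computed. The formulas are illustrative assumptions rather than the exact metrics used in the evaluation: reduction is taken as the relative decrease in program size, and generality as the fraction of inputs outside the usage profile on which the debloated program matches the original. The function names and the toy programs are hypothetical.

```python
from typing import Callable, Iterable


def size_reduction(original_size: int, debloated_size: int) -> float:
    """Relative size decrease (assumed definition): 0.0 = nothing removed."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return (original_size - debloated_size) / original_size


def generality(
    original: Callable[[str], str],
    debloated: Callable[[str], str],
    unseen_inputs: Iterable[str],
) -> float:
    """Fraction of unseen inputs on which the debloated program agrees
    with the original (assumed definition). A crash counts as incorrect."""
    inputs = list(unseen_inputs)
    if not inputs:
        raise ValueError("need at least one unseen input")
    correct = 0
    for value in inputs:
        expected = original(value)
        try:
            if debloated(value) == expected:
                correct += 1
        except Exception:
            pass  # removed functionality was exercised
    return correct / len(inputs)


# Hypothetical toy programs: the debloated one only kept the
# letters-only path that the usage profile exercised.
def original_program(text: str) -> str:
    return text.upper()


def debloated_program(text: str) -> str:
    if text.isalpha():
        return text.upper()
    raise RuntimeError("feature removed by debloating")


unseen = ["abc", "a1", "xyz", "42"]
reduction_score = size_reduction(original_size=1000, debloated_size=400)
generality_score = generality(original_program, debloated_program, unseen)
```

In this toy setting the debloated program is 60% smaller but handles only half of the unseen inputs, which is the kind of tradeoff between reduction and generality that the study examines.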