We present a new learning-based approach for accelerating disjunctive static bug-finders. Industrial static bug-finders usually perform disjunctive analysis, differentiating program states along different execution paths of a program. Such path-sensitivity is essential for reducing false positives but it also increases analysis costs exponentially. Therefore, practical bug-finders use a state-selection heuristic to keep track of a small number of beneficial states only. However, designing a good heuristic for real-world programs is challenging; as a result, modern static bug-finders still suffer from low cost / bug-finding efficiency. In this paper, we aim to address this problem by learning effective state-selection heuristics from data. To this end, we present a novel data-driven technique that efficiently collects alarm-triggering traces, learns multiple candidate models, and adaptively chooses the best model tailored for each target program. We evaluate our approach with Infer and show that our technique significantly improves Infer’s bug-finding efficiency for a range of open-source C programs.
The artifact will be released through the facebook research github repository.