Precise Sparse Abstract Execution via Cross-Domain Interaction
Sparse static analysis offers a more scalable solution compared to its non-sparse counterpart. The basic idea is to first conduct a fast pointer analysis that over-approximates the value-flows (or def-use chains) between program variables, thus bootstrapping the main analysis by propagating the data-flow facts sparsely along only the pre-computed value-flows instead of all program points in the control-flow graph. Current sparse techniques focus on improving the scalability of the main analysis while maintaining its precision. However, their pointer analyses in both the offline and main phases are inherently imprecise because they rely solely on a single memory address domain without considering values from other domains like the interval domain. Consequently, this leads to conservative alias results, like array-insensitivity, which leaves substantial room for precision improvement of the main data-flow analysis.
To address this limitation, this paper presents CSA, a new $\underline{C}$ross-domain $\underline{S}$parse $\underline{A}$bstract execution that interweaves correlations between values across multiple abstract domains (e.g., memory address and interval domains). Unlike traditional sparse analysis without cross-domain refinement, CSA performs correlation tracking by establishing implications of values from one domain to another through reduced cardinal power. This correlation tracking enables online bidirectional refinement: CSA refines spurious alias relations using interval domain information and also enhances the precision of interval analysis with refined alias information. This results in increasingly improved precision and scalability as the main analysis progresses. To improve the efficiency of correlation tracking, we propose an equivalent correlation tracking approach that groups (virtual) memory addresses with equivalent implication results to minimize redundant value joins and storage associated.
We apply CSA on two common assertion-based checking clients, buffer overflow and null dereference detection. Experimental results show that CSA outperforms five open-source tools (Infer, Cppcheck, Ikos, Sparrow and Klee) on ten large-scale projects. CSA finds 167 real bugs with 67.51% precision, detecting 119.74% more bugs than Infer and exhibiting 10.64% more precision rate than Klee. CSA records 94.89% less false positives on real-world projects than the version without cross-domain interaction. CSA also exhibits an average speedup of $2.47\times$ and an average memory reduction of $6.14\times$ with equivalent correlation tracking.