Clause2Inv: A Generate-Combine-Check Framework for Loop Invariant Inference
Loop invariant inference is a fundamental, yet challenging, problem in program verification. Recent work adopts the guess-and-check framework, where candidate loop invariants are iteratively generated in the guess step and verified in the check step. A major challenge of this general framework is to produce high-quality candidate invariants in each iteration so that the inference procedure can converge quickly. Empirically, we observe that existing approaches may struggle with guessing the complete invariant due to the complexity of logical connectives, but usually, all the clauses of the correct loop invariant have already appeared in the previous guesses. This motivates us to refine the guess-and-check framework, resulting in a generate-combine-check framework, where the loop invariant inference task is divided into clause generation and clause combination. Specifically, we propose a novel loop invariant inference approach Clause2Inv under the new framework, which consists of an LLM-based clause generator and a counterexample-driven clause combinator. As the clause generator, Clause2Inv leverages LLMs to generate a multitude of clauses; as the clause combinator, Clause2Inv leverages counterexamples from the previous rounds to convert generated clauses into invariants. Our experiments show that Clause2Inv significantly outperforms existing loop invariant inference approaches. For example, Clause2Inv solved 312 (out of 316) linear invariant generation tasks and 44 (out of 50) nonlinear invariant generation tasks, which is at least 93 and 16 more than the existing baselines, respectively. By design, the generate-combine-check framework is flexible to accommodate various existing approaches which are currently under the guess-and-check framework by splitting the guessed candidate invariants into clauses. The evaluation shows that our approach can, with minor adaptation, improve existing loop invariant inference approaches in both effectiveness and efficiency. For example, Code2Inv which solved 210 problems with an average solving time of 137.6 seconds can be improved to solve 252 problems with an average solving time of 17.8 seconds.