SpecGuru: Hierarchical LLM-Driven API Points-to Specification Generation with Self-Validation (ICSE 2026 - Research Track)

Sun 12 - Sat 18 April 2026 Rio de Janeiro, Brazil

Who

Shuangxiang Kan, Yuekang Li, Xiao Cheng, Yulei Sui

Track

ICSE 2026 Research Track

Abstract

Static analysis is a fundamental technique for ensuring software safety and reliability. However, when analyzing client code that invokes third-party APIs, proper handling of API source code becomes critical. A common approach is to utilize API specifications that approximate the essential behaviors of the API while avoiding direct analysis of the implementation. Existing methods for constructing API specifications often fail to adequately address challenges presented by complex semantics, syntax, and edge cases.

In this paper, we introduce SpecGuru, a framework that leverages Large Language Models (LLMs) to automatically generate points-to API specifications. SpecGuru employs a novel bottom-up approach that begins with leaf functions (those without dependencies) and progressively builds specifications through rigorous validation. These validated specifications subsequently serve as abstractions at call sites of higher-level functions along the call hierarchy, effectively eliminating the need to process complex source code directly with LLMs. Our approach incorporates a robust self-validation mechanism using automatically synthesized test cases for differential testing throughout the specification inference process, which prevents error propagation and ensures incremental correctness of the generated specifications.
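The bottom-up order described above can be illustrated with a minimal sketch. It is not SpecGuru's implementation: the names `bottom_up_order`, `infer_specs`, `generate`, and `validate` are hypothetical, the LLM call and the differential-testing step are passed in as plain callables, and the call graph is assumed to be acyclic (recursion is not handled here). Skipping a callee whose specification failed validation is also an assumption made for illustration.

```python
from graphlib import TopologicalSorter


def bottom_up_order(call_graph):
    """Return functions ordered so that every callee precedes its callers.

    call_graph maps a function name to the names of the functions it calls.
    TopologicalSorter treats the mapped values as predecessors, so leaf
    functions (no callees) come out first.
    """
    return list(TopologicalSorter(call_graph).static_order())


def infer_specs(call_graph, generate, validate):
    """Generate and validate specifications from the leaves upward.

    generate(fn, callee_specs) stands in for the LLM-based generation step.
    validate(fn, spec) stands in for the test-based self-validation step.
    Both are hypothetical callables supplied by the caller.
    """
    specs = {}      # validated specifications only
    rejected = []   # functions whose candidate failed validation
    for fn in bottom_up_order(call_graph):
        # Only validated callee specifications are offered as abstractions,
        # so a rejected candidate is never passed on to its callers.
        callee_specs = {
            callee: specs[callee]
            for callee in call_graph.get(fn, ())
            if callee in specs
        }
        candidate = generate(fn, callee_specs)
        if validate(fn, candidate):
            specs[fn] = candidate
        else:
            rejected.append(fn)
    return specs, rejected


if __name__ == "__main__":
    # Toy call graph: parse -> {alloc_buf, copy}, copy -> {alloc_buf}.
    graph = {"parse": ["alloc_buf", "copy"], "copy": ["alloc_buf"], "alloc_buf": []}
    specs, rejected = infer_specs(
        graph,
        generate=lambda fn, callee_specs: f"spec({fn}; uses {sorted(callee_specs)})",
        validate=lambda fn, spec: fn != "copy",  # pretend copy's spec fails its tests
    )
    print(specs)
    print(rejected)
```

In this toy run the candidate for `copy` is rejected, so the specification for `parse` is generated with only the validated `alloc_buf` abstraction available.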

Our experimental evaluation on 15 third-party C libraries shows that SpecGuru significantly outperforms existing specification inference tools, generating 21% more specifications than Spectre and 46% more than c-summary. Moreover, our bottom-up abstraction technique combined with automated testing methodology enables the discovery of a more comprehensive set of specifications while effectively preventing error propagation, achieving analysis results comparable to those obtained using complete API source code.

Shuangxiang Kan

UNSW

Yuekang Li

UNSW

Australia

Xiao Cheng

Macquarie University

Australia

Yulei Sui

University of New South Wales

Australia
