Measuring LLM Code Generation Stability via Structural Entropy
This program is tentative and subject to change.
Assessing the stability of code generation from large language models (LLMs) is essential for judging their reliability in real-world development. We extend prior “structural entropy” concepts to the program domain by pairing entropy with abstract syntax tree (AST) analysis. For any fixed prompt, we collect the multiset of depth-bounded subtrees of the AST of each generated program and treat their relative frequencies as a probability distribution. We then measure stability in two complementary ways: (i) Jensen–Shannon divergence, a symmetric, bounded indicator of structural overlap, and (ii) a Structural Cross-Entropy ratio that highlights missing high-probability patterns. Both metrics admit structural-only and token-aware variants, giving separate views of control-flow shape and identifier-level variability. Unlike pass@k, BLEU, or CodeBLEU, our metrics are reference-free, language-agnostic, and execution-independent. We benchmark several leading LLMs on standard code generation tasks and show that AST-driven structural entropy reveals nuances in model consistency and robustness. The method runs in O(n,d) time with no external tests, providing a lightweight addition to the code-generation evaluation toolkit.
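The first metric can be illustrated with a minimal sketch. It assumes Python sources parsed with the standard `ast` module, a structural-only signature built from node type names, and base-2 logarithms so that the divergence lies in [0, 1]. The function names (`subtree_signature`, `subtree_distribution`, `js_divergence`) and the default depth of 2 are hypothetical choices for illustration, not the authors' implementation, and the Structural Cross-Entropy ratio is not sketched here.

```python
import ast
import math
from collections import Counter


def subtree_signature(node, depth):
    """Serialize the subtree rooted at `node`, cut off after `depth` levels.

    Only node type names are used (structural-only view); identifiers and
    literal values are ignored. This encoding is an illustrative assumption.
    """
    label = type(node).__name__
    if depth <= 1:
        return label
    children = [subtree_signature(c, depth - 1) for c in ast.iter_child_nodes(node)]
    return f"{label}({','.join(children)})" if children else label


def subtree_distribution(source, depth=2):
    """Relative frequencies of depth-bounded subtrees in one program."""
    tree = ast.parse(source)
    counts = Counter(subtree_signature(n, depth) for n in ast.walk(tree))
    total = sum(counts.values())
    return {sig: c / total for sig, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two sparse distributions."""
    keys = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2 over the union of supports.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # KL(a || m); terms with zero probability contribute nothing.
        return sum(v * math.log2(v / m[k]) for k, v in a.items() if v > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


# Two generations that differ only in identifiers, and one with a different shape.
sample_a = "def f(x):\n    return x + 1\n"
sample_b = "def g(y):\n    return y + 1\n"
sample_c = "def f(x):\n    for i in range(x):\n        print(i)\n"

dist_a = subtree_distribution(sample_a)
dist_b = subtree_distribution(sample_b)
dist_c = subtree_distribution(sample_c)

print(js_divergence(dist_a, dist_b))  # identical structure
print(js_divergence(dist_a, dist_c))  # different control-flow shape
```

Under these assumptions, the first pair has identical structural distributions and therefore zero divergence, while the second pair yields a value strictly between 0 and 1. A token-aware variant could append identifier or literal text to each signature.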
Mon 17 Nov (displayed time zone: Seoul)
16:00 - 16:50
16:00 | 10m | Talk | Data Dependency-Aware Code Generation from Enhanced UML Sequence Diagrams | Industry Showcase | Wenxin Mao (Tencent); Zhitao Wang (Tencent); Long Wang (Tencent); Sirong Chen (Tencent); Cuiyun Gao (Harbin Institute of Technology, Shenzhen); Luyang Cao (Tencent); Ziming Liu (Tencent); Qiming Zhang (Tencent); Jun Zhou (Tencent, China); Zhi Jin (Peking University)
16:10 | 10m | Talk | AutoPLC: Generating Vendor-Aware Structured Text for Programmable Logic Controllers | Industry Showcase | Donghao Yang (Beihang University); Aolang Wu (Beihang University); Tianyi Zhang (Beihang University); Li Zhang (Beihang University); Xiaoli Lian (Beihang University, China); Fang Liu (Beihang University); Yuming Ren; Jiaji Tian (Beihang University); Xiaoyin Che (Siemens AG)
16:20 | 10m | Talk | Requirements Development and Formalization for Reliable Code Generation: A Multi-Agent Vision | NIER Track | Xu Lu (Xidian University); Weisong Sun (Nanyang Technological University); Yiran Zhang; Ming Hu (Singapore Management University); Cong Tian (Xidian University); Zhi Jin (Peking University); Yang Liu (Nanyang Technological University)
16:30 | 10m | Talk | Measuring LLM Code Generation Stability via Structural Entropy | NIER Track | Yewei Song (University of Luxembourg); Tiezhu Sun (University of Luxembourg); Xunzhu Tang (University of Luxembourg); Prateek Kumar Rajput (University of Luxembourg); Tegawendé F. Bissyandé (University of Luxembourg); Jacques Klein (University of Luxembourg)
16:40 | 10m | Talk | TreeRanker: Fast and Model-agnostic Ranking System for Code Suggestions in IDEs | Industry Showcase | Daniele Cipollone (Delft University of Technology, Netherlands); Egor Bogomolov (JetBrains Research); Arie van Deursen (TU Delft); Maliheh Izadi (Delft University of Technology)