Data Dependency-Aware Code Generation from Enhanced UML Sequence Diagrams
This program is tentative and subject to change.
Large language models (LLMs) excel at generating code from natural language (NL) descriptions. However, plain textual descriptions are inherently ambiguous and often fail to capture complex requirements such as intricate system behaviors, conditional logic, and architectural constraints. In service-oriented architectures in particular, implicit data dependencies are difficult to infer and handle correctly.
To bridge this gap, we propose UML2Dep, a novel step-by-step code generation framework that leverages unambiguous formal specifications of complex requirements. First, we introduce an enhanced Unified Modeling Language (UML) sequence diagram tailored for service-oriented architectures. This diagram extends the traditional visual syntax by integrating decision tables and API specifications, explicitly formalizing the structural relationships and business logic flows in service interactions so as to eliminate linguistic ambiguity. Second, recognizing the critical role of data flow, we introduce a dedicated data dependency inference (DDI) task, which systematically constructs an explicit data dependency graph before any code is synthesized. To ensure reliability, we formalize DDI as a constrained mathematical reasoning task through novel prompting strategies, which plays to the mathematical reasoning strengths of LLMs. Additional static parsing and dependency pruning reduce the context complexity and cognitive load associated with intricate specifications, thereby improving reasoning accuracy and efficiency.
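To make the idea of an explicit data dependency graph concrete, here is a minimal sketch. It is not the UML2Dep method: the abstract describes LLM-driven inference over enhanced sequence diagrams, whereas this stand-in uses plain field-name matching. The step list, service names, field names, and the helpers `build_dependency_graph` and `prune` are all hypothetical and introduced only for illustration.

```python
from collections import defaultdict

# Hypothetical service calls in diagram order. Each step lists the fields it
# consumes and produces. None of these names come from the paper.
STEPS = [
    {"name": "get_user", "inputs": ["user_id"], "outputs": ["user_name", "account_id"]},
    {"name": "get_balance", "inputs": ["account_id"], "outputs": ["balance"]},
    {"name": "send_notice", "inputs": ["user_name", "balance"], "outputs": ["notice_id"]},
]


def build_dependency_graph(steps):
    """Return (producer, consumer, field) edges by matching field names.

    Assumption: a consumed field depends on the most recent earlier step
    that produced a field with the same name. Fields with no earlier
    producer (such as request parameters) yield no edge.
    """
    latest_producer = {}
    edges = []
    for step in steps:
        for field in step["inputs"]:
            if field in latest_producer:
                edges.append((latest_producer[field], step["name"], field))
        for field in step["outputs"]:
            latest_producer[field] = step["name"]
    return edges


def prune(edges, target):
    """Keep only the edges that `target` depends on, directly or transitively."""
    parents = defaultdict(list)
    for src, dst, field in edges:
        parents[dst].append((src, field))
    keep, stack, seen = [], [target], {target}
    while stack:
        node = stack.pop()
        for src, field in parents[node]:
            keep.append((src, node, field))
            if src not in seen:
                seen.add(src)
                stack.append(src)
    return sorted(keep)


edges = build_dependency_graph(STEPS)
pruned = prune(edges, "get_balance")
```

In this toy case, pruning for `get_balance` leaves a single edge from `get_user`, which illustrates how pruning could shrink the context needed when generating code for one step.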
Experimental results on our in-house industrial datasets demonstrate the effectiveness of the proposed framework. Specifically, our framework achieves strong performance, with 89.97% recall, 95.06% precision, and 92.33% F1 score on the DDI task. Furthermore, the integration of UML2Dep into the code generation pipeline also improves practical deployment, increasing compilation pass rate by 8.83% and unit test pass rate by 11.66%.
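For readers unfamiliar with the metrics, the sketch below shows one common way to score a predicted dependency graph against a reference graph at the edge level. This is an assumption about the evaluation setup: the abstract does not say how recall, precision, and F1 are computed or aggregated across samples. The function `edge_metrics` and the toy edges are hypothetical.

```python
def edge_metrics(predicted, reference):
    """Edge-level precision, recall, and F1 for a single sample.

    Assumption: an edge counts as correct only if it matches a reference
    edge exactly. Empty inputs yield 0.0 rather than raising.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy graphs: the prediction misses one of four reference edges.
reference_edges = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
predicted_edges = {("a", "b"), ("a", "c"), ("b", "d")}

precision, recall, f1 = edge_metrics(predicted_edges, reference_edges)
```

Here every predicted edge is correct but one reference edge is missed, so precision is higher than recall, the same direction as the gap between the two figures reported above.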
Wed 19 Nov (displayed time zone: Seoul)
16:00 - 16:50

| Time | Duration | Type | Title | Track | Authors |
|---|---|---|---|---|---|
| 16:00 | 10m | Talk | Data Dependency-Aware Code Generation from Enhanced UML Sequence Diagrams | Industry Showcase | Wenxin Mao (Tencent), Zhitao Wang (Tencent), Long Wang (Tencent), Sirong Chen (Tencent), Cuiyun Gao (Harbin Institute of Technology, Shenzhen), Luyang Cao (Tencent), Ziming Liu (Tencent), Qiming Zhang (Tencent), Jun Zhou (Tencent, China), Zhi Jin (Peking University) |
| 16:10 | 10m | Talk | AutoPLC: Generating Vendor-Aware Structured Text for Programmable Logic Controllers | Industry Showcase | Donghao Yang (Beihang University), Aolang Wu (Beihang University), Tianyi Zhang (Beihang University), Li Zhang (Beihang University), Xiaoli Lian (Beihang University, China), Fang Liu (Beihang University), Yuming Ren, Jiaji Tian (Beihang University), Xiaoyin Che (Siemens AG) |
| 16:20 | 10m | Talk | Requirements Development and Formalization for Reliable Code Generation: A Multi-Agent Vision | NIER Track | Xu Lu (Xidian University), Weisong Sun (Nanyang Technological University), Yiran Zhang, Ming Hu (Singapore Management University), Cong Tian (Xidian University), Zhi Jin (Peking University), Yang Liu (Nanyang Technological University) |
| 16:30 | 10m | Talk | Measuring LLM Code Generation Stability via Structural Entropy | NIER Track | Yewei Song (University of Luxembourg), Tiezhu Sun (University of Luxembourg), Xunzhu Tang (University of Luxembourg), Prateek Kumar Rajput (University of Luxembourg), Tegawendé F. Bissyandé (University of Luxembourg), Jacques Klein (University of Luxembourg) |
| 16:40 | 10m | Talk | TreeRanker: Fast and Model-agnostic Ranking System for Code Suggestions in IDEs | Industry Showcase | Daniele Cipollone (Delft University of Technology, Netherlands), Egor Bogomolov (JetBrains Research), Arie van Deursen (TU Delft), Maliheh Izadi (Delft University of Technology) |

