Unveiling the Potential of Diffusion Large Language Models in Software Engineering Tasks: An Empirical Study
Autoregressive Large Language Models (AR-LLMs) are widely used in software engineering (SE) but face limitations in processing code structure information and suffer from high inference latency. Diffusion LLMs (DLLMs) offer a promising alternative with global bidirectional encoding and decoupled generation steps. This work presents the first comprehensive evaluation of DLLMs across the software development lifecycle, including code generation, defect detection, and program repair. On a large-scale benchmark of 52,937 tasks, 7B-parameter DLLMs outperform AR-LLMs with a 30% average accuracy improvement, achieving a 113% gain on cross-file repair, while maintaining superior efficiency and reduced latency. Our results establish DLLMs as a superior paradigm for SE tasks. The code is publicly available at https://anonymous.4open.science/r/all_datasetsB4A6
| Slide (1645_Zhang.pdf) | 1.1MiB |
Fri 17 Apr
Displayed time zone: Brasilia, Distrito Federal, Brazil
16:00 - 17:30 | AI for Software Engineering 26 Research Track / Demonstrations / New Ideas and Emerging Results (NIER) at Asia I Chair(s): Jiakun Liu Harbin Institute of Technology | ||
16:00 15m Talk | AdapTrack: Constrained Decoding without Distorting LLM's Output Intent Research Track Yongmin Li Peking University, Jia Li Tsinghua University, Ge Li Peking University, Zhi Jin Peking University, Wuhan University | ||
16:15 15m Talk | Evaluating Generated Commit Messages with Large Language Models Distinguished Paper Award Research Track Qunhong Zeng Beijing Institute of Technology, Yuxia Zhang Beijing Institute of Technology, Zexiong Ma Peking University, Bo Jiang Bytedance Network Technology, Ningyuan Sun ByteDance, Klaas-Jan Stol Lero; University College Cork; SINTEF Digital, Xingyu Mou Beijing Institute of Technology, Hui Liu Beijing Institute of Technology Pre-print | ||
16:30 15m Talk | Automating Just-In-Time Python Type Annotation Updating Distinguished Paper Award Research Track Zhipeng Xue Zhejiang University, Zhipeng Gao Shanghai Institute for Advanced Study - Zhejiang University, Xing Hu Zhejiang University, Jingyuan Chen Zhejiang University, Xin Xia Zhejiang University, Shanping Li Zhejiang University | ||
16:45 15m Talk | Unveiling the Potential of Diffusion Large Language Models in Software Engineering Tasks: An Empirical Study New Ideas and Emerging Results (NIER) Jingyao Zhang Xi'an Jiaotong-Liverpool University, Li Tianlin NTU, Xiaoyu Zhang Nanyang Technological University, Singapore, Qiang Hu Tianjin University, Bin Shi Xi'an Jiaotong University Media Attached File Attached | ||
17:00 15m Talk | Enhancing LLM Code Generation with Ensembles: A Similarity-Based Selection Approach Research Track Tarek Mahmud Texas State University, Bin Duan University of Queensland, Corina S. Păsăreanu Carnegie Mellon University; NASA Ames, Guowei Yang University of Queensland | ||
17:15 15m Talk | Code4MeV2: a Research-oriented Code-completion Platform Demonstrations Roham Koohestani Delft University of Technology, Parham Bateni Delft University of Technology, Aydin Ebrahimi Delft University of Technology, Behdad Etezadi Delft University of Technology, Kiarash Karimi Delft University of Technology, Mali Izadi TU Delft | ||