Grounding Generative AI in Software Engineering: Are We There Yet?
This program is tentative and subject to change.
Large Language Models (LLMs) have made significant contributions to software engineering, particularly in code generation, demonstrating the ability to produce functionally correct code snippets. The development of these models involves multiple stages, including pre-training on large corpora of source code and alignment with human preferences through various techniques. However, this development process often neglects foundational software engineering (SE) practices and principles. Specifically, LLMs receive limited exposure during training to core SE concepts such as modularity, single responsibility, cohesion, and coupling. As a result, the generated code may lack the properties critical for building maintainable, extensible, and robust software systems.
This vision paper advocates integrating SE knowledge directly into LLMs to enhance their capability to generate code and SE artifacts that adhere to established best practices. We propose a new direction for LLMs to move beyond their current focus on functional accuracy toward producing robust, maintainable software. To assess how well LLMs internalize SE knowledge, we propose adopting Bloom's Taxonomy as a comprehensive assessment framework, offering a structured alternative to limited evaluation methods such as probing. By embedding SE principles, next-generation LLMs can leverage decades of software engineering knowledge and transform software development with reliable, high-quality generative capabilities.
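To make the SE properties named in the abstract concrete, here is a minimal Python sketch (illustrative only, not taken from the paper; all function and field names are hypothetical). It shows the same functionally correct behavior written twice: once as a monolithic function of the kind a model optimized purely for functional accuracy might emit, and once decomposed along the single-responsibility principle into small, cohesive, loosely coupled units.

```python
import json

# Monolithic style: parsing, validation, and formatting are coupled in
# one function. It is functionally correct but has low cohesion and
# mixes several responsibilities, making it harder to test or extend.
def handle_order_monolithic(raw: str) -> str:
    data = json.loads(raw)
    if "id" not in data or data.get("qty", 0) <= 0:
        raise ValueError("invalid order")
    return f"order {data['id']} x{data['qty']}"

# The same behavior decomposed per the single-responsibility principle:
# each unit does one thing and can be tested or replaced independently.
def parse_order(raw: str) -> dict:
    """Parsing only: turn raw JSON into a dict."""
    return json.loads(raw)

def validate_order(order: dict) -> dict:
    """Validation only: enforce the order invariants."""
    if "id" not in order or order.get("qty", 0) <= 0:
        raise ValueError("invalid order")
    return order

def format_order(order: dict) -> str:
    """Presentation only: render the order for display."""
    return f"order {order['id']} x{order['qty']}"

def handle_order(raw: str) -> str:
    return format_order(validate_order(parse_order(raw)))

if __name__ == "__main__":
    sample = '{"id": 7, "qty": 3}'
    # Both versions are functionally equivalent; only their design differs.
    assert handle_order(sample) == handle_order_monolithic(sample)
    print(handle_order(sample))  # order 7 x3
```

Both versions pass the same functional test, which is exactly the paper's point: benchmarks that check only functional correctness cannot distinguish them, while maintainability and extensibility clearly differ.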
Thu 19 Mar (displayed time zone: Athens)

14:00 - 15:30 | Session 5A - Robustness and Reliability of LLM Code Generation | Short Papers and Posters Track / Research Track / Tool Demo Track / Early Research Achievement (ERA) Track | Room: Panorama

14:00 (7m) | Talk | Failure-Aware Enhancements for Large Language Model (LLM) Code Generation: An Empirical study on Decision Framework | Short Papers and Posters Track | Jianru Shen, Zedong Peng, Lucy Owen (all University of Montana)

14:07 (15m) | Talk | Progressively Mitigating API Hallucination in LLM-Generated Code via Knowledge Graph Reasoning | Research Track | Yuxuan Li, Zexiong Ma, Yanzhen Zou, Yue Wang, Lihan Yang, Bing Xie (all Peking University)

14:22 (15m) | Talk | Programming Language Confusion: When Code LLMs Can't Keep their Languages Straight | Research Track | Micheline Bénédicte MOUMOULA, NIKIEMA Beninwende Serge Lionel, Abdoul Kader Kaboré, Jacques Klein, Tegawendé F. Bissyandé (all University of Luxembourg)

14:37 (15m) | Talk | Can LLMs Keep Up with Library Changes? An Exploratory Study on LLM-Generated Code | Research Track | Xiangrong Lin (Zhejiang University), Jiakun Liu (Harbin Institute of Technology), Lingfeng Bao (Zhejiang University)

14:52 (15m) | Talk | Leveraging Enhanced Test-Driven Development for Accurate Code Generation in LLMs | Research Track | Rui Zhang, Weijie Shan, Teng Long, Ce Fu (all School of Artificial Intelligence, China University of Geosciences (Beijing))

15:07 (7m) | Talk | When RAG Lies: Link-Injection Knowledge-Base Poisoning in Code Generation | Short Papers and Posters Track | Nguyen Trung Hieu (Hanoi University of Science and Technology), Trung-Hieu Nguyen (Hanoi University of Science and Technology, Hanoi, Vietnam), Trong-Nghia Be (University of Engineering and Technology), Bao-Huy Hoang (Hanoi University of Science and Technology), Anh M. T. Bui (Hanoi University of Science and Technology)

15:14 (7m) | Talk | Grounding Generative AI in Software Engineering: Are We There Yet? | Early Research Achievement (ERA) Track | Mootez Saad (Dalhousie University), José Antonio Hernández López (Department of Computer Science and Systems, University of Murcia), Boqi Chen (McGill University), Neil Ernst (University of Victoria), Daniel Varro (Linköping University / McGill University), Tushar Sharma (Dalhousie University) | Pre-print

15:21 (7m) | Talk | MutEval: NL-PL Prompt Mutation Framework for Robustness Evaluation of Code LLMs | Tool Demo Track | Pre-print, Media Attached