FORGE 2024
Sun 14 Apr 2024 Lisbon, Portugal
co-located with ICSE 2024

The rapid progress of AI-powered programming assistants, such as GitHub Copilot, has facilitated the development of software applications. These assistants rely on large language models (LLMs), which are foundation models (FMs) that support a wide range of tasks related to understanding and generating language. LLMs have demonstrated their ability to express UML model specifications using formal languages like the Object Constraint Language (OCL). However, the context size of the prompt is limited by the number of tokens an LLM can process. This limitation becomes significant as the size of UML class models increases. In this study, we introduce PathOCL, a novel path-based prompt augmentation technique designed to facilitate OCL generation. PathOCL addresses the limitations of LLMs, specifically their token processing limit and the challenges posed by large UML class models. PathOCL is based on the concept of chunking, which selectively augments the prompts with a subset of UML classes relevant to the English specification. Our findings demonstrate that PathOCL, compared to augmenting the complete UML class model (UML-Augmentation), generates a higher number of valid and correct OCL constraints using the GPT-4 model. Moreover, the average prompt size crafted using PathOCL significantly decreases when scaling the size of the UML class models.

Sun 14 Apr

Displayed time zone: Lisbon

16:00 - 17:30
FORGE2024 Awards & Foundation Models for Code and Documentation Generation (Research Track) at Luis de Freitas Branco
Chair(s): Antonio Mastropaolo (Università della Svizzera italiana)
16:00
10m
Awards
Award Ceremony
Research Track

16:10
7m
Short-paper
Fine Tuning Large Language Model for Secure Code Generation (New Idea Paper)
Research Track
Junjie Li (Concordia University); Aseem Sangalay (Delhi Technological University); Cheng Cheng (Concordia University); Yuan Tian (Queen's University, Kingston, Ontario); Jinqiu Yang (Concordia University)
16:17
14m
Full-paper
Investigating the Performance of Language Models for Completing Code in Functional Programming Languages: a Haskell Case Study (Full Paper)
Research Track
Tim van Dam (Delft University of Technology); Frank van der Heijden (Delft University of Technology); Philippe de Bekker (Delft University of Technology); Berend Nieuwschepen (Delft University of Technology); Marc Otten (Delft University of Technology); Maliheh Izadi (Delft University of Technology)
16:31
7m
Short-paper
On Evaluating the Efficiency of Source Code Generated by LLMs (New Idea Paper)
Research Track
Changan Niu (Software Institute, Nanjing University); Ting Zhang (Singapore Management University); Chuanyi Li (Nanjing University); Bin Luo (Nanjing University); Vincent Ng (Human Language Technology Research Institute, University of Texas at Dallas, Richardson, TX 75083-0688)
16:38
14m
Full-paper
PathOCL: Path-Based Prompt Augmentation for OCL Generation with GPT-4 (Full Paper)
Research Track
Seif Abukhalaf (Polytechnique Montreal); Mohammad Hamdaqa (Polytechnique Montréal); Foutse Khomh (École Polytechnique de Montréal)
16:52
7m
Short-paper
Creative and Correct: Requesting Diverse Code Solutions from AI Foundation Models (New Idea Paper)
Research Track
Scott Blyth (Monash University); Christoph Treude (Singapore Management University); Markus Wagner (Monash University, Australia)
16:59
7m
Short-paper
Commit Message Generation via ChatGPT: How Far Are We? (New Idea Paper)
Research Track
Yifan Wu (Peking University); Ying Li (School of Software and Microelectronics, Peking University, Beijing, China); Siyu Yu (The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen))
17:06
24m
Other
Discussion
Research Track