FORGE 2024 Keynotes
Keynote 1: Large Language Models for Test Case Repair
Time: Sun 14 April 2024 09:10 - 09:50
Abstract: Ensuring the quality of software systems through testing is essential, yet maintaining test cases poses significant challenges. The need for frequent updates to align with the evolving system under test often entails high complexity and cost for maintaining these test cases. Further, unrepaired broken test cases can degrade test suite quality and disrupt the software development process, wasting developers’ time. In addition, flaky tests are problematic because they non-deterministically pass or fail for the same software version under test, causing confusion and wasting development effort. In this presentation, I will report on recent work using language models to help automate test repair and will reflect on current results, limitations, and future work.
Prof. Lionel C. Briand is a professor of software engineering with shared appointments between (1) the School of Electrical Engineering and Computer Science, University of Ottawa, Canada, and (2) the Lero SFI Centre for Software Research, University of Limerick, Ireland. He is a Canada Research Chair in Intelligent Software Dependability and Compliance (Tier 1) and the director of Lero. He has conducted applied research in collaboration with industry for more than 25 years, including projects in the automotive, aerospace, manufacturing, financial, and energy domains. He is a fellow of the IEEE, ACM, and Royal Society of Canada. He was also granted the IEEE Computer Society Harlan Mills Award (2012), the IEEE Reliability Society Engineer of the Year Award (2013), and the ACM SIGSOFT Outstanding Research Award (2022) for his work on software testing and verification. His research interests include software testing and verification (including security and AI aspects), trustworthy AI, applications of AI in software engineering, model-driven software development, requirements engineering, and empirical software engineering. Further details can be found at www.lbriand.info.
Keynote 2: Towards an Interpretable Science of Deep Learning for Software Engineering: A Causal Inference View
Time: Sun 14 April 2024 14:00 - 14:40
Abstract: Neural Code Models (NCMs) are rapidly progressing from research prototypes to commercial developer tools. As such, understanding the capabilities and limitations of such models is becoming critical. However, the abilities of these models are typically measured using automated metrics that often only reveal a portion of their real-world performance. While, in general, the performance of NCMs appears promising, currently much is unknown about how such models arrive at decisions or whether practitioners trust NCMs’ outcomes. In this talk, I will introduce doCode, a post hoc interpretability framework specific to NCMs that can explain model predictions. doCode is based upon causal inference to enable programming language-oriented explanations. While the theoretical underpinnings of doCode are extensible to exploring different model properties, we provide a concrete instantiation that aims to mitigate the impact of spurious correlations by grounding explanations of model behavior in properties of programming languages. doCode can generate causal explanations based on Abstract Syntax Tree information and software engineering-based interventions. To demonstrate the practical benefit of doCode, I will present empirical results of using doCode for detecting confounding bias in NCMs.
Prof. Denys Poshyvanyk is a Chancellor Professor and Graduate Director in the Computer Science Department at William & Mary. He currently serves as a Guest Editor-in-Chief of the AI-SE Continuous Special Section of the ACM Transactions on Software Engineering and Methodology (TOSEM) and as a Program Co-Chair for FSE'25. He is a recipient of multiple ACM SIGSOFT Distinguished Paper Awards and the NSF CAREER Award (2013). He is an IEEE Fellow and an ACM Distinguished Member. Further details can be found at https://conf.researchr.org/profile/icse-2024/denysposhyvanyk.
Schedule: Sun 14 Apr (displayed time zone: Lisbon)

09:00 - 10:30 | FORGE 2024 Opening / Keynote 1 / Panel, at Luis de Freitas Branco. Chair(s): Xin Xia (Huawei Technologies), Xing Hu (Zhejiang University)

09:00 (10m) Day opening | Introduction from the Chairs
09:10 (40m) Keynote | Keynote 1: Large Language Models for Test Case Repair. Lionel Briand (University of Ottawa, Canada; Lero Centre, University of Limerick, Ireland)
09:50 (40m) Panel | Theme: Is There Space for Software Engineering Researchers to Contribute to AI4SE in the Era of Foundation Models? Panelists: Lionel Briand (University of Ottawa, Canada; Lero Centre, University of Limerick, Ireland), Denys Poshyvanyk (William & Mary), Prem Devanbu (University of California at Davis), Massimiliano Di Penta (University of Sannio, Italy), David Lo (Singapore Management University)
11:00 - 12:30 | Foundation Models for Software Quality Assurance (Research Track), at Luis de Freitas Branco. Chair(s): Matteo Ciniselli (Università della Svizzera italiana)

11:00 (14m) Full paper | Deep Multiple Assertions Generation
11:14 (14m) Full paper | MeTMaP: Metamorphic Testing for Detecting False Vector Matching Problems in LLM Augmented Generation. Guanyu Wang (Beijing University of Posts and Telecommunications), Yuekang Li (The University of New South Wales), Yi Liu (Nanyang Technological University), Gelei Deng (Nanyang Technological University), Li Tianlin (Nanyang Technological University), Guosheng Xu (Beijing University of Posts and Telecommunications), Yang Liu (Nanyang Technological University), Haoyu Wang (Huazhong University of Science and Technology), Kailong Wang (Huazhong University of Science and Technology)
11:28 (14m) Full paper | Planning to Guide LLM for Code Coverage Prediction. Hridya Dhulipala (University of Texas at Dallas), Aashish Yadavally (University of Texas at Dallas), Tien N. Nguyen (University of Texas at Dallas)
11:42 (7m) New idea paper | The Emergence of Large Language Models in Static Analysis: A First Look through Micro-Benchmarks. Ashwin Prasad Shivarpatna Venkatesh (University of Paderborn), Samkutty Sabu (University of Paderborn), Amir Mir (Delft University of Technology), Sofia Reis (Instituto Superior Técnico, U. Lisboa & INESC-ID), Eric Bodden
11:49 (14m) Full paper | Reality Bites: Assessing the Realism of Driving Scenarios with Large Language Models. Jiahui Wu (Simula Research Laboratory and University of Oslo), Chengjie Lu (Simula Research Laboratory and University of Oslo), Aitor Arrieta (Mondragon University), Tao Yue (Beihang University), Shaukat Ali (Simula Research Laboratory and Oslo Metropolitan University)
12:03 (7m) New idea paper | Assessing the Impact of GPT-4 Turbo in Generating Defeaters for Assurance Cases. Kimya Khakzad Shahandashti (York University), Mithila Sivakumar (York University), Mohammad Mahdi Mohajer (York University), Alvine Boaye Belle (York University), Song Wang (York University), Timothy Lethbridge (University of Ottawa)
12:10 (20m) Other | Discussion
16:00 - 17:30 | FORGE 2024 Awards & Foundation Models for Code and Documentation Generation (Research Track), at Luis de Freitas Branco. Chair(s): Antonio Mastropaolo (Università della Svizzera italiana)

16:00 (10m) Awards | Award Ceremony
16:10 (7m) New idea paper | Fine Tuning Large Language Model for Secure Code Generation. Junjie Li (Concordia University), Aseem Sangalay (Delhi Technological University), Cheng Cheng (Concordia University), Yuan Tian (Queen's University, Kingston, Ontario), Jinqiu Yang (Concordia University)
16:17 (14m) Full paper | Investigating the Performance of Language Models for Completing Code in Functional Programming Languages: A Haskell Case Study. Tim van Dam (Delft University of Technology), Frank van der Heijden (Delft University of Technology), Philippe de Bekker (Delft University of Technology), Berend Nieuwschepen (Delft University of Technology), Marc Otten (Delft University of Technology), Maliheh Izadi (Delft University of Technology)
16:31 (7m) New idea paper | On Evaluating the Efficiency of Source Code Generated by LLMs. Changan Niu (Software Institute, Nanjing University), Ting Zhang (Singapore Management University), Chuanyi Li (Nanjing University), Bin Luo (Nanjing University), Vincent Ng (Human Language Technology Research Institute, University of Texas at Dallas)
16:38 (14m) Full paper | PathOCL: Path-Based Prompt Augmentation for OCL Generation with GPT-4. Seif Abukhalaf (Polytechnique Montréal), Mohammad Hamdaqa (Polytechnique Montréal), Foutse Khomh (École Polytechnique de Montréal)
16:52 (7m) New idea paper | Creative and Correct: Requesting Diverse Code Solutions from AI Foundation Models. Scott Blyth (Monash University), Christoph Treude (Singapore Management University), Markus Wagner (Monash University, Australia)
16:59 (7m) New idea paper | Commit Message Generation via ChatGPT: How Far Are We?
17:06 (24m) Other | Discussion