ESEIW 2024
Sun 20 - Fri 25 October 2024, Barcelona, Spain

The Method Behind the Magic: Ensuring Reliability in Software Engineering Empirical Results

Sira Vegas

Sira Vegas is an Associate Professor at the Universidad Politécnica de Madrid. She was General Co-Chair of EASE'23, Program Co-Chair of ESEM'07, and a member of the ESEM Steering Committee (2006-08). Sira has been Steering Committee Chair of the International Software Engineering Research Network (ISERN) since 2023. She has also served as Program Co-Chair of ICSE-DS'21 and of the RENE tracks of ICPC'24 and SANER'24, and as Program Chair of the Journal-First track of PROFES'23, and she participated in the organization of CSEE&T'03. She has been a PC member of several ICSE tracks (Research, NIER, SEET, SEIP, DS, Demos, Artifacts, and SRC) and of other conferences such as ESEM, ASE, and MSR. Sira is a regular reviewer for IEEE Transactions on Software Engineering, where she served on the review board (2019-2024), as well as for the Empirical Software Engineering journal and ACM Transactions on Software Engineering and Methodology. She earned her B.S. and Ph.D. degrees in Computing from the Universidad Politécnica de Madrid.

Abstract

Empirical software engineering research plays a critical role in advancing the field, but the reliability of its findings depends heavily on methodological rigor. In this keynote, I will discuss the importance of addressing key aspects of validity (internal, construct, and statistical conclusion validity) when conducting empirical studies. Using two prevalent research areas, Mining Software Repositories (MSR) and Deep Learning (DL) algorithms, I will illustrate common methodological challenges. Through these examples, I will explore strategies and considerations for improving study design and analysis, with the aim of enhancing the reliability and scientific contribution of empirical software engineering studies and ultimately driving more robust insights and advancements in the field.

Charting New Frontiers: Exploring Limits, Threats, and Ecosystems of LLMs in Software Engineering

David Lo

David Lo is the OUB Chair Professor of Computer Science and Director of the Center for Research in Intelligent Software Engineering (RISE) at Singapore Management University. Championing the area of AI for Software Engineering (AI4SE) since the mid-2000s, he has demonstrated how AI, encompassing data mining, machine learning, information retrieval, natural language processing, and search-based algorithms, can transform software engineering data into actionable insights and automation. Through empirical studies, he has also identified practitioners' pain points, characterized the limitations of AI4SE solutions, and explored practitioners' acceptance thresholds for AI-powered tools. His contributions have led to over 20 awards, including two Test-of-Time awards and eleven ACM SIGSOFT/IEEE TCSE Distinguished Paper awards, and his work has garnered over 35,000 citations. An ACM Fellow, IEEE Fellow, ASE Fellow, and National Research Foundation Investigator (Senior Fellow), Lo has also served as PC Co-Chair for ASE'20, FSE'24, and ICSE'25. For more information, visit: http://www.mysmu.edu/faculty/davidlo/

Abstract

Large language models (LLMs) are transforming software engineering, but their adoption brings critical challenges. In this keynote, I will explore three key thrusts: the *limits* of LLMs, including their struggles with long-tailed data distributions and concerns about the quality of their generated outputs; the *threats* they pose, such as weaknesses in robustness, vulnerability to backdoor attacks, and memorization of sensitive information; and the *emerging ecosystems* surrounding their reuse, licensing, and documentation practices. Empirical research plays a pivotal role in uncovering these challenges and guiding the responsible development of LLMs in software engineering. By addressing these issues, we can chart a path forward for future research and innovation in this rapidly evolving field.