ISSTA 2025
Wed 25 - Sat 28 June 2025 Trondheim, Norway
co-located with FSE 2025
Wed 25 Jun 2025 14:25 - 14:50 at Aurora A - Evolution, Continuous Integration, and Deployment Chair(s): Laura Plein

Large language models (LLMs) have demonstrated remarkable performance in code generation, significantly enhancing the coding efficiency of developers. Recent advancements in LLM-based agents have led to notable progress in end-to-end automatic software engineering (ASE), particularly in software maintenance (e.g., fixing software issues) and evolution (e.g., adding new features). Despite these encouraging advances, current research faces two major challenges. First, state-of-the-art performance primarily depends on closed-source models like GPT-4, which significantly limits the technology's accessibility and potential for customization across diverse software engineering tasks. This dependence also raises concerns about data privacy, particularly when handling sensitive codebases. Second, these models are predominantly trained on static code data, lacking a deep understanding of the dynamic interactions, iterative problem-solving processes, and evolutionary characteristics inherent in software development. Consequently, they may struggle to navigate complex project structures and generate contextually relevant solutions, which limits their practical utility in real-world scenarios.

To address these challenges, our study adopts a software engineering perspective. We recognize that real-world software maintenance and evolution processes encompass not only static code data but also developers' thought processes, their use of external tools, and the interactions among different functional roles. Our objective is to develop an open-source large language model specifically optimized for software improvement, aiming to match the performance of closed-source alternatives while offering greater accessibility and customization potential. Consequently, we introduce the SWE-GPT series, comprising SWE-GPT 7B and SWE-GPT 72B. By learning from and simulating real-world code submission activities, SWE-GPT systematically incorporates the dynamic interactions and iterative problem solving inherent in the software development process, such as repository understanding, fault localization, and patch generation, thereby achieving a more comprehensive understanding of software improvement processes. We conducted experimental evaluations using the SWE-bench Verified benchmark (comprising 500 real GitHub issues), recently proposed by OpenAI. The results demonstrate that SWE-GPT 72B successfully resolves 30.20% of the GitHub issues, marking a significant improvement in automatic issue resolution (a 22.76% relative improvement over Llama 3.1 405B) and approaching the performance of closed-source models (GPT-4o resolves 31.80%). Notably, SWE-GPT 7B resolves 18.20% of the issues, surpassing the 17.20% resolution rate of Llama 3.1 70B, highlighting the potential of applying smaller models to ASE tasks.
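The three-stage workflow the abstract names (repository understanding, fault localization, patch generation) can be sketched as a minimal pipeline. This is a hypothetical illustration only: the function names, the keyword-overlap ranking, the TODO-marker heuristic, and the toy in-memory repository are all assumptions for demonstration, not SWE-GPT's actual prompts or interfaces.

```python
# Hypothetical sketch of the three-stage issue-resolution workflow described
# in the abstract. Every name and heuristic here is illustrative.

def understand_repository(repo: dict[str, str], issue: str) -> list[str]:
    """Stage 1: rank files likely relevant to the issue (toy keyword overlap)."""
    keywords = set(issue.lower().split())
    scored = [(sum(w in src.lower() for w in keywords), path)
              for path, src in repo.items()]
    return [path for score, path in sorted(scored, reverse=True) if score > 0]

def localize_fault(repo: dict[str, str], candidates: list[str]) -> tuple[str, int]:
    """Stage 2: pick a file and line to edit (toy heuristic: first candidate,
    first line containing a TODO marker, else line 0)."""
    path = candidates[0]
    for i, line in enumerate(repo[path].splitlines()):
        if "TODO" in line:
            return path, i
    return path, 0

def generate_patch(repo: dict[str, str], path: str, line_no: int, fix: str) -> str:
    """Stage 3: emit a minimal unified-diff-style patch replacing one line."""
    old = repo[path].splitlines()[line_no]
    return f"--- {path}\n+++ {path}\n-{old}\n+{fix}"

# Toy repository with an injected bug marker for the sketch to find.
repo = {
    "calc.py": "def add(a, b):\n    return a - b  # TODO: wrong operator\n",
    "io.py": "def read(p):\n    return open(p).read()\n",
}
issue = "add() returns the wrong result: operator bug in calc"
candidates = understand_repository(repo, issue)
path, line_no = localize_fault(repo, candidates)
patch = generate_patch(repo, path, line_no, "    return a + b")
print(patch)
```

In the real system each stage would be driven by the model's learned reasoning over code submission histories rather than these string heuristics; the sketch only fixes the stage boundaries and data flow.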

Wed 25 Jun

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

14:00 - 15:15
Evolution, Continuous Integration, and Deployment (Research Papers) at Aurora A
Chair(s): Laura Plein CISPA Helmholtz Center for Information Security
14:00
25m
Talk
Productively Deploying Emerging Models on Emerging Platforms: A Top-Down Approach for Testing and Debugging
Research Papers
Siyuan Feng Shanghai Jiao Tong University, Jiawei Liu University of Illinois at Urbana-Champaign, Ruihang Lai Carnegie Mellon University, Charlie F. Ruan Carnegie Mellon University, Yong Yu Shanghai Jiao Tong University, Lingming Zhang University of Illinois at Urbana-Champaign, Tianqi Chen
DOI
14:25
25m
Talk
SWE-GPT: A Process-Centric Language Model for Automated Software Improvement
Research Papers
Yingwei Ma Alibaba Group, Rongyu Cao Tongyi Lab, Alibaba, China, Yongchang Cao Tongyi Lab, Alibaba, China, Yue Zhang Tongyi Lab, Alibaba, China, Jue Chen Tongyi Lab, Alibaba, China, Yibo Liu Tongyi Lab, Alibaba, China, Yuchen Liu Tongyi Lab, Alibaba, China, Binhua Li Tongyi Lab, Alibaba, China, Fei Huang Tongyi Lab, Alibaba, China, Yongbin Li Tongyi Lab, Alibaba, China
DOI
14:50
25m
Talk
What Happened in This Pipeline? Diffing Build Logs With CiDiff
Research Papers
Nicolas Hubner University of Bordeaux, LaBRI, UMR 5800, F-33400, Talence, France, Jean-Rémy Falleri Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, Institut Universitaire de France, Raluca Uricaru Univ. Bordeaux, Bordeaux INP, CNRS, LaBRI, UMR5800, F-33400 Talence, France, Thomas Degueule CNRS, Thomas Durieux TU Delft
DOI

Information for Participants
Wed 25 Jun 2025 14:00 - 15:15 at Aurora A - Evolution, Continuous Integration, and Deployment Chair(s): Laura Plein
Info for room Aurora A:

Aurora A is the first room in the Aurora wing.

When facing the main Cosmos Hall, access to the Aurora wing is on the right, close to the side entrance of the hotel.
