ISSTA 2025
Wed 25 - Sat 28 June 2025 Trondheim, Norway
co-located with FSE 2025
Thu 26 Jun 2025 16:00 - 16:25 at Cosmos 3C - Code Generation with LLMs Chair(s): Yutian Tang

The GitHub issue resolution task aims to automatically resolve issues reported in repositories. With advances in large language models (LLMs), this task has gained increasing attention, and several benchmarks have been proposed to evaluate the issue resolution ability of LLMs. However, existing benchmarks have three main limitations. First, current benchmarks focus on a single programming language, limiting the evaluation of issues from repositories across different languages. Second, they usually cover a narrow range of domains, which may fail to represent the diversity of real-world issues. Third, existing benchmarks rely solely on textual information in issue descriptions, overlooking multimodal information such as images in issues. In this paper, we propose OmniGIRL, a GitHub Issue ResoLution benchmark that is multilingual, multimodal, and multi-domain. OmniGIRL includes 959 task instances, which are collected from repositories across four programming languages (i.e., Python, JavaScript, TypeScript, and Java) and eight different domains. Our evaluation shows that current LLMs perform poorly on OmniGIRL. Notably, the best-performing model, GPT-4o, resolves only 8.6% of the issues. In addition, we find that current LLMs struggle to resolve issues that require understanding images. The best performance on such issues is achieved by Claude-3.5-Sonnet, which resolves only 10.5% of the issues with image information. Finally, we analyze the reasons behind current LLMs’ failures on OmniGIRL, providing insights for future improvements.

Thu 26 Jun

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

16:00 - 17:15
Code Generation with LLMs
Research Papers at Cosmos 3C
Chair(s): Yutian Tang University of Glasgow, United Kingdom
16:00
25m
Talk
OmniGIRL: A Multilingual and Multimodal Benchmark for GitHub Issue Resolution
Research Papers
Lianghong Guo Sun Yat-sen University, Wei Tao Independent Researcher, Runhan Jiang Sun Yat-sen University, Yanlin Wang Sun Yat-sen University, Jiachi Chen Sun Yat-sen University, Xilin Liu Huawei Cloud, Yuchi Ma Huawei Cloud Computing Technologies, Mingzhi Mao Sun Yat-sen University, Hongyu Zhang Chongqing University, Zibin Zheng Sun Yat-sen University
16:25
25m
Talk
ConTested: Consistency-Aided Tested Code Generation with LLM
Research Papers
Jinhao Dong Peking University, Jun Sun Singapore Management University, Wenjie Zhang National University of Singapore, Jin Song Dong National University of Singapore, Dan Hao Peking University
16:50
25m
Talk
Causality-Aided Evaluation and Explanation of Large Language Model-based Code Generation
Research Papers
Zhenlan Ji The Hong Kong University of Science and Technology, Pingchuan Ma HKUST, Li Zongjie Hong Kong University of Science and Technology, Zhaoyu Wang HKUST, Shuai Wang Hong Kong University of Science and Technology

Information for Participants
Thu 26 Jun 2025 16:00 - 17:15 at Cosmos 3C - Code Generation with LLMs Chair(s): Yutian Tang
Info for room Cosmos 3C:

Cosmos 3C is the third room in the Cosmos 3 wing.

When facing the main Cosmos Hall, access to the Cosmos 3 wing is on the left, close to the stairs. The area is accessed through a large door with the number “3”, which will stay open during the event.
