ICSE 2026
Sun 12 - Sat 18 April 2026 Rio de Janeiro, Brazil

Performance bugs are inefficiencies in software that waste computational resources without causing functional failures, making them particularly challenging to detect and fix. While recent advances in Software Engineering agents have shown promise in automated bug fixing, existing benchmarks primarily focus on functional correctness and fail to evaluate agents’ abilities to identify and resolve non-functional issues like performance bugs. We introduce PerfBench, a benchmark comprising 81 real-world performance bug-fixing tasks from popular .NET repositories on GitHub. Unlike existing benchmarks that rely on pre-existing test suites, PerfBench features a novel evaluation harness that allows agents to generate their own performance benchmarks and validates fixes by comparing execution metrics collected for the developer fix and the agent fix. Each task in PerfBench is derived from actual developer fixes linked to performance-related issues, which are then verified by human experts, ensuring real-world relevance. Our evaluation reveals that current state-of-the-art coding agents struggle with performance optimization tasks, with the baseline OpenHands agent achieving only a ~3% success rate on our benchmark. We develop OpenHands-Perf-Agent, which incorporates performance-aware tooling and instructions and achieves a ~20% success rate on the benchmark. We show that giving the agent proper instructions to benchmark its changes and tooling to process benchmark output significantly improves its performance, though room for improvement remains. PerfBench provides a challenging test set for furthering the capabilities of agents in fixing performance issues.

Tue 14 Apr

Displayed time zone: Brasilia, Distrito Federal, Brazil

14:00 - 15:30
Evaluation, Reliability, and Engineering Practice
AGENT at Oceania VIII
Chair(s): Oshani Weerakoon Department of Computing, University of Turku
14:00
30m
Keynote
Keynote: On the Evaluation of AI Coding Agents
AGENT
K: Chao Peng ByteDance
14:30
6m
Talk
Beyond Task Completion: An Assessment Framework for Evaluating Agentic AI Systems
AGENT
Sreemaee Akshathala IIIT Hyderabad, Bassam Adnan IIIT Hyderabad, Mahisha Ramesh IIIT Hyderabad, Karthik Vaidhyanathan IIIT Hyderabad, Basil Muhammed MontyCloud, Kannan Parthasarathy MontyCloud
14:36
6m
Talk
PerfBench: Can Agents Resolve Real-World Performance Bugs? (Virtual Attendance)
AGENT
Spandan Garg Microsoft Corporation, Roshanak Zilouchian Moghaddam Microsoft, Neel Sundaresan Microsoft
14:42
6m
Talk
SWEnergy: An Empirical Study on Energy Efficiency in Agentic Issue Resolution Frameworks with SLMs
AGENT
Arihant Tripathy IIIT Hyderabad, India, Ch Pavan Harshit IIIT Hyderabad, India, Karthik Vaidhyanathan IIIT Hyderabad
14:48
6m
Talk
Context Matters: Evaluating MCP-Based Context-Aware AI — A Case Study of Email Communications in Nonprofit Organizations
AGENT
Nitin Gupta University of Victoria, Jayani Samaraweera University of Victoria, Raaj Chatterjee Meaningful Technology Inc., Riya Shrestha University of Victoria, Trinity West University of Victoria, Dana Damian University of Victoria
14:54
6m
Talk
Toward Agentic Software Project Management: A Vision and Roadmap
AGENT
Lakshana Assalaarachchi Monash University, Australia, Zainab Masood Prince Sultan University, Rashina Hoda Monash University, John Grundy Monash University
15:00
6m
Talk
Not All Problems Are Nails, Not All Tools Should Be Hammers: A Position Paper on Agent Usage in Software Engineering Tasks
AGENT
Juuso Rytilahti Department of Computing, University of Turku, Panu Puhtila University of Turku, Oshani Weerakoon Department of Computing, University of Turku, Erkki Kaila Department of Computing, University of Turku, Tuomas Mäkilä University of Turku
15:06
24m
Live Q&A
Session 3 Joint Q&A
AGENT