FSE 2025
Mon 23 - Fri 27 June 2025 Trondheim, Norway
co-located with ISSTA 2025
Mon 23 Jun 2025 12:10 - 12:30 at Vega - Performance Chair(s): Philipp Leitner

Code generation has greatly improved development efficiency in the era of large language models (LLMs). Because they can follow instructions, current LLMs can be prompted to generate code solutions from detailed natural-language descriptions. Much research effort is devoted to improving the correctness of LLM-generated code, and many benchmarks have been proposed to evaluate correctness comprehensively. Despite this focus on correctness, the time efficiency of LLM-generated code solutions remains under-explored. Current correctness benchmarks are not suitable for evaluating time efficiency because their test cases cannot adequately distinguish the time efficiency of different code solutions. In addition, current execution time measurements are neither stable nor comprehensive, which threatens the validity of time efficiency evaluation.

To address these challenges, we propose COFFE, a code generation benchmark for evaluating the time efficiency of LLM-generated code solutions. COFFE contains 398 and 358 problems for function-level and file-level code generation, respectively. To improve distinguishability, we design a novel stressful test case generation approach with contracts, together with two new test case formats that improve the accuracy of generation. As the time evaluation metric, we propose efficient@k, based on CPU instruction count, to ensure a stable and solid comparison between different solutions. We evaluate 14 popular LLMs on COFFE and identify four findings. Based on these findings, we draw implications for LLM researchers and software practitioners to facilitate future research on and usage of LLMs in code generation.
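The abstract names efficient@k but does not define it. As a rough illustration only, the sketch below assumes the metric has the same shape as the widely used unbiased pass@k estimator, with a sample counted as a success when it is correct and executes fewer CPU instructions than a reference solution. That success criterion, the function names `at_k_estimate` and `count_efficient`, and the sample data are all assumptions made for this example, not the paper's definition.

```python
from math import comb


def at_k_estimate(n: int, c: int, k: int) -> float:
    """Unbiased estimate of P(at least one of k drawn samples succeeds),
    given that c of the n generated samples succeeded (pass@k-style form)."""
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every draw of k contains a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def count_efficient(samples, reference_instructions: int) -> int:
    """Count samples that are correct AND use fewer CPU instructions than
    the reference. ASSUMPTION: this success criterion is hypothetical.

    samples: iterable of (is_correct, instruction_count) pairs.
    """
    return sum(
        1
        for is_correct, instructions in samples
        if is_correct and instructions < reference_instructions
    )


if __name__ == "__main__":
    # Hypothetical measurements for one problem: (correct?, instruction count).
    samples = [(True, 900), (True, 1200), (False, 500), (True, 999)]
    c = count_efficient(samples, reference_instructions=1000)
    print(c, at_k_estimate(n=len(samples), c=c, k=2))
```

Counting instructions rather than wall-clock time is what the abstract credits for stability; the estimator itself is agnostic to how each sample's cost is measured.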

Mon 23 Jun

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

10:30 - 12:30
Performance: Demonstrations / Research Papers / Ideas, Visions and Reflections / Journal First / Industry Papers at Vega
Chair(s): Philipp Leitner Chalmers | University of Gothenburg
10:30
20m
Talk
Accuracy Can Lie: On the Impact of Surrogate Model in Configuration Tuning
Journal First
Pengzhou Chen University of Electronic Science and Technology of China, Jingzhi Gong University of Leeds, Tao Chen University of Birmingham
10:50
20m
Talk
Understanding Debugging as Episodes: A Case Study on Performance Bugs in Configurable Software Systems
Research Papers
Max Weber Leipzig University, Alina Mailach Leipzig University, Sven Apel Saarland University, Janet Siegmund Chemnitz University of Technology, Raimund Dachselt Technical University of Dresden, Norbert Siegmund Leipzig University
11:10
20m
Talk
Towards Understanding Performance Bugs in Popular Data Science Libraries
Research Papers
Haowen Yang The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), Zhengda Li The Chinese University of Hong Kong, Shenzhen, Zhiqing Zhong The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), Xiaoying Tang Chinese University of Hong Kong, Shenzhen, Pinjia He Chinese University of Hong Kong, Shenzhen
11:30
20m
Talk
When Should I Run My Application Benchmark?: Studying Cloud Performance Variability for the Case of Stream Processing Applications
Industry Papers
Sören Henning Dynatrace Research, Adriano Vogel, Esteban Pérez Wohlfeil Dynatrace Research, Otmar Ertl Dynatrace Research, Rick Rabiser LIT CPS, Johannes Kepler University Linz
11:50
10m
Talk
LitmusKt: Concurrency Stress Testing for Kotlin
Demonstrations
Denis Lochmelis Constructor University Bremen, JetBrains Research, Evgenii Moiseenko JetBrains Research, Yaroslav Golubev JetBrains Research, Anton Podkopaev JetBrains Research, Constructor University
12:00
10m
Talk
Breaking the Loop: AWARE is the New MAPE-K
Ideas, Visions and Reflections
Brell SANWOUO Univ. Lille / INRIA, Clément Quinton University of Lille, Paul Temple IRISA
12:10
20m
Talk
COFFE: A Code Efficiency Benchmark for Code Generation
Research Papers
Yun Peng The Chinese University of Hong Kong, Jun Wan Zhejiang University, Yichen LI The Chinese University of Hong Kong, Xiaoxue Ren Zhejiang University

Information for Participants
Mon 23 Jun 2025 10:30 - 12:30 at Vega - Performance Chair(s): Philipp Leitner
Info for room Vega:

Vega is close to the registration desk.

Facing the registration desk, its entrance is on the left, close to the hotel side entrance.
