Integrating Large Language Models and Reinforcement Learning for Non-Linear Reasoning
Large Language Models (LLMs) have been shown to struggle with long-term planning, which may be caused by the limited way in which they explore the space of possible solutions. We propose an architecture in which a Reinforcement Learning (RL) agent guides an LLM's exploration of that space: (1) the agent has access to domain-specific information and can therefore judge the quality of candidate solutions using specific, relevant metrics that were not explicitly part of the LLM's training objective; (2) the LLM can focus on generating immediate next steps, without the need for long-term planning. We enable non-linear reasoning by exploring alternative paths and backtracking. We evaluate this architecture on the program equivalence task and compare it against Chain of Thought (CoT) and Tree of Thoughts (ToT). We assess both the downstream task (binary classification) and the intermediate reasoning steps. Our approach compares favorably against CoT and ToT.
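The abstract's architecture can be illustrated with a minimal sketch: an LLM proposes immediate next steps, an agent scores them with a domain-specific metric, and the search explores alternatives and backtracks away from low-scoring paths. All names here (`propose_steps`, `score_step`, `solve`) and the toy string-building task are illustrative assumptions, not details from the paper.

```python
def propose_steps(state):
    """Stand-in for the LLM: propose immediate next steps only,
    with no long-term plan. Here: extend a partial string."""
    return [state + token for token in ("a", "b", "c")]

def score_step(state, goal):
    """Stand-in for the agent's domain-specific metric: reward
    prefixes of the goal. The real system would use task-relevant
    metrics (e.g. for program equivalence) unseen by the LLM's
    training objective."""
    return 1.0 if goal.startswith(state) else 0.0

def solve(goal, max_depth=5):
    """Depth-first exploration with backtracking: the agent prunes
    zero-scoring candidates, so the search falls back to earlier
    alternatives instead of committing to one linear chain."""
    stack = [""]
    while stack:
        state = stack.pop()
        if state == goal:
            return state
        if len(state) >= max_depth:
            continue  # abandon this path; backtrack via the stack
        # Agent ranks the LLM's candidates; low-scoring ones are pruned.
        candidates = [s for s in propose_steps(state)
                      if score_step(s, goal) > 0]
        candidates.sort(key=lambda s: score_step(s, goal))
        stack.extend(candidates)
    return None
```

The stack of unexplored candidates is what makes the reasoning non-linear: when a path dead-ends, the next pop resumes from a previously deferred alternative.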
Tue 24 Jun — Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
14:00 - 15:30 | LLM for SE 2 — Research Papers / Industry Papers / Ideas, Visions and Reflections — Cosmos Hall
Chair(s): Jialun Cao (Hong Kong University of Science and Technology)

14:00 (20m, Talk) Migrating Code At Scale With LLMs At Google — Industry Papers
Celal Ziftci (Google), Stoyan Nikolov (Google, Inc.), Anna Sjovall (Google, Inc.), Bo Kim (Google), Daniele Codecasa (Google, Inc.), Max Kim (Google)

14:20 (20m, Talk) Integrating Large Language Models and Reinforcement Learning for Non-Linear Reasoning — Research Papers — DOI

14:40 (20m, Talk) Smaller but Better: Self-Paced Knowledge Distillation for Lightweight yet Effective LCMs — Research Papers — DOI
Yujia Chen (Harbin Institute of Technology, Shenzhen), Yang Ye (Huawei Cloud Computing Technologies Co., Ltd.), Zhongqi Li (Huawei Cloud Computing Technologies Co., Ltd.), Yuchi Ma (Huawei Cloud Computing Technologies), Cuiyun Gao (Harbin Institute of Technology, Shenzhen)

15:00 (10m, Talk) Enabling Scalable Proactive Workspaces With Environment-Wide Context — Ideas, Visions and Reflections
Nick Bradley (University of British Columbia), Thomas Fritz (University of Zurich), Reid Holmes (University of British Columbia)

15:10 (20m, Talk) Bridging Operator Semantic Inconsistencies: A Source-level Cross-framework Model Conversion Approach — Research Papers — DOI
Xingpei Li (National University of Defense Technology, China), Yan Lei (Chongqing University), Zhouyang Jia (National University of Defense Technology), Yuanliang Zhang (National University of Defense Technology), Haoran Liu (National University of Defense Technology), Liqian Chen (National University of Defense Technology), Wei Dong (National University of Defense Technology), Shanshan Li (National University of Defense Technology)
This is the main event hall of the Clarion Hotel, used to host keynote talks and other plenary sessions. The FSE and ISSTA banquets will also take place in this room.
The room is just in front of the registration desk, on the other side of the main conference area. The large doors numbered “1” and “2” provide access to the Cosmos Hall.