Predictive Prompt Analysis (FSE 2025 - Ideas, Visions and Reflections)

Mon 23 - Fri 27 June 2025 Trondheim, Norway

Who

Jae Yong Lee, Sungmin Kang, Shin Yoo

Track

FSE 2025 Ideas, Visions and Reflections

When

Wed 25 Jun 2025 11:40 - 11:50 at Cosmos 3B - LLM and Prompt Chair(s): Giuseppe Scanniello

Abstract

Large Language Models (LLMs) are machine learning models that have seen widespread adoption due to their capability of handling previously difficult tasks. LLMs, due to their training, are sensitive to how exactly a question is presented, also known as prompting. However, prompting well is challenging, as it has been difficult to uncover principles behind prompting – generally, trial-and-error is the most common way of improving prompts, despite its significant computational cost. In this context, we argue it would be useful to perform 'predictive prompt analysis', in which an automated technique would perform a quick analysis of a prompt and predict how the LLM would react to it, relative to a goal provided by the user. As a demonstration of the concept, we present Syntactic Prevalence Analyzer (SPA), a predictive prompt analysis approach based on sparse autoencoders (SAEs). SPA accurately predicted how often an LLM would generate target syntactic structures during code synthesis, with up to 0.994 Pearson correlation between the predicted and actual prevalence of the target structure. At the same time, SPA requires only 0.4% of the time it takes to run the LLM on a benchmark. As LLMs are increasingly used during and integrated into modern software development, our proposed predictive prompt analysis concept has the potential to significantly ease the use of LLMs for both practitioners and researchers.
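To make the reported metric concrete, the following is a minimal sketch of how a Pearson correlation between predicted and actual prevalence could be computed. It is not the authors' implementation: the `pearson` helper and the `predicted` and `actual` values are hypothetical and chosen only for illustration, and the abstract does not specify how prevalence is measured.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance numerator) and the two
    # sums of squared deviations (variance numerators).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data (NOT from the paper): for five prompts, the predicted
# and the observed fraction of generated programs that contain some
# target syntactic structure.
predicted = [0.10, 0.25, 0.40, 0.70, 0.90]
actual = [0.12, 0.22, 0.45, 0.66, 0.93]

r = pearson(predicted, actual)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 means the predictions rank and scale prompts in nearly the same way as the observed prevalence, which is the sense in which the abstract reports a correlation of up to 0.994.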

Jae Yong Lee

Sungmin Kang

NUS

South Korea

Shin Yoo

KAIST

South Korea

Session Program

Wed 25 Jun
Displayed time zone: (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

11:00 - 12:30	LLM and Prompt (Industry Papers / Ideas, Visions and Reflections / Research Papers / Journal First) at Cosmos 3B. Chair(s): Giuseppe Scanniello, University of Salerno

11:00 20m Talk		On Inter-dataset Code Duplication and Data Leakage in Large Language Models Journal First José Antonio Hernández López Linköping University, Boqi Chen McGill University, Mootez Saad Dalhousie University, Tushar Sharma Dalhousie University, Daniel Varro Linköping University / McGill University
11:20 20m Talk		LLM App Squatting and Cloning Industry Papers Yinglin Xie Huazhong University of Science and Technology, Xinyi Hou Huazhong University of Science and Technology, Yanjie Zhao Huazhong University of Science and Technology, Kai Chen Huazhong University of Science and Technology, Haoyu Wang Huazhong University of Science and Technology
11:40 10m Talk		Predictive Prompt Analysis Ideas, Visions and Reflections Jae Yong Lee, Sungmin Kang NUS, Shin Yoo KAIST
11:50 20m Talk		From Prompts to Templates: A Systematic Prompt Template Analysis for Real-world LLMapps Industry Papers Yuetian Mao Technical University of Munich, Junjie He Technical University of Munich, Chunyang Chen TU Munich
12:10 20m Talk		Prompts Are Programs Too! Understanding How Developers Build Software Containing Prompts Research Papers Jenny T. Liang Carnegie Mellon University, Melissa Lin Carnegie Mellon University, Nikitha Rao Carnegie Mellon University, Brad A. Myers Carnegie Mellon University

Information for Participants

Wed 25 Jun 2025 11:00 - 12:30 at Cosmos 3B - LLM and Prompt Chair(s): Giuseppe Scanniello

Info for room Cosmos 3B:

Cosmos 3B is the second room in the Cosmos 3 wing.

When facing the main Cosmos Hall, access to the Cosmos 3 wing is on the left, close to the stairs. The area is accessed through a large door with the number “3”, which will stay open during the event.