Recent advances in code-specialized language models have transformed software development by assisting developers with tasks such as code generation, debugging, and testing. However, training these models on vast, centralized datasets raises significant privacy concerns: inadvertent memorization of sensitive code can lead to data breaches and intellectual-property risks. This research investigates the underlying patterns of code memorization and explores privacy-preserving techniques, such as Federated Learning with Differential Privacy and targeted noise injection, to mitigate these risks. The goal is to develop robust, secure code generation systems that maintain high performance while safeguarding sensitive information, ultimately fostering greater trust and wider adoption in modern software engineering.
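To make the noise-injection idea concrete, below is a minimal sketch of differentially private gradient noising in the DP-SGD style: per-example gradients are clipped to a norm bound and Gaussian noise is added to their average. The function name, the hyperparameters (clip_norm, noise_multiplier), and the toy gradients are illustrative assumptions, not details from the abstract itself.

import numpy as np

def dp_noised_gradient(per_example_grads, clip_norm=1.0, noise_multiplier=1.1,
                       rng=np.random.default_rng(0)):
    """Clip each example's gradient to clip_norm, average, and add Gaussian noise.

    Illustrative sketch only; hyperparameter values are assumptions.
    """
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose L2 norm exceeds the clipping bound.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    avg = np.mean(clipped, axis=0)
    # Gaussian mechanism: noise on the averaged gradient has scale z * C / B,
    # where z is the noise multiplier, C the clip norm, B the batch size.
    sigma = noise_multiplier * clip_norm / len(per_example_grads)
    return avg + rng.normal(0.0, sigma, size=avg.shape)

# Usage: three toy per-example gradients for a 4-parameter model.
grads = [np.array([0.5, -1.2, 0.3, 2.0]),
         np.array([-0.1, 0.4, 0.9, -0.7]),
         np.array([1.5, 0.2, -0.3, 0.8])]
print(dp_noised_gradient(grads))

In a federated setting, the same clip-then-noise step would typically be applied to client updates before server aggregation, which is what bounds any single client's influence on the shared model.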