Backdoors in Code Summarizers: How Bad Is It? (ASE 2025 - Research Papers)

Who

Chenyu Wang, Zhou Yang, Yaniv Harel, David Lo

Track

ASE 2025 Research Papers

This program is tentative and subject to change.

Time Zone

The program is currently displayed in (GMT+09:00) Seoul.

Use conference time zone: (GMT+09:00) SeoulSelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Wed 19 Nov 2025 15:10 - 15:20 at Grand Hall 5 - Security 4

Abstract

Large Language Models for Code (Code LLMs) are increasingly employed in software development. However, studies have recently shown that these models are vulnerable to backdoor attacks: when a trigger (a specific input pattern) appears in the input, the backdoor will be activated and cause the model to generate malicious outputs desired by the attacker. Researchers have designed various triggers and demonstrated the feasibility of implanting backdoors by poisoning a fraction of the training data (known as data poisoning). Some basic conclusions have been made, such as backdoors becoming easier to implant when attackers modify more training data. However, existing research has not explored other factors influencing backdoor attacks on Code LLMs, such as training batch size, epoch number, and the broader design space for triggers, e.g., trigger length. To bridge this gap, we use the code summarization task as an example to perform a comprehensive empirical study that systematically investigates the factors affecting backdoor effectiveness and understands the extent of the threat posed by backdoor attacks on Code LLMs. Three categories of factors are considered: data, model, and inference, revealing findings overlooked in previous studies for practitioners to mitigate backdoor threats. For example, Code LLM developers can adopt higher batch sizes with fewer epochs appropriately. Users of code models can adjust inference parameters, such as using a higher temperature or a larger top-k, appropriately. Future backdoor defense can prioritize the inspection of rarer and longer tokens, since they are more effective if they are indeed triggers. Since these non-backdoor design factors can also greatly sway attack performance, future backdoor studies should fully report settings, control key factors, and systematically vary them across configurations. What’s more, we find that the prevailing consensus, that attacks are ineffective at extremely low poisoning rates, is incorrect. The absolute number of poisoned samples matters as well. Specifically, poisoning just 20 out of 454,451 samples (0.004% poisoning rate, far below the minimum setting of 0.1% considered in prior Code LLM backdoor attack studies) successfully implants backdoors! Moreover, the common defense is incapable of removing even a single poisoned sample from this poisoned dataset, highlighting the urgent need for defense mechanisms against extremely low poisoning rate settings.

Link to Preprint

https://arxiv.org/abs/2506.01825

Chenyu Wang

Singapore Management University

Singapore

Zhou Yang

University of Alberta, Alberta Machine Intelligence Institute

Canada

Yaniv Harel

Tel Aviv University

David Lo