ICSE 2026
Sun 12 - Sat 18 April 2026 Rio de Janeiro, Brazil
Tue 14 Apr 2026 12:10 - 12:20 at Oceania I - Developer Experience and Human-AI Programming Chair(s): Yiling Lou

In this paper, we present the first large-scale study exploring whether JavaScript code generated by Large Language Models (LLMs) can reveal which model produced it, enabling reliable authorship attribution and model fingerprinting. With the rapid rise of AI-generated code, attribution is playing a critical role in detecting vulnerabilities, flagging malicious content, and ensuring accountability. While AI-vs-human detection usually treats AI as a single category, we show that individual LLMs leave unique stylistic signatures, even among models belonging to the same family or parameter size. To this end, we introduce LLM-NodeJS, a dataset of 50,000 Node.js back-end programs from 20 large language models. Each program has four transformed variants, yielding 250,000 unique JavaScript samples and two additional representations (JSIR and AST) for diverse research applications. Using this dataset, we benchmark traditional machine learning classifiers against fine-tuned Transformer encoders and introduce CodeT5-JSA, a custom architecture derived from the 770M-parameter CodeT5 model with its decoder removed and a modified classification head. It achieves 95.8% accuracy on five-class attribution, 94.6% on ten-class, and 88.5% on twenty-class tasks, surpassing other tested models such as BERT, CodeBERT, and Longformer. We demonstrate that classifiers capture deeper stylistic regularities in program dataflow and structure, rather than relying on surface-level features. As a result, attribution remains effective even after mangling, comment removal, and heavy code transformations. To support open science and reproducibility, we release the LLM-NodeJS dataset, Google Colab training scripts, and all related materials on GitHub: https://github.com/LLM-NodeJS-dataset.
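
To make the robustness claim more concrete, here is a minimal Python sketch of why a purely structural view of a program can survive identifier mangling and comment removal. It is an illustration only and is not the authors' CodeT5-JSA pipeline: the tokenizer, the keyword list, and the function names `structural_tokens` and `structural_profile` are all assumptions made for this example, and the regular expression covers only a small subset of JavaScript.

```python
import re
from collections import Counter

# Hypothetical, deliberately incomplete keyword list (assumption for this sketch).
JS_KEYWORDS = {
    "const", "let", "var", "function", "return", "if", "else", "for",
    "while", "async", "await", "new", "try", "catch", "throw", "require",
}

# Block comments, line comments, string literals, identifiers, numbers,
# or any other single non-space character. Regex literals and template
# strings are not handled.
TOKEN_RE = re.compile(
    r"""/\*.*?\*/|//[^\n]*|"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'|[A-Za-z_$][\w$]*|\d+(?:\.\d+)?|\S""",
    re.DOTALL,
)


def structural_tokens(source):
    """Tokenize JavaScript-like text, abstracting away names and literals."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        tok = match.group(0)
        if tok.startswith("//") or tok.startswith("/*"):
            continue  # comments carry no structure
        if tok[0] in "\"'":
            tokens.append("STR")
        elif tok[0].isdigit():
            tokens.append("NUM")
        elif re.match(r"[A-Za-z_$]", tok) and tok not in JS_KEYWORDS:
            tokens.append("ID")  # every identifier collapses to one symbol
        else:
            tokens.append(tok)  # keywords and punctuation are kept as-is
    return tokens


def structural_profile(source, n=3):
    """Count n-grams over the abstracted token stream."""
    toks = structural_tokens(source)
    return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))


original = """
// load the web framework
const express = require("express");
const app = express();
app.listen(3000);
"""
# Same program after comment removal and identifier mangling.
mangled = 'const a = require("express"); const b = a(); b.listen(3000);'
# A differently structured program with the same behavior.
restyled = 'var app = require("express")(); app.listen(3000);'

same_after_mangling = structural_profile(original) == structural_profile(mangled)
differs_by_style = structural_profile(original) != structural_profile(restyled)
```

In this toy setting the profile is unchanged by mangling and comment removal, while a different coding style yields a different profile. A real attribution model would learn far richer features than token trigrams; the sketch only shows that such transformations leave structural information intact.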

Tue 14 Apr

Displayed time zone: Brasilia, Distrito Federal, Brazil

11:35 - 12:25
Developer Experience and Human-AI Programming (LLM4Code) at Oceania I
Chair(s): Yiling Lou University of Illinois at Urbana-Champaign
11:35
10m
Talk
Achieving Productivity Gains with AI-based IDE features: A Journey at Google
LLM4Code
Maxim Tabachnyk Google, Inc., Xu Shu Google, Inc., Alexander Frömmgen Google, Inc., Pavel Sychev Google, Inc., Vahid Meimand Google, Inc., Ilia Krets Google, Inc., Stanislav Pyatykh Google, Inc., Abner Araujo Google, Inc., Kristof Molnar Google, Inc., Satish Chandra Meta Platforms, Inc.
11:45
10m
Talk
Usage, Effects and Requirements for AI Coding Assistants in the Enterprise: An Empirical Study
LLM4Code
Michele Merler IBM Research, Rangeet Pan IBM Research, Rahul Krishna IBM Research, Tin Kam Ho IBM Research, Raju Pavuluri IBM T.J. Watson Research Center, Maja Vukovic IBM Research
11:55
10m
Talk
Code Roulette: How Prompt Variability Affects LLM Code Generation
LLM4Code
Andrei Paleyes Department of Computer Science and Technology, University of Cambridge, Diana Robinson University of Cambridge, UK, Radzim Sendyka University of Cambridge, Christian Cabrera University of Cambridge, Neil D. Lawrence Department of Computer Science and Technology, University of Cambridge
12:05
5m
Talk
English or Chinese? Investigating the Impact of Prompt Language on Large Language Models for Code Summarization
LLM4Code
Yijia Tang College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China, Zhiqiu Huang Nanjing University of Aeronautics and Astronautics, Jian Xie Informationization Department (Information Technology Center), Nanjing University of Aeronautics and Astronautics, Nanjing, China, Yaoshen Yu Informationization Department (Information Technology Center), Nanjing University of Aeronautics and Astronautics, Nanjing, China, Bowei Xia College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China, Enya Shen School of Software, Tsinghua University, Beijing, China, Yukun Cao School of Computer Science and Artificial Intelligence, Wuhan Textile University
12:10
10m
Talk
The Hidden DNA of LLM-Generated JavaScript: Structural Patterns Enable High-Accuracy Authorship Attribution
LLM4Code
Norbert Tihanyi Technology Innovation Institute, Bilel Cherif Technology Innovation Institute, Abu Dhabi, UAE, Mohamed Amine Ferrag United Arab Emirates University, Abu Dhabi, UAE, Richard A. Dubniczky Eötvös Loránd University, Budapest, Hungary, Tamas Bisztray University of Oslo
12:20
10m
Talk
An Initial Exploration of Contrastive Prompt Tuning to Generate Energy-Efficient Code
LLM4Code
Sophie Weidmann University of Twente, Fernando Castor University of Twente