
This program is tentative and subject to change.

Thu 1 May 2025 14:45 - 15:00 at 207 - Human and Social using AI 1

Recent advances in Large Language Models (LLMs) have revolutionized code generation, leading to widespread adoption of AI coding tools by developers. However, LLMs can generate license-protected code without providing the necessary license information, leading to potential intellectual property violations during software production. This paper addresses the critical yet underexplored issue of license compliance in LLM-generated code by establishing a benchmark to evaluate the ability of LLMs to provide accurate license information for their generated code. To establish this benchmark, we conduct an empirical study to identify a reasonable standard for “striking similarity” that excludes the possibility of independent creation, indicating a copy relationship between the LLM output and certain open-source code. Based on this standard, we propose LiCoEval, a benchmark for evaluating the license compliance capabilities of LLMs. Using LiCoEval, we evaluate 14 popular LLMs and find that even top-performing LLMs produce a non-negligible proportion (0.88% to 2.01%) of code strikingly similar to existing open-source implementations. Notably, most LLMs fail to provide accurate license information, particularly for code under copyleft licenses. These findings underscore the urgent need to enhance LLM compliance capabilities in code generation tasks. Our study provides a foundation for future research and development to improve license compliance in AI-assisted software development, contributing to both the protection of open-source software copyrights and the mitigation of legal risks for LLM users.
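
As a rough illustration of the kind of check the abstract describes (a hedged sketch, not the paper's actual method: the tokenizer, the difflib-based metric, and the 0.9 threshold are all illustrative assumptions), one might flag generated code as “strikingly similar” to a known open-source snippet like this:

```python
# Hypothetical sketch of a "striking similarity" check between LLM output
# and known open-source code. The normalization, metric, and threshold are
# illustrative assumptions, not LiCoEval's actual methodology.
import difflib
import re

def normalize(code: str) -> list[str]:
    """Drop comments and whitespace, then tokenize into words/symbols."""
    code = re.sub(r"#.*", "", code)          # strip line comments
    return re.findall(r"\w+|[^\w\s]", code)  # keep identifiers and punctuation

def similarity(generated: str, reference: str) -> float:
    """Token-level similarity in [0, 1] via difflib.SequenceMatcher."""
    return difflib.SequenceMatcher(
        None, normalize(generated), normalize(reference)
    ).ratio()

def strikingly_similar(generated: str, reference: str,
                       threshold: float = 0.9) -> bool:
    """Flag pairs similar enough to suggest copying rather than independent
    creation; a compliant LLM should then report the reference's license."""
    return similarity(generated, reference) >= threshold
```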

Thu 1 May

Displayed time zone: Eastern Time (US & Canada)

14:00 - 15:30
Human and Social using AI 1 (Research Track) at 207
14:00
15m
Talk
Between Lines of Code: Unraveling the Distinct Patterns of Machine and Human Programmers
Research Track
Yuling Shi Shanghai Jiao Tong University, Hongyu Zhang Chongqing University, Chengcheng Wan East China Normal University, Xiaodong Gu Shanghai Jiao Tong University
14:15
15m
Talk
Deep Learning-based Code Reviews: A Paradigm Shift or a Double-Edged Sword?
Research Track
Rosalia Tufano Università della Svizzera Italiana, Alberto Martin-Lopez Software Institute - USI, Lugano, Ahmad Tayeb, Ozren Dabic Software Institute, Università della Svizzera italiana (USI), Switzerland, Sonia Haiduc, Gabriele Bavota Software Institute @ Università della Svizzera Italiana
14:30
15m
Talk
An Exploratory Study of ML Sketches and Visual Code Assistants
Research Track
Luis F. Gomes Carnegie Mellon University, Vincent J. Hellendoorn Carnegie Mellon University, Jonathan Aldrich Carnegie Mellon University, Rui Abreu INESC-ID; University of Porto
14:45
15m
Talk
LiCoEval: Evaluating LLMs on License Compliance in Code Generation
Research Track
Weiwei Xu Peking University, Kai Gao Peking University, Hao He Carnegie Mellon University, Minghui Zhou Peking University
Pre-print
15:00
15m
Talk
Trust Dynamics in AI-Assisted Development: Definitions, Factors, and Implications
Research Track
Sadra Sabouri University of Southern California, Philipp Eibl University of Southern California, Xinyi Zhou University of Southern California, Morteza Ziyadi Amazon AGI, Nenad Medvidović University of Southern California, Lars Lindemann University of Southern California, Souti Chattopadhyay University of Southern California
Pre-print
15:15
15m
Talk
What Guides Our Choices? Modeling Developers' Trust and Behavioral Intentions Towards GenAI
Research Track
Rudrajit Choudhuri Oregon State University, Bianca Trinkenreich Colorado State University, Rahul Pandita GitHub, Inc., Eirini Kalliamvakou GitHub, Igor Steinmacher Northern Arizona University, Marco Gerosa Northern Arizona University, Christopher Sanchez Oregon State University, Anita Sarma Oregon State University
Pre-print