This program is tentative and subject to change.

Thu 1 May 2025 15:00 - 15:15 at 211 - Industry Challenge Presentations

With the rapid growth of the open source software (OSS) ecosystem, the use of open source has become the predominant model for contemporary software development. OSS licenses define the conditions for the reuse, distribution, and modification of OSS and form the foundation of the open source ecosystem. However, recent research shows that over half (53%) of OSS projects experience license conflicts, which adversely affect the sustainability of OSS and community collaboration and lead to significant legal risks. Researchers have proposed various methods for detecting license conflicts, yet these approaches face challenges such as limited license coverage and insufficient model accuracy. The recent emergence of large language models (LLMs) offers new opportunities for license conflict detection. However, there remains a lack of in-depth and systematic research on utilizing LLMs for this purpose.

To address this challenge, we propose L³icNexus, an effective tool for automatically detecting license conflicts using LLMs. Specifically, L³icNexus employs a joint labeling method based on embedded model label inference and expert verification and constructs a domain dataset consisting of 3,238 OSS licenses. Subsequently, L³icNexus proposes the AdaFine approach, combining Domain-Adaptive Pre-Training (DAPT) and Supervised Fine-Tuning (SFT), resulting in the License-Llama3-8B model. This model identifies terms, infers OSS license attitudes, and autonomously understands licenses end-to-end. Finally, L³icNexus generates summaries of the rights and obligations associated with licenses using License-Llama3-8B, and detects conflicts by extracting the license hierarchy of OSS. Experimental results demonstrate that L³icNexus achieves an F1-score of 85.58% in license term and attitude recognition, surpassing the best results of other methods by 20.69%. Moreover, an empirical study conducted on license conflict detection for 500 popular GitHub projects reveals that L³icNexus achieves a false positive rate of 5.88% and a false negative rate of 2.47%. The performance of L³icNexus exceeds that of existing state-of-the-art methods, illustrating the potential of LLMs in addressing license conflict detection. We summarize the insights from this research and release the OSS license dataset and License-Llama3-8B model on Hugging Face to encourage further exploration in related fields (Dataset available: https://huggingface.co/datasets/AnonymousAuthors/OSS-License-Terms; Model available: https://huggingface.co/AnonymousAuthors/License-Llama3-8B).
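The abstract describes conflict detection as comparing the rights and obligations of licenses along a project's license hierarchy. The following is a minimal sketch of that general idea, not the L³icNexus implementation: the attitude labels (`CAN`, `CANNOT`, `MUST`), the term names, the license names, and the conflict rule itself are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Attitude(Enum):
    """Hypothetical attitude labels a license may take toward a term."""
    CAN = "can"
    CANNOT = "cannot"
    MUST = "must"


@dataclass
class LicenseNode:
    """One node in a (hypothetical) license hierarchy of a project."""
    name: str
    terms: dict[str, Attitude]
    children: list["LicenseNode"] = field(default_factory=list)


def term_conflicts(parent: LicenseNode, child: LicenseNode) -> list[str]:
    """Return the terms on which parent and child contradict each other.

    Assumed rule: a conflict exists when the child forbids what the parent
    permits or requires, or the child requires what the parent forbids.
    Terms the parent does not mention are skipped.
    """
    conflicts = []
    for term, child_att in child.terms.items():
        parent_att = parent.terms.get(term)
        if parent_att is None:
            continue
        if child_att is Attitude.CANNOT and parent_att in (Attitude.CAN, Attitude.MUST):
            conflicts.append(term)
        elif child_att is Attitude.MUST and parent_att is Attitude.CANNOT:
            conflicts.append(term)
    return conflicts


def find_conflicts(node: LicenseNode) -> list[tuple[str, str, str]]:
    """Walk the hierarchy depth-first and collect (parent, child, term) triples."""
    found = []
    for child in node.children:
        for term in term_conflicts(node, child):
            found.append((node.name, child.name, term))
        found.extend(find_conflicts(child))
    return found


# Hypothetical licenses; the names and terms are illustrative only.
component = LicenseNode(
    "ComponentLicense",
    {"sublicense": Attitude.CANNOT, "disclose_source": Attitude.MUST},
)
root = LicenseNode("ProjectLicense", {"sublicense": Attitude.CAN}, [component])

conflicts = find_conflicts(root)
print(conflicts)
```

In this sketch the term-attitude pairs are written by hand; in the pipeline the abstract describes, they would come from the model's license summaries.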

Thu 1 May

Displayed time zone: Eastern Time (US & Canada)

14:00 - 15:30
Industry Challenge Presentations (Industry Challenge Track) at 211
14:00
15m
Talk
CKGFuzzer: LLM-Based Fuzz Driver Generation Enhanced By Code Knowledge Graph (Award Winner)
Industry Challenge Track
Hanxiang Xu Huazhong University of Science and Technology, Wei Ma, Ting Zhou Huazhong University of Science and Technology, Yanjie Zhao Huazhong University of Science and Technology, Kai Chen Huazhong University of Science and Technology, Qiang Hu The University of Tokyo, Yang Liu Nanyang Technological University, Haoyu Wang Huazhong University of Science and Technology
14:15
15m
Talk
ClauseBench: Enhancing Software License Analysis with Clause-Level Benchmarking
Industry Challenge Track
Qiang Ke Huazhong University of Science and Technology, Xinyi Hou Huazhong University of Science and Technology, Yanjie Zhao Huazhong University of Science and Technology, Haoyu Wang Huazhong University of Science and Technology
14:30
15m
Talk
CodeMorph: Mitigating Data Leakage in Large Language Model Assessment
Industry Challenge Track
Hongzhou Rao Huazhong University of Science and Technology, Yanjie Zhao Huazhong University of Science and Technology, Wenjie Zhu Huazhong University of Science and Technology, Ling Xiao Huazhong University of Science and Technology, Meizhen Wang Huazhong University of Science and Technology, Haoyu Wang Huazhong University of Science and Technology
14:45
15m
Talk
CommitShield: Tracking Vulnerability Introduction and Fix in Version Control Systems
Industry Challenge Track
Zhaonan Wu Huazhong University of Science and Technology, Yanjie Zhao Huazhong University of Science and Technology, Chen Wei MYbank, Ant Group, Zirui Wan Huazhong University of Science and Technology, Yue Liu Monash University, Haoyu Wang Huazhong University of Science and Technology
15:00
15m
Talk
Exploring Large Language Models for Analyzing Open Source License Conflicts: How Far Are We?
Industry Challenge Track
Xing Cui Institute of Software, Chinese Academy of Sciences, Jingzheng Wu Institute of Software, The Chinese Academy of Sciences, Xiang Ling Institute of Software, Chinese Academy of Sciences, Tianyue Luo Institute of Software, Chinese Academy of Sciences, Mutian Yang Beijing ZhongKeWeiLan Technology Co.,Ltd., Wenxiang Ou Institute of Software, Chinese Academy of Sciences
15:15
15m
Talk
OSS-LCAF: Open-Source Software License Conflict Analysis Framework
Industry Challenge Track
Aditya Kahol TCS Research, Anka Chandrahas Tummepalli TCS Research, Preethu Rose Anish TCS Research