ICSE 2025
Sat 26 April - Sun 4 May 2025, Ottawa, Ontario, Canada

Conflicts arising from the presence of multiple licenses in open-source software (OSS) projects can lead to compliance issues, legal risks, operational challenges, and even financial consequences for developers and organizations. While substantial efforts have been made to automate license extraction and detect potential conflicts, current techniques rely primarily on static rule matching for license identification and extraction, or on probabilistic and shallow neural models for license term prediction. These techniques often struggle to adapt to evolving licensing patterns. The advent of large language models (LLMs) presents new opportunities for comprehending the complex information within license files; however, their application to license term extraction and conflict analysis remains underexplored. In this paper, we present an automated framework for license identification and conflict analysis that leverages the capabilities of LLMs. Additionally, we introduce a benchmark dataset specifically designed for the extraction of license terms. The framework consists of three key modules: (a) an automated license extraction module that identifies and extracts declared, inline, and referenced licenses within local project repositories; (b) an LLM-based OSS license labeling component that uses few-shot chain-of-thought prompting combined with structured output generation; and (c) an LLM-based conflict analysis module that applies a hybrid of advanced prompting techniques. Our benchmark dataset contains over 5,000 labeled instances of license texts, including approximately 2,000 well-known license texts sourced from open-source license repositories and GitHub projects. In addition to releasing the dataset, we provide a set of fine-tuned language models tailored to license term identification and conflict analysis. We compare our framework with existing automated license identification and conflict detection techniques, conduct an in-depth analysis of the benchmark dataset and the incorporated prompting strategies, and discuss their implications and potential directions for future research.
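To make the labeling module concrete, the following is a minimal sketch of few-shot chain-of-thought prompting with structured (JSON) output for license term labeling, one of the three modules described above. It is illustrative only and is not the paper's implementation: the term names, the example texts, and the `call_llm` placeholder (which returns a canned response so the sketch runs end to end) are all assumptions, and a real deployment would replace `call_llm` with an actual model API.

```python
# Hypothetical sketch: few-shot chain-of-thought prompting with structured
# JSON output for labeling license terms. Names and examples are illustrative.
import json

# Illustrative subset of license terms to be labeled.
LICENSE_TERMS = ["commercial-use", "modification", "distribution",
                 "patent-grant", "disclose-source", "same-license"]

# One worked example (reasoning + structured answer) used as the few-shot demonstration.
FEW_SHOT_EXAMPLE = """\
License text: "Permission is hereby granted, free of charge, to any person
obtaining a copy of this software ... to use, copy, modify, merge, publish,
distribute ... subject to the following conditions ..."
Reasoning: The text grants broad rights to use, modify, and distribute with
no copyleft obligation, so commercial use, modification, and distribution are
permitted; there is no patent grant or source-disclosure requirement.
Answer: {"commercial-use": "can", "modification": "can", "distribution": "can",
"patent-grant": "none", "disclose-source": "none", "same-license": "none"}
"""

def build_prompt(license_text: str) -> str:
    """Assemble a few-shot chain-of-thought prompt that requests a JSON answer."""
    return (
        "You label the rights and obligations stated in open-source license texts.\n"
        f"Label each of these terms: {', '.join(LICENSE_TERMS)}.\n"
        "Think step by step, then give the final answer as a single JSON object.\n\n"
        f"{FEW_SHOT_EXAMPLE}\n"
        f'License text: "{license_text}"\n'
        "Reasoning:"
    )

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; returns a canned response for demonstration."""
    return ('The text permits redistribution but requires derivative works to '
            'carry the same license and to disclose source.\n'
            'Answer: {"commercial-use": "can", "modification": "can", '
            '"distribution": "can", "patent-grant": "none", '
            '"disclose-source": "must", "same-license": "must"}')

def label_license(license_text: str) -> dict:
    """Prompt the model and parse the structured JSON answer from its response."""
    response = call_llm(build_prompt(license_text))
    answer = response.rsplit("Answer:", 1)[-1].strip()
    return json.loads(answer)

if __name__ == "__main__":
    labels = label_license("You may copy and distribute the Program ... provided "
                           "that you also license it under these same terms.")
    print(labels)
```

In this style of pipeline, the reasoning text is kept only to elicit better predictions, while the JSON object after "Answer:" is the machine-readable output consumed by the downstream conflict analysis step.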