TIVER: Identifying Adaptive Versions of C/C++ Third-Party Open-Source Components Using a Code Clustering Technique (ICSE 2025 - Research Track)

Who

Youngjae Choi, Seunghoon Woo

Track

ICSE 2025 Research Track

Time Zone

Times are shown in the conference time zone: (GMT-04:00) Eastern Time (US & Canada).

When

Fri 2 May 2025, 14:45 - 15:00, at 210 - Security and Analysis 3. Chair(s): Adriana Sejfia

Abstract

Reusing third-party open-source software (OSS) provides many benefits but can expose the entire system to risks owing to propagated vulnerabilities. While tracking the versions of OSS components can help prevent threats, existing approaches typically map a single version to a reused OSS codebase. This coarse-grained method fails to address multiple versions of code that coexist within the codebase, resulting in ineffective OSS management. Additionally, effectively identifying component versions is challenging owing to noise code, such as algorithmic code that coexists across different OSS, as well as duplicate components arising from the redundant reuse of OSS.

In this paper, we introduce the concept of the adaptive version, a one-stop solution to represent the version diversity of reused OSS. We present TIVER, an effective approach for identifying adaptive versions of OSS components. TIVER employs two key techniques: (1) fine-grained function-level versioning to uncover detailed versions, and (2) OSS code clustering to identify duplicate components and remove noise. This enables precise identification of OSS reuse locations and adaptive versions, effectively mitigating threats related to OSS reuse. Evaluation of popular C/C++ software on GitHub revealed that OSS components with a single version accounted for only 33%, while the remaining 67% of the components contained more than three versions on average. Nonetheless, TIVER effectively identified adaptive versions of OSS components with 88.46% precision and 91.63% recall in duplicate component distinction, and 86% precision and 86.84% recall in eliminating noise, while existing approaches barely achieved 42% recall in distinguishing duplicates and did not address noise. Further experiments showed that TIVER could enhance vulnerability management and be applied to Software Bills of Materials (SBOM) to improve supply chain security.
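To make the idea of function-level versioning more concrete, the following is a minimal Python sketch and not TIVER's actual implementation. It assumes a hypothetical index built from the function bodies of each release of one OSS component, and it attributes every matched function in a target codebase to the earliest release that contains the same body. The names `fingerprint`, `build_index`, and `version_profile`, the whitespace-stripping normalization, and the "earliest release" rule are all illustrative assumptions; the clustering step described in the abstract is not modeled here.

```python
import hashlib
from collections import Counter


def fingerprint(body: str) -> str:
    """Hash a function body after removing all whitespace.

    Assumption: this crude normalization stands in for whatever
    normalization a real tool would apply.
    """
    normalized = "".join(body.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_index(releases: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each function fingerprint to the set of releases containing it."""
    index: dict[str, set[str]] = {}
    for version, bodies in releases.items():
        for body in bodies:
            index.setdefault(fingerprint(body), set()).add(version)
    return index


def version_profile(target_bodies: list[str],
                    index: dict[str, set[str]],
                    order: list[str]) -> Counter:
    """Count, per release, how many target functions are attributed to it.

    Assumption: a function that appears unchanged in several releases is
    attributed to the earliest one according to `order`.
    """
    profile: Counter = Counter()
    for body in target_bodies:
        versions = index.get(fingerprint(body))
        if not versions:
            continue  # function not found in any known release
        earliest = min(versions, key=order.index)
        profile[earliest] += 1
    return profile


# Hypothetical component with two releases that differ in one function.
releases = {
    "1.0": ["int f(){return 1;}", "int g(){return 2;}"],
    "1.1": ["int f(){return 1;}", "int g(){return 3;}"],
}
order = ["1.0", "1.1"]

# Hypothetical target codebase that mixes code from both releases.
target = [
    "int f() { return 1; }",   # unchanged since 1.0
    "int g(){return 3;}",      # only present in 1.1
    "int h(){return 0;}",      # not part of the component
]

profile = version_profile(target, build_index(releases), order)
print(dict(profile))
```

In this toy setting the reused component cannot be described by a single version label, since one function is attributed to release 1.0 and another to release 1.1. That per-function distribution is the kind of information a single-version mapping would lose.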

Youngjae Choi

Korea University

South Korea

Seunghoon Woo