Improving API Knowledge Comprehensibility: A Context-Dependent Entity Detection and Context Completion Approach using LLM (SANER 2025 - Research Papers)

Who

Zhang Zhang, Xinjun Mao, Shangwen Wang, Kang Yang, Tanghaoran Zhang, Fei Gao, Xunhui Zhang

Track

SANER 2025 Research Papers

Time Zone

The program is currently displayed in (GMT-05:00) Eastern Time (US & Canada).

When

Wed 5 Mar 2025, 14:30 - 14:45, Room L-1720, in session API and Dependency Analysis. Chair(s): Raula Gaikovina Kula

Abstract

Extracting API knowledge from Stack Overflow has become a crucial way to assist developers in using APIs. Existing research has primarily focused on extracting relevant API-related knowledge at the sentence level to enhance API documentation.
However, sentence-level extraction can lose crucial context, especially when a sentence contains context-dependent entities (i.e., entities that can only be understood by referring to the surrounding context), which may hinder developers’ understanding. To investigate this issue, we conducted an empirical study of 384 Stack Overflow posts and found that (1) approximately one-third of API functionality sentences contain context-dependent entities, and (2) these entities fall into two categories: Referential Context-Dependent Entities and Local Variable Context-Dependent Entities. In response, we developed a novel method, CEDCC, which combines an entity filtering strategy, informed by insights from our empirical study, with a large language model (LLM) to construct coreference chains for detecting context-dependent entities. It also uses a step-by-step approach with the LLM to complete the context needed to understand these entities. To evaluate CEDCC, we constructed a dataset of 1,023 API knowledge sentences, including 567 context-dependent entities and their required contexts. The results demonstrate the effectiveness of CEDCC on the context-dependent entity detection and context completion tasks, achieving an F1-score of 0.865 and a BERTScore of 0.373, significantly surpassing the baseline methods. Human evaluations further confirmed that CEDCC effectively improves the comprehensibility of API knowledge sentences.
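
To make the idea of detecting context-dependent entities through coreference chains more concrete, here is a minimal sketch in Python. It is not the CEDCC implementation: the `Mention` type, the `context_dependent_mentions` function, and the example chains are all hypothetical, and the chains are written by hand here, whereas the abstract says they are built with an LLM. The sketch also makes a simplifying assumption: a mention counts as context-dependent only if its chain contains an earlier mention from a preceding sentence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """A hypothetical entity mention inside a Stack Overflow post."""
    text: str          # surface form, e.g. "It" or "add()"
    sentence_idx: int  # index of the sentence that contains the mention


def context_dependent_mentions(chains, target_idx):
    """Return mentions in the target sentence whose coreference chain
    also has a mention in an earlier sentence (simplifying assumption)."""
    flagged = []
    for chain in chains:
        inside = [m for m in chain if m.sentence_idx == target_idx]
        earlier = [m for m in chain if m.sentence_idx < target_idx]
        if inside and earlier:
            flagged.extend(inside)
    return flagged


# Hand-written chains for a made-up two-sentence post:
#   0: "I created an ArrayList named items."
#   1: "It keeps insertion order when you call add()."
chains = [
    [Mention("an ArrayList named items", 0), Mention("It", 1)],
    [Mention("add()", 1)],
]

flagged = context_dependent_mentions(chains, target_idx=1)
print([m.text for m in flagged])
```

In this toy post, "It" in sentence 1 is flagged because its chain starts in sentence 0, while "add()" is not, since it can be understood within its own sentence. A context completion step would then add the antecedent to the extracted sentence.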

Zhang Zhang, National University of Defense Technology, China
Xinjun Mao, National University of Defense Technology, China
Shangwen Wang, National University of Defense Technology, China
Kang Yang, National University of Defense Technology, China
Tanghaoran Zhang, National University of Defense Technology, China
Fei Gao, National University of Defense Technology, China
Xunhui Zhang, National University of Defense Technology, China

Session Program

Wed 5 Mar
Displayed time zone: Eastern Time (US & Canada)

14:00 - 15:30	API and Dependency Analysis (Room: L-1720). Research Papers. Chair(s): Raula Gaikovina Kula (Osaka University)

14:00 15m Talk	Analysing Software Supply Chains of Infrastructure as Code: Extraction of Ansible Plugin Dependencies. Research Papers. Ruben Opdebeeck (Vrije Universiteit Brussel), Bram Adams (Queen's University), Coen De Roover (Vrije Universiteit Brussel)
14:15 15m Talk	Enhancing Automated Vulnerability Repair through Dependency Embedding and Pattern Store. Research Papers. Qingao Dong (Beihang University), Yuanzhang Lin (Beihang University), Xiang Gao (Beihang University), Hailong Sun (Beihang University)
14:30 15m Talk	Improving API Knowledge Comprehensibility: A Context-Dependent Entity Detection and Context Completion Approach using LLM. Research Papers. Zhang Zhang (National University of Defense Technology), Xinjun Mao (National University of Defense Technology), Shangwen Wang (National University of Defense Technology), Kang Yang (National University of Defense Technology), Tanghaoran Zhang (National University of Defense Technology), Fei Gao (National University of Defense Technology), Xunhui Zhang (National University of Defense Technology)
14:45 15m Talk	Pay Your Attention on Lib! Android Third-Party Library Detection via Feature Language Model. Research Papers. Dahan Pan (Shanghai Jiao Tong University), Yi Xu (Shanghai Jiao Tong University), Runhan Feng (Shanghai Jiao Tong University), Donghui Yu (Shanghai Jiao Tong University), Jiawen Chen (Shanghai Jiao Tong University), Ya Fang (Shanghai Jiao Tong University), Yuanyuan Zhang (Shanghai Jiao Tong University)
15:00 15m Talk	THINK: Tackling API Hallucinations in LLMs via Injecting Knowledge. Research Papers. Jiaxin Liu (National University of Defense Technology), Yating Zhang (National University of Defense Technology), Deze Wang (National University of Defense Technology), Yiwei Li (National University of Defense Technology), Wei Dong (National University of Defense Technology)