Compositional API Recommendation for Library-Oriented Code Generation (ICPC 2024 - Research Track) - ICPC 2024

Sun 14 - Sat 20 April 2024 Lisbon, Portugal

co-located with ICSE 2024

Who

Zexiong Ma, Shengnan An, Bing Xie, Zeqi Lin

Track

ICPC 2024 Research Track

Time Zone

The program is currently displayed in (GMT+01:00) Lisbon.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Mon 15 Apr 2024 14:10 - 14:20 at Sophia de Mello Breyner Andresen - Code + Documentation Generation. Chair(s): Massimiliano Di Penta

Abstract

Large language models (LLMs) have achieved exceptional performance in code generation. However, the performance remains unsatisfactory in generating library-oriented code, especially for the libraries not present in the training data of LLMs. Previous work utilizes API recommendation technology to help LLMs use libraries: it retrieves APIs related to the user requirements, then leverages them as context to prompt LLMs. However, developmental requirements can be coarse-grained, requiring a combination of multiple fine-grained APIs. This granularity inconsistency makes API recommendation a challenging task.

To address this, we propose CAPIR (Compositional API Recommendation), which adopts a “divide-and-conquer” strategy to recommend APIs for coarse-grained requirements. Specifically, CAPIR employs an LLM-based Decomposer to break down a coarse-grained task description into several detailed subtasks. Then, CAPIR applies an embedding-based Retriever to identify relevant APIs corresponding to each subtask. Moreover, CAPIR leverages an LLM-based Reranker to filter out redundant APIs and provides the final recommendation.
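The three stages described above can be illustrated with a minimal sketch. This is not the authors' implementation: the LLM-based Decomposer and Reranker are passed in as plain callables, the embedding is a toy bag-of-words vector standing in for a real embedding model, and every function name and API name below is hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Dict, List


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: stands in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(subtask: str, api_docs: Dict[str, str], k: int) -> List[str]:
    """Return the k API names whose documentation is most similar to the subtask."""
    query = embed(subtask)
    ranked = sorted(api_docs, key=lambda name: cosine(query, embed(api_docs[name])), reverse=True)
    return ranked[:k]


def recommend(
    task: str,
    api_docs: Dict[str, str],
    decompose: Callable[[str], List[str]],
    rerank: Callable[[str, List[str]], List[str]],
    k: int = 2,
) -> List[str]:
    """Divide and conquer: decompose the task, retrieve per subtask, then rerank the union."""
    candidates: List[str] = []
    for subtask in decompose(task):
        for name in retrieve(subtask, api_docs, k):
            if name not in candidates:  # keep first occurrence, drop duplicates
                candidates.append(name)
    return rerank(task, candidates)


# Hypothetical API documentation and stubbed LLM stages, for illustration only.
API_DOCS = {
    "open_files": "open files from a list of paths",
    "parse_csv": "parse csv rows from a file stream",
    "shuffle": "shuffle items in random order",
}


def stub_decompose(task: str) -> List[str]:
    """Stand-in for the LLM-based Decomposer."""
    return ["open files from paths", "parse csv rows"]


def stub_rerank(task: str, candidates: List[str]) -> List[str]:
    """Stand-in for the LLM-based Reranker; here it simply keeps the first two candidates."""
    return candidates[:2]


if __name__ == "__main__":
    print(recommend("load csv files", API_DOCS, stub_decompose, stub_rerank, k=1))
```

Passing the Decomposer and Reranker as callables keeps the sketch runnable without a model; in a real system each would presumably wrap a prompt to an LLM.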

To facilitate the evaluation of API recommendation methods on coarse-grained requirements, we present two challenging benchmarks, RAPID (Recommend APIs based on Documentation) and LOCG (Library-Oriented Code Generation). Experimental results on these benchmarks demonstrate the effectiveness of CAPIR in comparison to existing baselines. Specifically, on RAPID’s Torchdata-AR dataset, compared to the state-of-the-art API recommendation approach, CAPIR improves recall@5 from 18.7% to 43.2% and precision@5 from 15.5% to 37.1%. On LOCG’s Torchdata-Code dataset, compared to code generation without API recommendation, CAPIR improves pass@100 from 16.0% to 28.0%.
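For readers unfamiliar with the reported metrics, the sketch below shows one common way to compute recall@k, precision@k, and pass@k for a single example. The abstract does not state the exact definitions used, so these are assumptions: precision@k divides by k, and pass@k uses the widely used unbiased estimator 1 - C(n-c, k) / C(n, k) for n generated samples of which c pass the tests. Function names are illustrative.

```python
from math import comb
from typing import List, Set


def precision_at_k(recommended: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k recommended APIs that are relevant (assumes division by k)."""
    hits = sum(1 for api in recommended[:k] if api in relevant)
    return hits / k


def recall_at_k(recommended: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant APIs that appear among the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & relevant)
    return hits / len(relevant)


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples with c correct: 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        # Every size-k subset must contain at least one correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

Benchmark-level scores such as those quoted above would then typically be averages of these per-example values over the dataset.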

Link to Preprint

https://arxiv.org/pdf/2402.19431.pdf

Zexiong Ma

Peking University

Shengnan An

Xi’an Jiaotong University

Bing Xie

Peking University

Zeqi Lin

Microsoft Research, China

Session Program

Mon 15 Apr
Displayed time zone: Lisbon

	14:00 - 15:30	Code + Documentation Generation. Research Track / Early Research Achievements (ERA) / Replications and Negative Results (RENE), at Sophia de Mello Breyner Andresen. Chair(s): Massimiliano Di Penta (University of Sannio, Italy)

	14:00 10m Talk		MESIA: Understanding and Leveraging Supplementary Nature of Method-level Comments for Automatic Comment Generation. Full paper, Research Track. Xinglu Pan (Peking University), Chenxiao Liu (Peking University), Yanzhen Zou (Peking University), Tao Xie (Peking University), Bing Xie (Peking University). Pre-print
	14:10 10m Talk		Compositional API Recommendation for Library-Oriented Code Generation. Full paper, Research Track. Zexiong Ma (Peking University), Shengnan An (Xi’an Jiaotong University), Bing Xie (Peking University), Zeqi Lin (Microsoft Research, China). Pre-print
	14:20 10m Talk		On the Generalizability of Deep Learning-based Code Completion Across Programming Language Versions. Full paper, Research Track. Matteo Ciniselli (Università della Svizzera Italiana), Alberto Martin-Lopez (Software Institute - USI, Lugano), Gabriele Bavota (Software Institute @ Università della Svizzera Italiana)
	14:30 10m Talk		ESGen: Commit Message Generation Based on Edit Sequence of Code Change. Full paper, Virtual Talk, Research Track. Xiangping Chen (Sun Yat-sen University), Yangzi Li (Sun Yat-sen University), Zhicao Tang (Sun Yat-sen University), Yuan Huang (School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China), Haojie Zhou (School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China), Mingdong Tang (Guangdong University of Foreign Studies), Zibin Zheng (Sun Yat-sen University)
	14:40 10m Talk		Improving AST-Level Code Completion with Graph Retrieval and Multi-Field Attention. Full paper, Virtual Talk, Research Track. Yu Xia (Central South University), Tian Liang (Central South University), Wei-Huan Min (Central South University), Li Kuang (School of Computer Science and Engineering, Central South University)
	14:50 10m Talk		Exploring and Improving Code Completion for Test Code. Full paper, Research Track. Tingwei Zhu (Nanjing University), Zhongxin Liu (Zhejiang University), Tongtong Xu (Huawei), Ze Tang (Software Institute, Nanjing University), Tian Zhang (Nanjing University), Minxue Pan (Nanjing University), Xin Xia (Huawei Technologies)
	15:00 10m Talk		Understanding the Impact of Branch Edit Features for the Automatic Prediction of Merge Conflict Resolutions. RENE Paper, Replications and Negative Results (RENE). Waad riadh aldndni (Virginia Tech), Francisco Servant (ITIS Software, University of Malaga), Na Meng (Virginia Tech)
	15:10 4m Talk		Investigating the Efficacy of Large Language Models for Code Clone Detection. ERA Paper, Early Research Achievements (ERA). Mohamad Khajezade (University of British Columbia Okanagan), Jie JW Wu (University of British Columbia (UBC)), Fatemeh Hendijani Fard (University of British Columbia), Gema Rodríguez-Pérez (University of British Columbia (UBC)), Mohamed S Shehata (University of British Columbia)
	15:14 16m Talk		Code + Documentation Generation: Panel with Speakers. Discussion