Answer Summarization for Technical Queries: Benchmark and New Approach
Prior studies have demonstrated that approaches that generate an answer summary for a given technical query on Software Question and Answer (SQA) sites are in demand. We find that existing approaches are assessed solely through user studies. Hence, a new user study must be performed every time a new approach is introduced; this is time-consuming, slows down the development of new approaches, and yields results that may not be comparable across studies. A benchmark with ground-truth summaries is needed to complement assessment through user studies. Unfortunately, no such benchmark exists for answer summarization for technical queries from SQA sites.
To fill this gap, we manually construct a high-quality benchmark that enables automatic evaluation of answer summarization for technical queries from SQA sites. It contains 111 query-summary pairs extracted from 382 Stack Overflow answers with 2,014 sentence candidates. Using the benchmark, we comprehensively evaluate the performance of existing approaches and find that there is still substantial room for improvement.
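Automatic evaluation on such a benchmark typically scores a generated summary against the ground-truth summary with ROUGE-style n-gram overlap. The sketch below is an illustrative, recall-oriented ROUGE-N implementation written from scratch; it is not the authors' actual scoring script (which would normally use an established ROUGE package), and the example sentences are made up for demonstration.

```python
from collections import Counter


def rouge_n(reference: str, candidate: str, n: int = 1) -> float:
    """Recall-oriented ROUGE-N: the fraction of reference n-grams
    that also appear in the candidate summary (with clipped counts)."""
    def ngrams(text: str, n: int) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference, n), ngrams(candidate, n)
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / sum(ref.values())


# Toy example: 6 of the 8 reference unigrams appear in the candidate.
ref = "use a thread pool to limit concurrent connections"
cand = "limit concurrent connections with a thread pool"
print(round(rouge_n(ref, cand, n=1), 2))  # 0.75
```

ROUGE-2 and ROUGE-L (used in the paper's evaluation) follow the same idea with bigrams and longest common subsequences, respectively.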
Motivated by these results, we propose a new approach, TechSumBot, with three key modules: 1) a Usefulness Ranking module, 2) a Centrality Estimation module, and 3) a Redundancy Removal module. We evaluate TechSumBot both automatically (i.e., using our benchmark) and manually (i.e., via a user study). The results of both evaluations consistently demonstrate that TechSumBot outperforms the best-performing baseline approaches from both the SE and NLP domains by a large margin: by 10.83%–14.90%, 32.75%–36.59%, and 12.61%–17.54% in terms of ROUGE-1, ROUGE-2, and ROUGE-L in the automatic evaluation, and by 5.79%–9.23% and 17.03%–17.68% in terms of average usefulness and diversity scores in the human evaluation. This highlights that automatic evaluation on our benchmark can uncover findings similar to those found through user studies. More importantly, automatic evaluation has a much lower cost, especially when it is used to assess a new approach. Additionally, we conduct an ablation study, which demonstrates that each module contributes to boosting the overall performance of TechSumBot. We release the benchmark as well as the replication package of our experiments at https://anonymous.4open.science/r/TECHSUMBOT.
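To make the three-stage design concrete, here is a minimal, self-contained sketch of such a pipeline using bag-of-words cosine similarity. All scoring functions and the threshold value are illustrative stand-ins, not TechSumBot's actual models: usefulness is approximated by query-sentence similarity, centrality by average similarity to the other candidates, and redundancy removal by a similarity cap against already-selected sentences.

```python
from collections import Counter
import math


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def summarize(query: str, sentences: list[str], k: int = 2,
              redundancy_cap: float = 0.6) -> list[str]:
    bows = [Counter(s.lower().split()) for s in sentences]
    q = Counter(query.lower().split())
    # 1) Usefulness Ranking: approximated by query-sentence similarity.
    usefulness = [cosine(q, b) for b in bows]
    # 2) Centrality Estimation: average similarity to all candidates.
    centrality = [sum(cosine(b, o) for o in bows) / len(bows) for b in bows]
    order = sorted(range(len(sentences)),
                   key=lambda i: usefulness[i] + centrality[i], reverse=True)
    # 3) Redundancy Removal: skip sentences too similar to picked ones.
    picked = []
    for i in order:
        if all(cosine(bows[i], bows[j]) < redundancy_cap for j in picked):
            picked.append(i)
        if len(picked) == k:
            break
    return [sentences[i] for i in sorted(picked)]


candidates = [
    "Use a thread pool to handle many connections",
    "A thread pool handles many connections efficiently",
    "Restart the server daily",
]
# The second sentence is dropped as redundant with the first.
print(summarize("handle many connections", candidates, k=2))
```

In TechSumBot itself each stage is learned rather than hand-crafted, but the control flow (rank for usefulness, re-weight by centrality, then filter redundant candidates) follows this shape.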
Thu 13 Oct (displayed time zone: Eastern Time, US & Canada)
10:00 - 12:00 | Technical Session 22 - Code Summarization and Recommendation (Research Papers / NIER Track / Journal-first Papers / Industry Showcase) at Banquet A. Chair(s): Houari Sahraoui (Université de Montréal)
10:00 20m Research paper | Identifying Solidity Smart Contract API Documentation Errors (Research Papers). Chenguang Zhu (The University of Texas at Austin), Ye Liu (Nanyang Technological University), Xiuheng Wu (Nanyang Technological University, Singapore), Yi Li (Nanyang Technological University). Pre-print
10:20 10m Vision and Emerging Results | Few-shot training LLMs for project-specific code-summarization (NIER Track). Toufique Ahmed (University of California at Davis), Prem Devanbu (Department of Computer Science, University of California, Davis). DOI, Pre-print
10:30 20m Research paper | Answer Summarization for Technical Queries: Benchmark and New Approach (Research Papers). Chengran Yang (Singapore Management University), Bowen Xu (School of Information Systems, Singapore Management University), Ferdian Thung (Singapore Management University), Yucen Shi (Singapore Management University), Ting Zhang (Singapore Management University), Zhou Yang (Singapore Management University), Xin Zhou, Jieke Shi (Singapore Management University), Junda He (Singapore Management University), DongGyun Han (Royal Holloway, University of London), David Lo (Singapore Management University)
10:50 20m Paper (Virtual) | Code Structure Guided Transformer for Source Code Summarization (Journal-first Papers). Shuzheng Gao (Harbin Institute of Technology), Cuiyun Gao (Harbin Institute of Technology), Yulan He (University of Warwick), Jichuan Zeng (The Chinese University of Hong Kong), Lun Yiu Nie (Tsinghua University), Xin Xia (Huawei Software Engineering Application Technology Lab), Michael Lyu (The Chinese University of Hong Kong)
11:10 10m Vision and Emerging Results (Virtual) | Taming Multi-Output Recommenders for Software Engineering (NIER Track). Christoph Treude (University of Melbourne)
11:20 20m Industry talk (Virtual) | MV-HAN: A Hybrid Attentive Networks based Multi-View Learning Model for Large-scale Contents Recommendation (Industry Showcase). Ge Fan (Tencent Inc.), Chaoyun Zhang (Tencent Inc.), Kai Wang (Tencent Inc.), Junyang Chen (Shenzhen University). DOI, Pre-print
11:40 20m Research paper (Virtual) | Which Exception Shall We Throw? (Research Papers). Hao Zhong (Shanghai Jiao Tong University)