Software developers often turn to Stack Overflow (SO) to meet their programming needs. Given the abundance of relevant posts, navigating them and comparing different solutions is tedious and time-consuming. Recent work has proposed automatically summarizing SO posts into concise text to make them easier to navigate. However, these techniques rely only on information retrieval methods or heuristics for text summarization, which are insufficient to handle the ambiguity and sophistication of natural language.
This paper presents a deep learning based framework called ASSORT for SO post summarization. ASSORT includes two complementary learning methods, ASSORT$_S$ and ASSORT$_{IS}$, to address the lack of labeled training data for SO post summarization. ASSORT$_S$ is designed to directly train a novel ensemble learning model with BERT embeddings and domain-specific features to account for the unique characteristics of SO posts. By contrast, ASSORT$_{IS}$ is designed to reuse pre-trained models while addressing the domain shift challenge when no training data is present (i.e., zero-shot learning). Both ASSORT$_S$ and ASSORT$_{IS}$ outperform six existing techniques by at least 13% and 7% respectively in terms of the F1 score. Furthermore, a human study shows that participants significantly preferred summaries generated by ASSORT$_S$ and ASSORT$_{IS}$ over the best baseline, while the preference difference between ASSORT$_S$ and ASSORT$_{IS}$ was small.
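The abstract reports its results as F1 scores. As a minimal sketch of what such a score measures, the Python snippet below computes F1 for an extractive summary, assuming a summary is represented as a set of selected sentence indices and is compared against a human-labeled reference set. This representation, the function name `f1_score`, and the sample indices are illustrative assumptions, not the paper's evaluation code.

```python
def f1_score(predicted: set[int], reference: set[int]) -> float:
    """Harmonic mean of precision and recall over selected sentence indices.

    Assumption: a summary is the set of indices of the sentences it keeps.
    """
    if not predicted or not reference:
        return 0.0
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of picks that are correct
    recall = true_positives / len(reference)     # share of reference that was found
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: the model keeps sentences 0, 2, 5;
# the annotators marked sentences 0, 2, 3, 4 as summary-worthy.
predicted = {0, 2, 5}
reference = {0, 2, 3, 4}
print(round(f1_score(predicted, reference), 3))  # 0.571
```

Here precision is 2/3 and recall is 2/4, so F1 is 4/7. The score rewards a summary only when it both avoids irrelevant sentences and covers the reference ones.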
Fri 19 May (displayed time zone: Hobart)
11:00 - 12:30 | Developers' forums | SEIP - Software Engineering in Practice / Journal-First Papers / Technical Track / DEMO - Demonstrations | Meeting Room 102 | Chair(s): Omar Haggag (Monash University, Australia)
11:00 | 15m Talk | Automatic prediction of rejected edits in Stack Overflow | Journal-First Papers | Saikat Mondal (University of Saskatchewan), Gias Uddin (University of Calgary, Canada), Chanchal K. Roy (University of Saskatchewan)
11:15 | 15m Talk | Automated Summarization of Stack Overflow Posts | Technical Track | Bonan Kou (Purdue University), Muhao Chen (University of Southern California), Tianyi Zhang (Purdue University)
11:30 | 15m Talk | Semi-Automatic, Inline and Collaborative Web Page Code Curations | Technical Track | Roy Rutishauser (University of Zurich), André N. Meyer (University of Zurich), Reid Holmes (University of British Columbia), Thomas Fritz (University of Zurich)
11:45 | 15m Talk | You Don’t Know Search: Helping Users Find Code by Automatically Evaluating Alternative Queries | SEIP - Software Engineering in Practice | Rijnard van Tonder (Sourcegraph)
12:00 | 7m Talk | TECHSUMBOT: A Stack Overflow Answer Summarization Tool for Technical Query | DEMO - Demonstrations | Chengran Yang (Singapore Management University), Bowen Xu (Singapore Management University), Jiakun Liu (Singapore Management University), David Lo (Singapore Management University)
12:07 | 8m Talk | An empirical study of question discussions on Stack Overflow | Journal-First Papers | Wenhan Zhu (University of Waterloo), Haoxiang Zhang (Centre for Software Excellence at Huawei Canada), Ahmed E. Hassan (Queen’s University), Michael W. Godfrey (University of Waterloo, Canada)
12:15 | 15m Talk | Faster or Slower? Performance Mystery of Python Idioms Unveiled with Empirical Evidence | Technical Track | Zejun Zhang (Australian National University), Zhenchang Xing, Xin Xia (Huawei), Xiwei (Sherry) Xu (CSIRO’s Data61), Liming Zhu (CSIRO’s Data61), Qinghua Lu (CSIRO’s Data61)