ISPY: Automatic Issue-Solution Pair Extraction from Community Live Chats
Community live chats contain rich information that could help improve software quality and productivity. One important application is mining knowledge about issues and their potential solutions. However, accurately mining such knowledge remains challenging because live chat data consists of noisy, interleaved dialogs. In this paper, we first formulate the problem of issue-solution pair extraction from developer live chat data and then propose an automated approach, named ISPY, based on natural language processing and deep learning techniques with customized enhancements. Specifically, ISPY automates three tasks: 1) disentangling live chat logs, employing a feedforward neural network to separate a conversation history into individual dialogs; 2) detecting dialogs that discuss issues, using a novel convolutional neural network (CNN) that consists of a BERT-based utterance embedding layer, a context-aware dialog embedding layer, and an output layer; and 3) extracting appropriate utterances and combining them into the corresponding solutions, based on the same CNN structure but with different inputs. To evaluate ISPY, we compare it with six baselines on a dataset of 750 dialogs, including 171 issue-solution pairs, drawn from eight Gitter communities. The results show that, for issue detection, our approach achieves an F1 of 76% and outperforms all baselines by 30%. For solution extraction, our approach achieves an F1 of 63% and outperforms the baselines by 20%. Furthermore, we apply ISPY to three new communities to extensively evaluate its practical usage. Moreover, we publish over 30K issue-solution pairs extracted from 11 communities. We believe that ISPY can facilitate community-based software development by promoting knowledge sharing and shortening the issue-resolving process.
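The data flow through the three tasks can be sketched as follows. This is only an illustrative skeleton, not the authors' implementation: the names `Utterance`, `IssueSolutionPair`, and `extract_pairs` are hypothetical, the three models are passed in as plain callables, and treating the first utterance of a dialog as the issue description is an assumption made here for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Utterance:
    speaker: str
    text: str


Dialog = List[Utterance]


@dataclass
class IssueSolutionPair:
    issue: str
    solution: str


def extract_pairs(
    chat_log: List[Utterance],
    disentangle: Callable[[List[Utterance]], List[Dialog]],
    is_issue_dialog: Callable[[Dialog], bool],
    is_solution_utterance: Callable[[Dialog, Utterance], bool],
) -> List[IssueSolutionPair]:
    """Chain the three stages: disentangle, detect issues, extract solutions."""
    pairs: List[IssueSolutionPair] = []
    for dialog in disentangle(chat_log):          # task 1: split log into dialogs
        if not dialog or not is_issue_dialog(dialog):  # task 2: keep issue dialogs
            continue
        # Task 3: select the solution utterances and combine them.
        # Assumption: the first utterance states the issue, so it is skipped here.
        solution_parts = [
            u.text for u in dialog[1:] if is_solution_utterance(dialog, u)
        ]
        if solution_parts:
            pairs.append(
                IssueSolutionPair(issue=dialog[0].text,
                                  solution=" ".join(solution_parts))
            )
    return pairs


# Toy stand-ins for the learned models (keyword rules, purely for demonstration).
demo_log = [
    Utterance("alice", "How do I fix the build error?"),
    Utterance("bob", "Try clearing the cache."),
    Utterance("alice", "Thanks!"),
]
demo_pairs = extract_pairs(
    demo_log,
    disentangle=lambda log: [log],
    is_issue_dialog=lambda d: "error" in d[0].text.lower(),
    is_solution_utterance=lambda d, u: u.text.lower().startswith("try"),
)
```

Passing the stages as callables keeps the skeleton independent of any particular model; in the paper's setting each callable would be backed by a trained neural network rather than the keyword rules used in the demo.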
Tue 16 Nov (displayed time zone: Hobart)
18:00 - 19:00 | Mining and Issues (NIER track / Research Papers) at Koala | Chair(s): Hongyu Zhang (University of Newcastle)
18:00 | 20m Talk | VizSmith: Automated Visualization Synthesis by Mining Data-Science Notebooks | Research Papers | Rohan Bavishi (University of California at Berkeley), Shadaj Laddad (UC Berkeley), Hiroaki Yoshida (Fujitsu Laboratories of America, Inc.), Mukul Prasad (Fujitsu Research of America), Koushik Sen (University of California at Berkeley)
18:20 | 20m Talk | ISPY: Automatic Issue-Solution Pair Extraction from Community Live Chats | Research Papers | Lin Shi (Institute of Software at Chinese Academy of Sciences), Ziyou Jiang (Institute of Software at Chinese Academy of Sciences), Ye Yang (Stevens Institute of Technology), Xiao Chen (Institute of Software at Chinese Academy of Sciences), YuMin Zhang (Institute of Software Chinese Academy of Sciences), Fangwen Mu (Institute of Software Chinese Academy of Sciences), Hanzhi Jiang (Institute of Software at Chinese Academy of Sciences), Qing Wang (Institute of Software at Chinese Academy of Sciences)
18:40 | 10m Talk | Understanding Code Fragments with Issue Reports | NIER track
18:50 | 10m Talk | An Empirical Study on Obsolete Issue Reports | NIER track