Online chatrooms are gaining popularity as a communication channel between widely distributed developers of Open Source Software (OSS) projects. Most discussion threads in chatrooms follow a Q&A format, with some developers (askers) raising an initial question and others (respondents) joining in to provide answers. These discussion threads contain rich information that can satisfy the diverse needs of various OSS stakeholders. However, retrieving information from threads is challenging, as it requires a thread-level analysis to understand the context. Moreover, chat data is transient and unstructured, consisting of entangled informal conversations. In this paper, we address this challenge by identifying the information types available in developer chats and introducing an automated mining technique. Through manual examination of chat data from three chatrooms on Gitter, using card sorting, we build a thread-level taxonomy with nine information categories and create a labeled dataset of 2,959 threads. We propose a classification approach (named F2Chat) that automatically structures the vast number of threads by information type, helping stakeholders quickly acquire the information they need. F2Chat combines handcrafted non-textual features with deep textual features extracted by neural models. Specifically, it has two stages: the first leverages a Siamese architecture to pretrain the textual feature encoder, and the second performs an in-depth fusion of the two types of features. Evaluation results suggest that our approach achieves an average F1-score of 0.628, which improves the baseline by 57%. Experiments also verify the effectiveness of our identified non-textual features under both intra-project and cross-project validation.
Thu 18 Nov (displayed time zone: Hobart)
19:00 - 20:00 | Developers | Research Papers / Industry Showcase / NIER track | at Kangaroo | Chair(s): Chetan Arora (Deakin University)
19:00 | 20m | Talk | Automating Developer Chat Mining | Research Papers | Shengyi Pan (Zhejiang University); Lingfeng Bao (Zhejiang University); Xiaoxue Ren (Zhejiang University); Xin Xia (Huawei Software Engineering Application Technology Lab); David Lo (Singapore Management University); Shanping Li (Zhejiang University)
19:20 | 20m | Talk | Thinking Like a Developer? Comparing the Attention of Humans with Neural Models of Code | Research Papers
19:40 | 10m | Talk | Infrastructure in Code: Towards developer-friendly cloud applications | Industry Showcase | Vladislav Tankov (Higher School of Economics, JetBrains, JetBrains Research); Dmitriy Valchuk (JetBrains, ITMO University); Yaroslav Golubev (JetBrains Research); Timofey Bryksin (JetBrains Research; HSE University)
19:50 | 10m | Talk | Towards Fluid Software Architectures: Bidirectional Human-AI Interaction | NIER track