ICSE 2020
Wed 24 June - Thu 16 July 2020
Thu 9 Jul 2020 07:12 - 07:20 at Silla - I15-Ecosystems 1 Chair(s): Raula Gaikovina Kula

Software developers have benefited from various sources of knowledge, such as forums, question-and-answer sites, and social media platforms, to help them in various tasks. However, extracting software-related knowledge from different platforms involves many challenges. Our work is motivated by the following use cases: (1) We consider a developer who wants to acquire new knowledge from software-relevant tweets. Such a developer has to deal with the huge number of irrelevant tweets produced on Twitter. As the developer has limited time to inspect new tweets, a content aggregator for Twitter data related to software development is essential. More generally, automatic identification of software-relevant tweets will also enable downstream applications, such as the creation of a specialized Twitter feed for the developer community. (2) We consider a software developer who acts as a content creator and publishes a coding tutorial on YouTube. Viewers can visually follow the instructions provided in the videos and leave comments that express their experience with the video. Digesting information taken from the comments will help the content creator engage more with their audience and improve their future videos. Automatic filtering of relevant comments will enable creators to study viewer feedback more efficiently and, as with tweets, such filtering can be used to improve downstream analytics tasks, such as detection of common topics among relevant comments. On both platforms (Twitter and YouTube comments), sentences are typically short, contain a lot of noise, and may contain non-standard words. To address these challenges, we propose SIEVE (Sulistya et al., 2019), an approach to improve the effectiveness of knowledge extraction tasks by performing cross-platform analysis.
Our approach is based on transfer representation learning and word embedding, leveraging information extracted from a source platform that contains rich domain-related content. The extracted information is then used to solve tasks on another platform (the target platform) with less domain-related content. We first build a word embedding model as a representation learned from the source platform, and then use the model to improve the performance of knowledge extraction tasks on the target platform. We experiment with Software Engineering Stack Exchange and Stack Overflow as source platforms, and with two different target platforms, i.e., Twitter and YouTube. We conducted experiments based on the existing datasets provided by Sharma et al. for Twitter, and Poche et al. for YouTube comments. Our experiments show the effectiveness of our proposed cross-platform analysis approach, which achieves performance improvements of up to 28% and 10.3% for the first and second use cases, respectively.
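
The source-to-target idea described above can be illustrated with a minimal sketch. This is not the SIEVE implementation: co-occurrence counts stand in for a trained word embedding model, a nearest-centroid rule stands in for whatever classifier the authors used, and every function name and all toy texts below are hypothetical.

```python
import math
from collections import defaultdict

# Toy stand-ins (assumed data, not the paper's datasets).
SOURCE_DOCS = [  # source platform, e.g. Q&A posts rich in software vocabulary
    "python exception handling bug",
    "java exception stack trace bug",
    "python unit test framework",
]
TARGET_LABELED = [  # target platform, e.g. a few labeled tweets
    ("fixing a python bug today", "relevant"),
    ("lovely coffee this morning", "irrelevant"),
]

def build_embeddings(source_docs, window=2):
    """Learn word vectors from the source platform as co-occurrence counts."""
    vocab = sorted({w for doc in source_docs for w in doc.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = {w: [0.0] * len(vocab) for w in vocab}
    for doc in source_docs:
        tokens = doc.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][index[tokens[j]]] += 1.0
    return vectors

def embed_text(text, vectors):
    """Represent a target text as the average vector of its known words."""
    dim = len(next(iter(vectors.values())))
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        return [0.0] * dim  # no source-platform vocabulary in this text
    return [sum(col) / len(known) for col in zip(*known)]

def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def train_centroids(labeled_texts, vectors):
    """Average the embedded training texts of each label."""
    grouped = defaultdict(list)
    for text, label in labeled_texts:
        grouped[label].append(embed_text(text, vectors))
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }

def predict(text, vectors, centroids):
    """Pick the label whose centroid is most similar to the embedded text."""
    x = embed_text(text, vectors)
    # Labels are visited in sorted order, so ties go to the first label.
    return max(sorted(centroids), key=lambda label: cosine(x, centroids[label]))

vectors = build_embeddings(SOURCE_DOCS)
centroids = train_centroids(TARGET_LABELED, vectors)
print(predict("java exception again", vectors, centroids))
```

In this sketch the new tweet shares no word with the labeled tweets, yet it can still be matched because "java" and "exception" received vectors from the source corpus, which is the kind of benefit the cross-platform representation is meant to provide.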

Thu 9 Jul

Displayed time zone: (UTC) Coordinated Universal Time

07:00 - 08:00
07:00
12m
Talk
Impact Analysis of Cross-Project Bugs on Software Ecosystems
Technical Papers
Wanwangying Ma Nanjing University, Lin Chen Nanjing University, Xiangyu Zhang Purdue University, Yang Feng Nanjing University, Zhaogui Xu Nanjing University, China, Zhifei Chen Huawei, Yuming Zhou Nanjing University, Baowen Xu Nanjing University
07:12
8m
Talk
SIEVE: Helping Developers Sift Wheat from Chaff via Cross-Platform Analysis
Journal First
Agus Sulistya Telkom Institute of Technology Surabaya, Gede Artha Azriadi Prana Singapore Management University, Abhishek Sharma Singapore Management University, Singapore, David Lo Singapore Management University, Christoph Treude The University of Adelaide
07:20
18m
Talk
Sharing at Scale: An Open-Source-Software-based License Compliance Ecosystem
Software Engineering in Practice
Frances Paulisch Siemens Healthineers, Arun Azhakesan Siemens Healthineers
07:38
8m
Talk
Extended abstract “Software Deployment on Heterogeneous Platforms: A Systematic Mapping Study”
Journal First
Hugo Andrade Chalmers University of Technology, Jan Schroeder Chalmers | University of Gothenburg, Ivica Crnkovic Chalmers | University of Gothenburg
07:46
8m
Talk
A Large Scale Study of Long-Time Contributor Prediction for GitHub Projects
Journal First
Lingfeng Bao Zhejiang University, Xin Xia Monash University, David Lo Singapore Management University, Gail Murphy University of British Columbia