Reddit as a New Source of User Feedback for Software Requirements
Mining app stores and social media has proven to be a good way to collect user feedback that supports requirements engineering and software evolution. Recent literature on mining software-related data from social platforms, including Twitter and Facebook, shows that such data complements app store mining. However, users also discuss software applications and give feedback about them on many other platforms, which have not been thoroughly explored and analyzed. As little is known about the data and content related to software applications on Reddit, we introduce it as a new potential data source. The main research question addressed in this paper is whether and how requirements engineering and software evolution can benefit from user feedback obtained from Reddit. We performed an exploratory study in which we analyzed the usage characteristics (frequency of posts, number of comments, and number of users for each subreddit) of Reddit posts about software applications. Furthermore, we investigated the content of the posts and found that they contain information relevant to requirements engineering and software evolution. Finally, we investigated the potential for automation by applying machine learning algorithms to the unstructured and noisy Reddit data to classify posts into bug reports, features, and other categories. The classifier based on the support vector machine (SVM) algorithm performed best, achieving a macro F1-score of 84%. Our results demonstrate that Reddit posts provide useful feedback about software applications, which can be used to improve requirements engineering and software evolution processes. In this way, Reddit complements the existing data sources. A limitation of this study is that the results have not been validated by requirements analysts and software engineers.
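For readers unfamiliar with the metric, the macro F1-score is the unweighted mean of the per-class F1-scores, so each category counts equally regardless of how many posts it contains. The following is a minimal sketch of that computation in plain Python. The function name `macro_f1` and the toy labels are assumptions made for illustration; they are not the study's data or code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic
        # mean of precision and recall for this class.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels using the three categories named in the abstract.
y_true = ["bug", "bug", "feature", "feature", "other", "other"]
y_pred = ["bug", "feature", "feature", "feature", "other", "bug"]
print(round(macro_f1(y_true, y_pred), 4))
```

Because every class contributes equally to the average, a classifier that does well only on the most frequent category cannot reach a high macro F1-score.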