T-FREX: A Transformer-based Feature Extraction Method from Mobile App Reviews
Mobile app reviews are a large-scale data source for software-related knowledge generation activities, including software maintenance, evolution and feedback analysis. Effective extraction of features (i.e., functionalities or characteristics) from these reviews is key to supporting tasks such as analyzing the acceptance of these features, identifying relevant new feature requests, and prioritizing feature development. Traditional methods rely on syntactic pattern-based approaches, which are typically context-agnostic, evaluated on a closed set of apps, difficult to replicate, and limited to a small set of apps and domains. Meanwhile, the pervasiveness of Large Language Models (LLMs) based on the Transformer architecture in software engineering tasks lays the groundwork for empirically evaluating how well these models support feature extraction. In this study, we present T-FREX, a Transformer-based, fully automatic approach for mobile app review feature extraction. First, we collect a set of ground truth features from users in a real crowdsourced software recommendation platform and transfer them automatically into a dataset of app reviews. Then, we use this newly created dataset to fine-tune multiple LLMs on a named entity recognition task under different data configurations. We assess the performance of T-FREX with respect to this ground truth, and we complement our analysis by comparing T-FREX with a baseline method from the field. Finally, we assess the quality of new features predicted by T-FREX through an external human evaluation. Results show that, on average, T-FREX outperforms the traditional syntactic-based method, especially when discovering new features from a domain for which the model has been fine-tuned.
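To make the named entity recognition framing concrete, the following is a minimal sketch of how known feature names could be transferred into a review as token-level labels, and how labels could be decoded back into feature strings. The abstract does not describe the annotation procedure, so everything here is an assumption for illustration: the BIO scheme, the `B-feature` / `I-feature` label names, the exact-match strategy, the simple regex tokenizer, and the helper names `tokenize`, `bio_tag` and `extract_features`. It is not the authors' implementation.

```python
import re
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bio_tag(review: str, features: List[str]) -> List[Tuple[str, str]]:
    """Label each review token as B-feature, I-feature or O by exact matching.

    Hypothetical label names; the source does not specify a tagging scheme.
    """
    tokens = tokenize(review)
    tags = ["O"] * len(tokens)
    # Try longer features first so multi-word features win over sub-phrases.
    for feature in sorted(features, key=lambda f: -len(tokenize(f))):
        feat = tokenize(feature)
        n = len(feat)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            window_free = all(t == "O" for t in tags[i:i + n])
            if tokens[i:i + n] == feat and window_free:
                tags[i] = "B-feature"
                for j in range(i + 1, i + n):
                    tags[j] = "I-feature"
    return list(zip(tokens, tags))


def extract_features(tagged: List[Tuple[str, str]]) -> List[str]:
    """Decode a BIO-tagged token sequence back into feature strings."""
    spans: List[str] = []
    current: List[str] = []
    for token, tag in tagged:
        if tag == "B-feature":
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I-feature" and current:
            current.append(token)
        else:
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


if __name__ == "__main__":
    review = "I love the video call option, but sharing photos is slow."
    tagged = bio_tag(review, ["video call", "sharing photos"])
    for token, tag in tagged:
        print(f"{token}\t{tag}")
    print(extract_features(tagged))
```

In a fine-tuning setup, sequences labeled this way would serve as training data for a token classification model, and the decoding step would be applied to the model's predicted tags rather than to the matched ones.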
Wed 13 Mar (displayed time zone: Athens)
14:00 - 15:30 | Mobile Apps | Research Papers / Tools Demo Track / Early Research Achievement (ERA) Track at KURU | Chair(s): Daniel Feitosa, University of Groningen
14:00 | 15m | Talk | Accurate and Efficient Code Matching Across Android Application Versions against Obfuscation | Research Papers | Runhan Feng, Shanghai Jiao Tong University; Zhuohao Zhang, University of Electronic Science and Technology of China; Yetong Zhou, Shanghai Jiao Tong University; Ziyang Yan, Shanghai Jiao Tong University; Yuanyuan Zhang, Shanghai Jiao Tong University
14:15 | 15m | Talk | Understanding Android OS Forward Compatibility Support for Legacy Apps: A Data-Driven Analysis | Research Papers | Shuang Li, Shandong University; Rui Li, Shandong University; Yifan Yu, Shandong University; Kailun Yan, Shandong University; Shishuai Yang, Shandong University; Wenrui Diao, Shandong University
14:30 | 15m | Talk | T-FREX: A Transformer-based Feature Extraction Method from Mobile App Reviews | Research Papers | Quim Motger, Universitat Politècnica de Catalunya; Alessio Miaschi, ItaliaNLP Lab, Institute for Computational Linguistics “A. Zampolli” (CNR-ILC), Pisa; Felice Dell'Orletta, ItaliaNLP Lab, Istituto di Linguistica Computazionale “Antonio Zampolli”; Xavier Franch, Universitat Politècnica de Catalunya; Jordi Marco, Universitat Politècnica de Catalunya | Pre-print
14:45 | 15m | Talk | PredRacer: Predictively detecting data races in android applications | Research Papers | Xin Guo, School of Computer Science and Engineering, Southeast University; Xiaofang Qi, School of Computer Science and Engineering, Southeast University; Yanhui Li, Nanjing University; Chao Wu, School of Computer Science and Engineering, Southeast University
15:00 | 7m | Talk | PMDET: Automated Detection Tool of Android Parcel Mismatch | Tools Demo Track | Yunfan Zhan, Shanghai Jiao Tong University; Qidan He, Jingdong Group; Yijun Wang, Shanghai Jiao Tong University; Xiuzhen Chen, Shanghai Jiao Tong University
15:07 | 15m | Talk | JNFuzz-Droid: A Lightweight Fuzzing and Taint Analysis Framework for Android Native Code | Research Papers | Jianchao Cao, Jiangxi Normal University; Fan Guo, Jiangxi Normal University; Yanwen Qu, Jiangxi Normal University
15:22 | 7m | Talk | Extending Refactoring Detection to Kotlin: A Dataset and Comparative Study | Early Research Achievement (ERA) Track | Iman Hemati Moghadam, Formal Methods and Tools, University of Twente; Mohammad Mehdi Afkhami, Computer Engineering Department, Vali-e-Asr University of Rafsanjan; Parsa Kamalipour, Computer Engineering Department, Vali-e-Asr University of Rafsanjan; Vadim Zaytsev, University of Twente, Netherlands