A Data-driven Approach for Automated Quality Concern Extraction from App Reviews
The rapid growth of mobile applications has sharply increased the volume of user-generated reviews on platforms such as the Google Play Store. These reviews provide vital feedback on software quality, touching on attributes such as ease of use, performance, security, and reliability. However, their unstructured, informal, and often vague nature makes manual analysis challenging. Automated solutions based on traditional machine learning (ML) and deep learning (DL) have been proposed, but they offer only limited automation: such methods often lack deep contextual understanding and depend on hand-crafted features, which reduces their effectiveness in multi-label classification. The proposed research aims to automate the extraction of software quality concerns from mobile app reviews using transformer-based models, leveraging self-attention mechanisms and contextual embeddings to improve semantic understanding while reducing reliance on manual feature engineering. Extracted concerns are organized according to the ISO/IEC 25010 quality model, enabling structured quality evaluation and automated assessment. A dataset of 20,000 real-world app reviews will be used for evaluation, with performance measured by precision, recall, and F1-score. The anticipated outcome is a multi-label classification system that substantially improves the automation and accuracy of software quality analysis, providing actionable insights for developers and quality assurance teams.
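Because each review can raise several quality concerns at once, evaluation must use multi-label variants of precision, recall, and F1. The sketch below shows one common choice, micro-averaging over per-review label sets; the label names follow the ISO/IEC 25010 characteristics mentioned above, and the example reviews and predictions are invented for illustration only.

```python
# Minimal sketch (assumption: micro-averaged scoring over label sets;
# the gold/pred data below is hypothetical, not from the actual dataset).

def micro_prf(gold, pred):
    """Micro-averaged precision, recall, and F1 for multi-label output.

    gold, pred: parallel lists of label sets, one set per review.
    """
    tp = sum(len(g & p) for g, p in zip(gold, pred))  # correctly predicted labels
    fp = sum(len(p - g) for g, p in zip(gold, pred))  # spurious labels
    fn = sum(len(g - p) for g, p in zip(gold, pred))  # missed labels
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy example: three reviews labeled with ISO/IEC 25010-style concerns.
gold = [{"usability", "performance"}, {"security"}, {"reliability"}]
pred = [{"usability"}, {"security", "performance"}, {"reliability"}]
p, r, f = micro_prf(gold, pred)  # each is 0.75 here
```

Micro-averaging weights every label decision equally, so frequent concerns dominate the score; a macro-averaged variant (averaging per-label F1) could be reported alongside it if rare quality concerns matter.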