Locating requirements in backlog items: Content analysis and experiments with large language models
Context: Due to the rise of agile software development, requirements are increasingly managed via issue tracking systems (ITSs). In addition to representing requirements, ITSs help development teams allocate work items to team members. These systems provide a single point of access to a variety of information, including the (sprint and product) backlog and the task board.

Objective: We first tackle a knowledge problem, addressing questions such as: How are requirements formulated in ITSs? Which types of requirements are represented? At which level of granularity? Second, we investigate the potential of automated techniques for identifying requirements in backlog items. While research exists on the automated extraction and classification of requirements, the informal nature of the backlog and the lack of standardized writing patterns call for research into new techniques for this task.

Method: We perform a quantitative content analysis of backlog items sampled from seven open-source and seven proprietary projects. To explore automated techniques for locating requirements, we experiment with five large language models (LLMs), given their established significance in NLP.

Results: The findings show that user-oriented functional requirements are the most prevalent type. In addition, backlog items are often inconsistently labeled and frequently contain multiple requirements at different levels of granularity. The experiments with LLMs reveal that encoder-only models (BERT and RoBERTa) are better suited than decoder-only models (Llama 3, Mistral 7B, and GPT-4 in ChatGPT) for extracting and classifying requirements in backlogs.

Conclusions: The findings provide insights into, and patterns of, requirements documentation in ITSs, leading to a better empirical understanding of Agile RE. The experimental results with LLMs lay the foundation for developing unobtrusive automated tools that identify and classify requirements in ITSs.
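To make the encoder-only setup mentioned in the Results more concrete, the sketch below fine-tunes RoBERTa to classify backlog-item sentences into requirement types with the HuggingFace Transformers library. This is a minimal illustration, not the study's actual pipeline: the example sentences, the three-way label scheme, and the hyperparameters are all hypothetical assumptions.

```python
# Minimal sketch (illustrative only): fine-tuning an encoder-only model (RoBERTa)
# to classify backlog-item sentences as functional requirement, quality
# requirement, or non-requirement. Data, labels, and hyperparameters are
# assumptions, not the paper's setup.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import Dataset

# Hypothetical labeled sentences extracted from backlog items.
examples = {
    "text": [
        "As a user, I want to filter issues by label so that I can find my tasks.",
        "The search endpoint must respond within 200 ms.",
        "Discussed the sprint scope with the team yesterday.",
    ],
    "label": [0, 1, 2],  # 0 = functional, 1 = quality, 2 = non-requirement
}

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base",
                                                           num_labels=3)

# Tokenize the sentences; the Trainer uses the "label" column as targets.
dataset = Dataset.from_dict(examples).map(
    lambda batch: tokenizer(batch["text"], truncation=True,
                            padding="max_length", max_length=128),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=dataset,
)
trainer.train()
```

In practice, such a classifier would be trained on sentences annotated during the content analysis and applied sentence by sentence to new backlog items, which is one way the unobtrusive tooling envisioned in the Conclusions could be realized.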