ICSME 2025
Sun 7 - Fri 12 September 2025 Auckland, New Zealand

Large Language Models (LLMs) are widely adopted for automated code generation with promising results. Although prior research has assessed LLM-generated code and identified various quality issues, such as redundancy, poor maintainability, and sub-optimal performance, a systematic understanding and categorization of these inefficiencies remains unexplored. Therefore, we empirically investigate inefficiencies in code generated by state-of-the-art models, namely CodeLlama, DeepSeek-Coder, and CodeGemma. To do so, we manually analyze 492 generated code snippets on the HumanEval+ dataset. We then construct a taxonomy of inefficiencies in LLM-generated code that includes 5 categories (General Logic, Performance, Readability, Maintainability, and Errors) and 19 subcategories. We validate the taxonomy through an online survey with 58 LLM practitioners and researchers. The surveyed participants affirmed the completeness of the proposed taxonomy, as well as the relevance and popularity of the identified code inefficiency patterns. Our qualitative findings indicate that inefficiencies are diverse and interconnected, affecting multiple aspects of code quality; logic- and performance-related inefficiencies are the most frequent and often co-occur. Our taxonomy provides a structured basis for evaluating the quality of LLM-generated code and for guiding future research to improve code generation efficiency.