SWEnergy: An Empirical Study on Energy Efficiency in Agentic Issue Resolution Frameworks with SLMs
\textbf{Context.} Autonomous agents powered by Large Language Models (LLMs) are increasingly used for software engineering, but their reliance on large, proprietary models limits deployment on local hardware. This has spurred interest in Small Language Models (SLMs), yet their practical effectiveness and efficiency within complex agentic frameworks for automated issue resolution remain poorly understood.
\textbf{Goal.} We investigate the performance, energy efficiency, and resource consumption of four leading agentic issue resolution frameworks when deliberately constrained to using SLMs. Our goal is to understand the viability of these systems for this task in resource-limited settings and characterize the resulting trade-offs.
\textbf{Method.} We conduct a controlled evaluation of four leading agentic frameworks (SWE-Agent, OpenHands, Mini SWE Agent, AutoCodeRover) using two SLMs (Gemma-3 4B, Qwen-3 1.7B) on the SWE-bench Verified Mini benchmark. On fixed hardware, we measure energy, duration, token usage, and memory over 150 runs per configuration.
\textbf{Results.} We find that framework architecture is the primary driver of energy consumption. The most energy-intensive framework, AutoCodeRover (Gemma), consumed 9.4$\times$ more energy on average than the least energy-intensive, OpenHands (Gemma). However, this energy is largely wasted. Task resolution rates were near-zero (4\% for AutoCodeRover, 0\% for all others), demonstrating that current frameworks, when paired with SLMs, consume significant energy on unproductive reasoning loops. The SLM’s limited reasoning was the bottleneck for \textit{success}, but the framework’s design was the bottleneck for \textit{efficiency}.
\textbf{Conclusions.} Current agentic frameworks, designed for powerful LLMs, fail to operate efficiently with SLMs. Framework architecture determines how much energy is consumed, yet most of that energy is wasted because the SLMs’ limited reasoning rarely produces a resolved issue. Achieving viable, low-energy solutions requires a paradigm shift from passive orchestration to new architectures that actively manage the SLM’s weaknesses.