Automated Software Architecture Design Recovery from Source Code Using LLMs
Research Track Paper
Recent advancements in Large Language Models (LLMs) offer promising opportunities for automating software architecture recovery (SAR). In this study, we assess the effectiveness of state-of-the-art general-purpose LLMs when used as off-the-shelf tools by practitioners seeking architectural insights from source code. We evaluate four models across three key tasks: (i) identifying implementation-level class diagrams, (ii) identifying architectural and design patterns, and (iii) identifying architectural styles. The experiment adopts a realistic usage setting, combining prompt engineering with a Self-Reflection mechanism to simulate how users iteratively refine queries. Results show that LLMs can support SAR activities, particularly in identifying structural and stylistic elements, but they struggle with complex abstractions such as class relationships and fine-grained design patterns. In addition to performance evaluation, we analyze the types of errors made by the models and assess the impact of Self-Reflection in refining their outputs, offering deeper insights into LLM behavior and highlighting implications for future research and practice.
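As an illustration only, the Self-Reflection setting described above could be sketched as a loop in which the model is shown its own previous answer and asked to revise it. The abstract does not specify the prompts, the number of rounds, or the model interface, so the function name `self_reflect`, the `model` callable, the `rounds` parameter, and the prompt wording below are all assumptions rather than the study's actual protocol.

```python
from typing import Callable, List


def self_reflect(
    model: Callable[[str], str],  # hypothetical: any function mapping a prompt to a reply
    task_prompt: str,
    source_code: str,
    rounds: int = 2,              # assumed default; the abstract gives no round count
) -> List[str]:
    """Return the initial answer followed by one revised answer per round."""
    base = f"{task_prompt}\n\nSource code:\n{source_code}"
    answers = [model(base)]
    for _ in range(rounds):
        # Show the model its latest answer and ask it to check and correct it.
        reflection_prompt = (
            f"{base}\n\n"
            f"Your previous answer:\n{answers[-1]}\n\n"
            "Review the previous answer for mistakes and return a corrected answer."
        )
        answers.append(model(reflection_prompt))
    return answers
```

Passing a stub in place of `model` makes the loop testable without calling any LLM; in practice the callable would wrap whichever chat interface a practitioner uses.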