SSAR: A Novel Software Architecture Recovery Approach Enhancing Accuracy and Scalability
Software architecture is critical for software development and maintenance. As software evolves, its architecture may drift from the original design, resulting in architectural degradation that negatively affects software quality. To manage and maintain software systems effectively, architects need an accurate understanding of the current architecture, but manual analysis is time-consuming and error-prone. Numerous automated architecture recovery techniques have therefore been developed to facilitate this process. However, existing techniques often face limitations in either accuracy or efficiency, especially when dealing with large-scale software systems. In this paper, we propose a novel architecture recovery approach that integrates semantic similarity and structural dependencies between files to construct a weighted graph. It then applies an optimized community detection algorithm to partition the graph into modules. To evaluate the effectiveness of our approach, we selected nine open-source projects with ground-truth architectures and compared our approach against six state-of-the-art architecture recovery techniques. Experimental results show that our approach improves accuracy by 5.0% to 90.9%, 3.8% to 16.9%, and 12.5% to 500% on three well-known architecture similarity metrics (MoJoFM, a2a, and c2c_cvg), respectively. Additionally, it reduces the average execution time by 5% to 99%. In conclusion, our approach not only recovers software architecture more precisely but also significantly reduces the time and effort required for the recovery process.
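The two-stage idea described above (a weighted file graph, then community detection) can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper. Several things here are assumptions made for illustration: the term-count representation of files, the cosine measure, the mixing weight `alpha`, and all function names. Because the abstract does not name the optimized community detection algorithm, a simple deterministic weighted label propagation stands in for it.

```python
import math
from collections import defaultdict
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two sparse term-count dicts."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_weighted_graph(terms, deps, alpha=0.5):
    """Combine semantic and structural evidence into edge weights.

    terms: file name -> term counts (assumed semantic representation).
    deps:  set of (source, target) dependency pairs between files.
    alpha: hypothetical mixing weight between the two signals.
    """
    graph = {f: {} for f in terms}
    undirected = {frozenset(edge) for edge in deps}
    for u, v in combinations(sorted(terms), 2):
        structural = 1.0 if frozenset((u, v)) in undirected else 0.0
        weight = alpha * cosine(terms[u], terms[v]) + (1 - alpha) * structural
        if weight > 0:
            graph[u][v] = weight
            graph[v][u] = weight
    return graph


def label_propagation(graph, max_rounds=100):
    """Stand-in community detection: weighted label propagation.

    Nodes are visited in sorted order and ties go to the smallest label,
    so the result is deterministic.
    """
    labels = {node: node for node in graph}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            score = defaultdict(float)
            for neighbour, weight in graph[node].items():
                score[labels[neighbour]] += weight
            if not score:
                continue  # isolated file keeps its own label
            best = min(score, key=lambda lab: (-score[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    clusters = defaultdict(list)
    for node, label in labels.items():
        clusters[label].append(node)
    return sorted(sorted(members) for members in clusters.values())


# Toy input: four files, two intended modules, one weak cross-module dependency.
terms = {
    "lexer.c": {"token": 3, "parse": 1},
    "parser.c": {"token": 2, "parse": 3},
    "ui_button.c": {"widget": 2, "draw": 2},
    "ui_window.c": {"widget": 3, "draw": 1},
}
deps = {
    ("parser.c", "lexer.c"),
    ("ui_window.c", "ui_button.c"),
    ("parser.c", "ui_window.c"),
}
modules = label_propagation(build_weighted_graph(terms, deps))
print(modules)  # [['lexer.c', 'parser.c'], ['ui_button.c', 'ui_window.c']]
```

In this toy input the cross-module dependency between `parser.c` and `ui_window.c` carries only structural weight, while the within-module pairs are reinforced by shared vocabulary, so the partition follows the intended modules.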