How do Design Decisions Influence the Distribution of Software Metrics?
Background. Source code analysis techniques usually rely on metric-based assessment. However, most of these techniques have low accuracy. One possible reason is because metric thresholds are extracted from classes driven by distinct design decisions. Previous studies have already shown that classes implemented according to some coarse-grained design decisions, such as programming languages, have different distribution of metric values. Therefore, these design decisions should be taken into account when using benchmarks for metric-based source code analysis. Goal. Our goal is to investigate whether other fine-grained design decisions also influence over distribution of software metrics. Method. We conduct an empirical study to evaluate the distributions of four metrics applied over fifteen real-world systems based on three different domains. Initially, we evaluated the influence of the class design role on the distributions of measures. For this purpose, we have defined an automatic approach to identify the design role played by each class. Then, we looked for other fine-grained design decisions that could have influenced the measures. Results. Our findings show that distribution of metrics are sensitive to the following design decisions: (i) design role of the class (ii) used libraries, (iii) coding style, (iv) exception handling, and (v) logging and debugging code mechanisms. Conclusion. The distribution of software metrics are sensitive to fine-grained design decisions and we should consider taking them into account when building benchmarks for metric-based source code analysis.