Identifying Key Classes for Initial Software Comprehension: Can We Do It Better?
Key classes are excellent starting points for developers, especially newcomers, to comprehend an unknown software system. Many unsupervised key class identification approaches have been proposed in the literature. They represent software as class dependency networks (aka software networks) and apply network metrics (e.g., h-index, a-index, and coreness). However, none of them accounts for the field in which the nodes exist or for the effect of that field on the importance of the nodes in it. According to classic field theory in physics, every material particle is in a field through which it exerts an impact on other particles in the field via non-contact interactions (e.g., electromagnetic force, gravity, and nuclear force). Similarly, every node in a software network might also exist in a field, which might affect the importance of class nodes in it. In this paper, we propose an approach, iFit, to identify key classes in object-oriented software systems. First, we represent software as a CSNWD (Weighted Directed Class-level Software Network) to capture the topological structure of software, including classes, their couplings, and the direction and strength of couplings. Second, we assume that the nodes in the CSNWD exist in a gravitation-like field and propose a new metric, CG (Cumulative Gravitation-like importance), to measure the importance of classes. CG is inspired by Newton’s gravitational formula and uses the PageRank values computed by a biased-PageRank algorithm as the masses of classes. Finally, classes in the system are sorted in descending order of their CG values and a cutoff is applied, that is, the top-ranked classes are recommended as key classes. The experiments were performed on a data set composed of six open-source Java systems from the literature. The results show that iFit is superior to the baseline approaches in 93.75% of the cases and is scalable to large-scale software systems.
In addition, we find that iFit is neutral to the weighting mechanism used to assign weights to the different coupling types in the CSNWD, that is, any one of the weighting mechanisms can be used when applying iFit to identify key classes.
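The pipeline described in the abstract (weighted directed network, PageRank-based masses, a gravitation-like score, then a top-k cutoff) can be illustrated with a minimal Python sketch. The abstract does not give the exact CG formula or the bias used in the PageRank step, so everything concrete here is an assumption made for illustration: the transition probabilities are taken proportional to coupling weights, the distance between two classes is the hop count in the undirected version of the network, and the score of class i is the sum over reachable classes j of m_i * m_j / d(i, j)^2. The function names and the toy class names are hypothetical and are not taken from iFit.

```python
from collections import deque


def weighted_pagerank(edges, damping=0.85, iters=100):
    """Power-iteration PageRank on {(src, dst): weight} (assumed stand-in for biased PageRank)."""
    nodes = sorted({n for edge in edges for n in edge})
    out_weight = {n: 0.0 for n in nodes}
    for (src, _), w in edges.items():
        out_weight[src] += w
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        # Rank held by classes with no outgoing coupling is spread uniformly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for (src, dst), w in edges.items():
            new_rank[dst] += damping * rank[src] * w / out_weight[src]
        rank = new_rank
    return rank


def hop_distances(edges, source):
    """Breadth-first hop counts from source, ignoring edge direction (an assumption)."""
    neighbours = {}
    for src, dst in edges:
        neighbours.setdefault(src, set()).add(dst)
        neighbours.setdefault(dst, set()).add(src)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def cumulative_gravitation(edges):
    """Assumed gravitation-like score: sum of m_i * m_j / d(i, j)**2 over reachable j."""
    mass = weighted_pagerank(edges)
    scores = {}
    for node in mass:
        dist = hop_distances(edges, node)
        scores[node] = sum(
            mass[node] * mass[other] / d ** 2
            for other, d in dist.items()
            if other != node
        )
    return scores


def top_key_classes(edges, k):
    """Sort classes by descending score (ties broken by name) and keep the top k."""
    scores = cumulative_gravitation(edges)
    return sorted(scores, key=lambda n: (-scores[n], n))[:k]


if __name__ == "__main__":
    # Hypothetical toy network: three classes that all depend on one hub class.
    toy = {("A", "Hub"): 1.0, ("B", "Hub"): 1.0, ("C", "Hub"): 1.0}
    print(top_key_classes(toy, k=1))
```

In this sketch the cutoff is simply the parameter `k`. How the cutoff is actually chosen, and how coupling types map to edge weights, is not stated in the abstract.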
Fri 19 May (displayed time zone: Hobart)
11:00 - 12:30 | Program comprehension | Technical Track / Journal-First Papers at Meeting Room 103 | Chair(s): Oscar Chaparro (College of William and Mary)
11:00 | 15m | Talk | Code Comprehension Confounders: A Study of Intelligence and Personality | Journal-First Papers
11:15 | 15m | Talk | Identifying Key Classes for Initial Software Comprehension: Can We Do It Better? | Technical Track | Weifeng Pan (Zhejiang Gongshang University, China), Xin Du (Zhejiang Gongshang University, China), Hua Ming (Oakland University), Dae-Kyoo Kim (Oakland University), Zijiang Yang (Xi'an Jiaotong University and GuardStrike Inc)
11:30 | 15m | Talk | Improving API Knowledge Discovery with ML: A Case Study of Comparable API Methods | Technical Track | Daye Nam (Carnegie Mellon University), Brad A. Myers (Carnegie Mellon University), Bogdan Vasilescu (Carnegie Mellon University), Vincent J. Hellendoorn (Carnegie Mellon University)
11:45 | 15m | Talk | Evidence Profiles for Validity Threats in Program Comprehension Experiments | Technical Track | Marvin Muñoz Barón (University of Stuttgart), Marvin Wyrich (Saarland University), Daniel Graziotin (University of Stuttgart), Stefan Wagner (University of Stuttgart)
12:00 | 15m | Talk | Developers’ Visuo-spatial Mental Model and Program Comprehension | Technical Track
12:15 | 15m | Talk | Two Sides of the Same Coin: Exploiting the Impact of Identifiers in Neural Code Comprehension | Technical Track | Shuzheng Gao (Harbin Institute of Technology), Cuiyun Gao (Harbin Institute of Technology), Chaozheng Wang (Harbin Institute of Technology), Jun Sun (Singapore Management University), David Lo (Singapore Management University), Yue Yu (College of Computer, National University of Defense Technology, Changsha 410073, China)