Identifying Key Classes for Initial Software Comprehension: Can We Do It Better? (ICSE 2023 - Technical Track)

Who

Weifeng Pan, Xin Du, Hua Ming, Dae-Kyoo Kim, Zijiang Yang

Track

ICSE 2023 Technical Track

Time Zone

The program is currently displayed in (GMT+10:00) Hobart.

Use conference time zone: (GMT+10:00) HobartSelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Fri 19 May 2023 11:15 - 11:30 at Meeting Room 103 - Program comprehension Chair(s): Oscar Chaparro

Abstract

Key classes are excellent starting points for developers, especially newcomers, to comprehend an unknown software system. Though many unsupervised key class identification approaches have been proposed in the literature by representing software as class dependency networks (aka software networks) and using some network metrics (e.g., h-index, a-index, and coreness), they are never aware of the field where the nodes exist and the effect of the field on the importance of the nodes in it. According to the classic field theory in physics, every material particle is in a field through which they exert an impact on other particles in the field via non-contact interactions (e.g., electromagnetic force, gravity, and nuclear force). Similarly, every node in a software network might also exist in a field, which might affect the importance of class nodes in it. In this paper, we propose an approach, iFit, to identify key classes in object-oriented software systems. First, we represent software as a CSN_WD (Weighted Directed Class-level Software Network) to capture the topological structure of software, including classes, their couplings, and the direction and strength of couplings. Second, we assume that the nodes in the CSN_WD exist in a gravitation-like field and propose a new metric, CG (Cumulative Gravitation-like importance), to measure the importance of classes. CG is inspired by Newton’s gravitational formula and uses the PageRank value computed by a biased-PageRank algorithm as the masses of classes. Finally, classes in the system are sorted in descending order according to their CG values, and a cutoff is utilized, that is, the top-ranked classes are recommended as key classes. The experiments were performed on a data set composed of six open-source Java systems from the literature. The results show that iFit is superior to the baseline approaches on 93.75% of the cases, and is scalable to large-scale software systems. Besides, we find that iFit is neutral to the weighting mechanisms used to assign the weights for different coupling types in the CSN_WD, that is, when applying iFit to identify key classes, we can use any one of the weighting mechanisms.

Weifeng Pan

Zhejiang Gongshang University, China

China

Xin Du

Zhejiang Gongshang University, China

Hua Ming

Oakland University

United States

Dae-Kyoo Kim

Oakland University

United States

Zijiang Yang

Xi'an Jiaotong University and GuardStrike Inc