Software evolution is the process of developing, maintaining, and updating software systems. It is known that the software systems tend to increase their complexity and size over their evolution to meet the demands required by the users. Due to this fact, researchers have increasingly carried out studies on software evolution to understand the systems’ evolution pattern and propose techniques to overcome inherent problems in software evolution. Many of these works collect data but do not make them publicly available. Many datasets on software evolution are outdated, and/or are small, and some of them do not provide time series from software metrics. We propose an extensive software evolution dataset with temporal information about open-source Java systems. To build this dataset, we proposed a methodology of four steps: selecting the systems using a criterion, extracting and measuring their releases, and generating their time series. Our dataset contains time series of 46 software metrics extracted from 46 open-source Java systems, and we make it publicly available.
Viktor Csuvik Department of Software Engineering, MTA-SZTE Research Group on Artificial Intelligence, University of Szeged, Szeged, Hungary, László Vidács University of Szeged, Hungary