Software Artifact Mining in Software Engineering Conferences: A Meta-Analysis (ESEIW 2022 - ESEM Technical Papers Track)

Who

Zeinab Abou Khalil, Stefano Zacchiroli

Track

ESEIW 2022 ESEM Technical Papers

Time Zone

Times are shown in the conference time zone, (GMT+03:00) Athens.

When

Thu 22 Sep 2022 11:15 - 11:35 at Bysa - Session 1A - Behavioral Software Engineering. Chair(s): Valentina Lenarduzzi

Abstract

Software development activities produce numerous types of artifacts as byproducts: source code, version control system metadata, bug reports, mailing list conversations, test data, etc. Empirical software engineering (ESE) has thrived mining all those artifacts to uncover the inner workings of software development and improve its practices. But which artifacts are studied in the field is a moving target, which we study empirically in this paper.

We perform a meta-analysis of software artifact mining studies published in top conferences in (empirical) software engineering, for a total of 9622 papers, which we analyze using natural language processing (NLP) techniques. We characterize quantitatively the types of software artifacts that are most often mined in those studies and their evolution over a 16-year period (2004-2020). We analyze the combinations of artifact types that are most often mined together, as well as the relationship between study purposes and mined artifacts.
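To make the kind of analysis described above concrete, here is a minimal sketch of counting artifact-type mentions and their co-occurrence across paper texts. It is an illustration under stated assumptions, not the authors' pipeline: the keyword lists, the `artifacts_in` and `count_artifacts` helpers, and the sample texts are all hypothetical, and simple keyword matching stands in for whatever NLP techniques the paper actually uses.

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical keyword lists (assumption); the paper's real taxonomy is not given here.
ARTIFACT_KEYWORDS = {
    "source code": ["source code"],
    "vcs metadata": ["commit", "version control"],
    "bug reports": ["bug report", "issue tracker"],
    "mailing lists": ["mailing list"],
}


def artifacts_in(text):
    """Return the set of artifact types whose keywords appear in the text."""
    lowered = text.lower()
    return {
        name
        for name, keywords in ARTIFACT_KEYWORDS.items()
        if any(re.search(r"\b" + re.escape(kw), lowered) for kw in keywords)
    }


def count_artifacts(papers):
    """Count per-type mentions and pairwise co-occurrences, once per paper."""
    single = Counter()
    pairs = Counter()
    for text in papers:
        found = artifacts_in(text)
        single.update(found)
        # Sort so each unordered pair is always recorded under the same key.
        pairs.update(combinations(sorted(found), 2))
    return single, pairs


# Made-up sample texts, for illustration only.
papers = [
    "We mine commits and bug reports from 100 projects.",
    "An analysis of source code and commit history.",
    "Mailing list discussions in open source communities.",
]
single, pairs = count_artifacts(papers)
print(single.most_common())
print(pairs.most_common())
```

Grouping the same counts by publication year would give a rough view of how artifact popularity evolves over time, which is the trend the abstract refers to.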

We discuss the implications of our findings to inform research policy decisions about study repeatability and the production of open datasets to enable future studies in the field.

Zeinab Abou Khalil

Inria

Stefano Zacchiroli

Télécom Paris, Polytechnic Institute of Paris

France