EASE 2024
Tue 18 - Fri 21 June 2024 Salerno, Italy

Motivation is an important factor in software development. However, it is a subjective concept that is hard to quantify and study empirically. As a result, the wealth of data about real software development projects in repositories such as GitHub appears to be unusable for studying motivation. We present a new methodology to overcome this difficulty, based on the use of labeling functions. A labeling function is a validated heuristic, computable on a dataset, that need only be better than a guess. We define four labeling functions for motivation, one of which is working in diverse hours of the day, and show that they indeed correlate with motivation. We then apply them to more than 150,000 developers working on GitHub projects. This enables us to characterize and compare the behaviors of developers who are more motivated and those who are less so. The results indicate that the effect of motivation is indeed large, and they allow us to build a model to predict developer retention in a project.
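
To make the idea of a labeling function concrete, the following is a minimal Python sketch of what a "diverse hours of the day" heuristic might look like. It is not the authors' implementation: the input shape (pairs of developer name and commit timestamp), the function names, and the `min_hours` threshold of 8 are all assumptions chosen for illustration. The `precision_lift` helper shows one possible way to check that a heuristic is "better than a guess", by comparing its precision against the base rate in a small labeled sample.

```python
from collections import defaultdict
from datetime import datetime


def distinct_commit_hours(commits):
    """Map each developer to the set of hours (0-23) in which they committed.

    `commits` is an iterable of (developer, datetime) pairs (assumed shape).
    """
    hours = defaultdict(set)
    for developer, timestamp in commits:
        hours[developer].add(timestamp.hour)
    return dict(hours)


def diverse_hours_label(commits, min_hours=8):
    """Label a developer True when their commits span at least `min_hours`
    distinct hours of the day. The threshold is a hypothetical value."""
    return {
        developer: len(hours) >= min_hours
        for developer, hours in distinct_commit_hours(commits).items()
    }


def precision_lift(labels, ground_truth):
    """Ratio of the heuristic's precision to the base rate of positives.

    A value above 1.0 means the heuristic beats guessing on this sample.
    Returns None when the ratio is undefined.
    """
    common = [d for d in labels if d in ground_truth]
    flagged = [d for d in common if labels[d]]
    if not common or not flagged:
        return None
    base_rate = sum(bool(ground_truth[d]) for d in common) / len(common)
    if base_rate == 0:
        return None
    precision = sum(bool(ground_truth[d]) for d in flagged) / len(flagged)
    return precision / base_rate
```

In such a setup, the ground truth would come from a small validated sample (for example, survey answers), while the labeling function itself could then be computed for every developer in the repository data.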