Characterizing the Usage of CI Tools in ML Projects
Background: Continuous Integration (CI) has become a widely adopted software development practice that enables faster integration of code changes and better software maintenance. At the same time, software applications increasingly use Machine Learning (ML) to handle real-world scenarios, such as autonomous driving, that they previously could not address. ML projects employ development processes different from those of traditional software projects, but they also require multiple iterations to integrate new functionality and improve their quality, and thus may benefit from CI practices.
Aims: While many works cover CI in traditional software, none has empirically explored the adoption of CI and its associated failures and errors in the context of ML projects’ development. To address this knowledge gap, we performed an empirical analysis to compare CI adoption between ML projects and Non-ML projects on GitHub.
Method: We developed TraVanalyzer, the first Travis CI configuration analyzer, to analyze the different CI adoption practices in ML projects, and also developed a CI log analyzer to identify different types of CI problems in ML projects.
Results: We found that Travis CI is the most popular CI tool for ML projects and that CI adoption in ML projects generally lags behind that of Non-ML projects, but that ML projects that adopted CI used it for building, testing, code analysis, and automatic deployment more than Non-ML projects did. We also found that only 24.6% of Travis-using ML projects adopted automated deployment, and that the majority of them perform their testing in CI using traditional unit testing frameworks, even though ML testing differs from regular unit testing. Furthermore, while CI in ML projects is as likely to experience problems as CI in Non-ML projects, its build breakages have more varied causes. Yet the most frequent CI failures of ML projects are testing-related problems, such as unit test failures due to exceptions and test misconfiguration, similar to the CI failures of Non-ML and OSS projects.
Conclusion: To the best of our knowledge, this is the first work to analyze ML projects’ CI usage, practices, and issues, to contextualize its results by comparing them with similar Non-ML projects, and to provide findings that help researchers and ML developers identify possible issues and areas for improvement for CI in ML projects.
Fri 23 Sep (displayed time zone: Athens)
11:00 - 12:30 | Session 4A - DevOps & Development Approaches | ESEM Emerging Results and Vision Papers / ESEM Technical Papers, at Bysa | Chair(s): Marcela Fabiana Genero Bocco (University of Castilla-La Mancha)
11:00 | 20m | Full-paper | Characterizing the Usage of CI Tools in ML Projects | ESEM Technical Papers | Dhia Elhaq Rzig (University of Michigan - Dearborn), Foyzul Hassan (University of Michigan - Dearborn), Chetan Bansal (Microsoft Research), Nachiappan Nagappan (Microsoft Research)
11:20 | 20m | Full-paper | Investigating the Impact of Continuous Integration Practices on the Productivity and Quality of Open-Source Projects | ESEM Technical Papers | Jadson Santos (Universidade Federal do Rio Grande do Norte), Daniel Alencar Da Costa (University of Otago), Uirá Kulesza (Federal University of Rio Grande do Norte)
11:40 | 20m | Full-paper | Identifying Source Code File Experts | ESEM Technical Papers | Otávio Cury da Costa Castro (Federal University of Piaui), Guilherme Amaral Avelino (Federal University of Piaui), Pedro A. Santos Neto (LOST/UFPI), Ricardo Britto (Ericsson / Blekinge Institute of Technology), Marco Tulio Valente (Federal University of Minas Gerais, Brazil)
12:00 | 15m | Vision and Emerging Results | DevOps Practitioners’ Perceptions of the Low-code Trend | ESEM Emerging Results and Vision Papers | Saima Rafi (University of Murcia), Muhammad Azeem Akbar (LUT University), Mary Sánchez-Gordón (Østfold University College), Ricardo Colomo-Palacios (Østfold University College)
12:15 | 15m | Vision and Emerging Results | A Preliminary Investigation of MLOps Practices in GitHub | ESEM Emerging Results and Vision Papers | Fabio Calefato (University of Bari), Filippo Lanubile (University of Bari), Luigi Quaranta (University of Bari, Italy)