"Project smells" — Experiences in Analysing the Software Quality of ML Projects with mllint
Thu 12 May 2022 05:20 - 05:25 at ICSE room 5-odd hours - Tools and Environments 1 Chair(s): Timo Kehrer
Fri 27 May 2022 09:15 - 09:20 at Room 306+307 - Papers 18: Recommender Systems, tools and environments Chair(s): Christian Bird
Fri 27 May 2022 13:30 - 15:00 at Ballroom Gallery - Posters 3
Machine Learning (ML) projects incur novel challenges in their development and productionisation over traditional software applications, though established principles and best practices in ensuring the project’s software quality still apply. While using static analysis to catch code smells has been shown to improve software quality attributes, it is only a small piece of the software quality puzzle, especially in the case of ML projects, given their additional challenges and the lower degree of Software Engineering (SE) experience among the data scientists who develop them. We introduce the novel concept of project smells, which consider deficits in project management as a more holistic perspective on software quality in ML projects. An open-source static analysis tool, mllint, was also implemented to help detect and mitigate these. Our research evaluates this novel concept of project smells in the industrial context of ING, a global bank and large software- and data-intensive organisation. We also investigate the perceived importance of these project smells for proof-of-concept versus production-ready ML projects, as well as the perceived obstructions and benefits to using static analysis tools such as mllint. Our findings indicate a need for context-aware static analysis tools that fit the needs of the project at its current stage of development, while requiring minimal configuration effort from the user.
Tue 10 May (displayed time zone: Eastern Time, US & Canada)
| 11:00 - 12:00 | Tools and Environments 4 (NIER - New Ideas and Emerging Results / Technical Track / SEIP - Software Engineering in Practice) at ICSE room 5-odd hours. Chair(s): Guido Salvaneschi, University of St. Gallen |
| 11:00 (5m talk) | Towards Property-Based Tests in Natural Language. NIER - New Ideas and Emerging Results. Colin Gordon, Drexel University |
| 11:05 (5m talk) | Using a Semantic Knowledge Base to Improve the Management of Security Reports in Industrial DevOps Projects. SEIP - Software Engineering in Practice |
| 11:10 (5m talk) | What's bothering developers in code review? SEIP - Software Engineering in Practice. Emma Söderberg, Lund University; Luke Church, University of Cambridge / Lund University / Lark Systems; Jürgen Börstler, Blekinge Institute of Technology; Diederick Niehorster, Lund University; Christofer Rydenfält, Lund University |
| 11:15 (5m talk) | "Project smells" — Experiences in Analysing the Software Quality of ML Projects with mllint. SEIP - Software Engineering in Practice. Bart van Oort, Delft University of Technology; Luís Cruz, Delft University of Technology; Babak Loni, ING Bank N.V.; Arie van Deursen, Delft University of Technology, Netherlands |
| 11:20 (5m talk) | Discovering Repetitive Code Changes in Python ML Systems. Technical Track. Malinda Dilhara, University of Colorado Boulder, USA; Ameya Ketkar, Oregon State University, USA; Nikhith Sannidhi, University of Colorado Boulder; Danny Dig, University of Colorado Boulder, USA |
| 11:25 (5m talk) | OJXPerf: Featherlight Object Replica Detection for Java Programs. Technical Track. Bolun Li, North Carolina State University; Hao Xu, College of William and Mary; Qidong Zhao, North Carolina State University; Pengfei Su, University of California, Merced; Milind Chabbi, Scalable Machines Research; Shuyin Jiao, North Carolina State University; Xu Liu, North Carolina State University, Oak Ridge National Laboratory, USA |
Thu 12 May (displayed time zone: Eastern Time, US & Canada)
| 05:00 - 06:00 | Tools and Environments 1 (Technical Track / SEIP - Software Engineering in Practice / NIER - New Ideas and Emerging Results) at ICSE room 5-odd hours. Chair(s): Timo Kehrer, University of Bern |
| 05:00 (5m talk) | MLSmellHound: A Context-Aware Code Analysis Tool. NIER - New Ideas and Emerging Results. Jai Kannan, Deakin University; Scott Barnett, Deakin University; Anj Simmons, Deakin University; Luís Cruz, Delft University of Technology; Akash Agarwal, Deakin University |
| 05:05 (5m talk) | A Unified Code Review Automation for Large-scale Industry with Diverse Development Environments. SEIP - Software Engineering in Practice. Hyungjin Kim, Samsung Research, Samsung Electronics; Yonghwi Kwon, Samsung Research, Samsung Electronics; Hyukin Kwon, Samsung Research, Samsung Electronics; Yeonhee Ryou, Samsung Research, Samsung Electronics; Sangwoo Joh, Samsung Research, Samsung Electronics; Taeksu Kim, Samsung Research, Samsung Electronics; Chul-Joo Kim, Samsung Research, Samsung Electronics |
| 05:10 (5m talk) | Using a Semantic Knowledge Base to Improve the Management of Security Reports in Industrial DevOps Projects. SEIP - Software Engineering in Practice |
| 05:15 (5m talk) | What's bothering developers in code review? SEIP - Software Engineering in Practice. Emma Söderberg, Lund University; Luke Church, University of Cambridge / Lund University / Lark Systems; Jürgen Börstler, Blekinge Institute of Technology; Diederick Niehorster, Lund University; Christofer Rydenfält, Lund University |
| 05:20 (5m talk) | "Project smells" — Experiences in Analysing the Software Quality of ML Projects with mllint. SEIP - Software Engineering in Practice. Bart van Oort, Delft University of Technology; Luís Cruz, Delft University of Technology; Babak Loni, ING Bank N.V.; Arie van Deursen, Delft University of Technology, Netherlands |
| 05:25 (5m talk) | FlakiMe: Laboratory-Controlled Test Flakiness Impact Assessment. Technical Track. Maxime Cordy, University of Luxembourg, Luxembourg; Renaud Rwemalika, University of Luxembourg; Adriano Franci, University of Luxembourg; Mike Papadakis, University of Luxembourg, Luxembourg; Mark Harman, University College London |
Fri 27 May (displayed time zone: Eastern Time, US & Canada)
| 09:00 - 10:30 | Papers 18: Recommender Systems, tools and environments (Technical Track / Journal-First Papers / NIER - New Ideas and Emerging Results / SEIP - Software Engineering in Practice) at Room 306+307. Chair(s): Christian Bird, Microsoft Research |
| 09:00 (5m talk) | Predicting the Objective and Priority of Issue Reports in Software Repositories. Journal-First Papers. Maliheh Izadi, Sharif University of Technology; Kiana Akbari, Sharif University of Technology; Abbas Heydarnoori, Sharif University of Technology |
| 09:05 (5m talk) | Using Deep Learning to Generate Complete Log Statements. Technical Track. Antonio Mastropaolo, Università della Svizzera italiana; Luca Pascarella, Università della Svizzera italiana (USI); Gabriele Bavota, Software Institute, USI Università della Svizzera italiana |
| 09:10 (5m talk) | Better Modeling the Programming World with Code Concept Graphs-augmented Multi-modal Learning. NIER - New Ideas and Emerging Results. Martin Weyssow, DIRO, Université de Montréal; Houari Sahraoui, Université de Montréal; Bang Liu, DIRO & Mila, Université de Montréal |
| 09:15 (5m talk) | "Project smells" — Experiences in Analysing the Software Quality of ML Projects with mllint. SEIP - Software Engineering in Practice. Bart van Oort, Delft University of Technology; Luís Cruz, Delft University of Technology; Babak Loni, ING Bank N.V.; Arie van Deursen, Delft University of Technology, Netherlands |
| 09:20 (5m talk) | Discovering Repetitive Code Changes in Python ML Systems. Technical Track. Malinda Dilhara, University of Colorado Boulder, USA; Ameya Ketkar, Oregon State University, USA; Nikhith Sannidhi, University of Colorado Boulder; Danny Dig, University of Colorado Boulder, USA |
| 09:25 (5m talk) | FlakiMe: Laboratory-Controlled Test Flakiness Impact Assessment. Technical Track. Maxime Cordy, University of Luxembourg, Luxembourg; Renaud Rwemalika, University of Luxembourg; Adriano Franci, University of Luxembourg; Mike Papadakis, University of Luxembourg, Luxembourg; Mark Harman, University College London |
| 09:30 (5m talk) | Semantic Image Fuzzing of AI Perception Systems. Technical Track. Trey Woodlief, University of Virginia; Sebastian Elbaum, University of Virginia; Kevin Sullivan, University of Virginia |
| 09:35 (5m talk) | Understanding and improving artifact sharing in software engineering research. Journal-First Papers. Christopher Steven Timperley, Carnegie Mellon University; Lauren Herckis, Carnegie Mellon University; Claire Le Goues, Carnegie Mellon University; Michael Hilton, Carnegie Mellon University, USA |
| 09:40 (5m talk) | ARCLIN: Automated API Mention Resolution for Unformatted Texts. Technical Track. Yintong Huo, The Chinese University of Hong Kong; Yuxin Su, Sun Yat-sen University; Hongming Zhang, The Hong Kong University of Science and Technology; Michael Lyu, The Chinese University of Hong Kong |

