An Empirical Study of Challenges in Machine Learning Asset Management
This program is tentative and subject to change.
[Context] In machine learning (ML) applications, assets include not only the ML models themselves, but also the datasets, algorithms, and deployment tools that are essential in the development, training, and implementation of these models. Efficient management of ML assets is critical to ensure optimal resource utilization, consistent model performance, and a streamlined ML development lifecycle. This practice contributes to faster iterations, adaptability, reduced time from model development to deployment, and the delivery of reliable and timely outputs.

[Objective] Despite research on ML asset management, there is still a significant knowledge gap on the operational challenges faced by asset management tool users, such as model versioning, data traceability, and collaboration issues. These challenges are crucial because they can directly impact the efficiency, reproducibility, and overall success of machine learning projects. Our study aims to bridge this empirical gap by analyzing user experience, feedback, and needs from Q&A posts, shedding light on the real-world challenges practitioners face and the solutions they have found.

[Method] We examine 15,065 Q&A posts from multiple developer discussion platforms, including Stack Overflow, tool-specific forums, and GitHub/GitLab. Using a mixed-method approach, we classify the posts into knowledge inquiries and problem inquiries. We then apply BERTopic to extract challenge topics and compare their prevalence. Finally, we use the open card sorting approach to summarize solutions from solved inquiries, cluster them with BERTopic, and analyze the relationship between challenges and solutions.

[Results] We identify 133 distinct topics in ML asset management-related inquiries, grouped into 16 macro-topics, with software environment and dependency, model deployment and service, and model creation and training emerging as the most discussed. Additionally, we identify 79 distinct solution topics, classified under 18 macro-topics, with software environment and dependency, feature and component development, and file and directory management as the most proposed.

[Conclusions] This study highlights critical areas within ML asset management that need further exploration, particularly the prevalent macro-topics identified as pain points for ML practitioners, and emphasizes the need for collaborative efforts between academia, industry, and the broader research community.
Thu 1 May (displayed time zone: Eastern Time, US & Canada)
14:00 - 15:30

14:00 15m Talk | Dissecting Global Search: A Simple yet Effective Method to Boost Individual Discrimination Testing and Repair | Research Track | Lili Quan (Tianjin University), Li Tianlin (NTU), Xiaofei Xie (Singapore Management University), Zhenpeng Chen (Nanyang Technological University), Sen Chen (Tianjin University), Lingxiao Jiang (Singapore Management University), Xiaohong Li (Tianjin University) | Pre-print

14:15 15m Talk | FixDrive: Automatically Repairing Autonomous Vehicle Driving Behaviour for $0.08 per Violation | Research Track | Yang Sun (Singapore Management University), Chris Poskitt (Singapore Management University), Kun Wang (Zhejiang University), Jun Sun (Singapore Management University) | Pre-print

14:30 15m Talk | MARQ: Engineering Mission-Critical AI-based Software with Automated Result Quality Adaptation | Research Track | Uwe Gropengießer (Technical University of Darmstadt), Elias Dietz (Technical University of Darmstadt), Florian Brandherm (Technical University of Darmstadt), Achref Doula (Technical University of Darmstadt), Osama Abboud (Munich Research Center, Huawei), Xun Xiao (Munich Research Center, Huawei), Max Mühlhäuser (Technical University of Darmstadt)

14:45 15m Talk | An Empirical Study of Challenges in Machine Learning Asset Management | Journal-first Papers | Zhimin Zhao (Queen's University), Yihao Chen (Queen's University), Abdul Ali Bangash (Software Analysis and Intelligence Lab (SAIL), Queen's University, Canada), Bram Adams (Queen's University), Ahmed E. Hassan (Queen's University)

15:00 15m Talk | A Reference Model for Empirically Comparing LLMs with Humans | SE in Society (SEIS) | Kurt Schneider (Leibniz Universität Hannover, Software Engineering Group), Farnaz Fotrousi (Chalmers University of Technology and University of Gothenburg), Rebekka Wohlrab (Chalmers University of Technology)

15:15 7m Talk | Building Domain-Specific Machine Learning Workflows: A Conceptual Framework for the State-of-the-Practice | Journal-first Papers | Bentley Oakes (Polytechnique Montréal), Michalis Famelis (Université de Montréal), Houari Sahraoui (DIRO, Université de Montréal)