Characterising Algorithm Debt in Machine and Deep Learning Systems
Thu 1 May 2025 11:12 - 11:18 at 204 - ACM Student Research Presentations Chair(s): Md Tajmilur Rahman, Lola Burgueño
Technical Debt (TD) refers to the long-term costs incurred due to suboptimal decisions made for short-term gains. Algorithm Debt (AD) is a type of TD that arises from the suboptimal implementation of algorithms and can result in model degradation and poor scalability. Machine and Deep Learning (ML/DL) systems are particularly susceptible to AD due to the complexity of their methods and their dependence on large-scale data. Despite the significance of AD in ML/DL systems, its causes, effects, and mitigation strategies remain underexplored. To address this knowledge gap, we investigated the causes, effects, and mitigation strategies of AD in ML/DL systems. Using a mixed-methods approach, we conducted a systematic review of 44 primary studies and interviewed 21 ML/DL practitioners, complemented by an analysis of 65 questionnaire responses. We also conducted experiments with ML/DL models and their embeddings to empirically evaluate their performance in the automated detection of AD. Our findings suggest that the causes of AD can be categorised into three key areas: data quality issues, ML knowledge gaps, and model training. Our experiments further revealed that a Logistic Regression model trained with AD-specific features achieved the highest F1-score, 54%, outperforming the other techniques. This study contributes to software engineering by proposing a framework that outlines the causes, effects, and mitigation strategies of AD in ML/DL systems. The framework serves as a guideline for practitioners and provides insights for developing effective AD management strategies in ML/DL systems.
Iko-Ojo is a PhD student under the supervision of Dr. Chirath Hettiarachchi at the Australian National University, in the CECC School of Computing. He is co-supervised by Dr. Fatemeh Fard from the University of British Columbia, Prof. Alex Potanin, and Prof. Hanna Suominen. His main research interests are in Software Engineering, Technical Debt, and ML/DL. He is currently working on Algorithm Debt in ML/DL software (e.g., Deep Learning frameworks).
Tue 29 Apr. Displayed time zone: Eastern Time (US & Canada)
09:00 - 10:30 | ACM Student Research Posters and Judging Session 1 | SRC - ACM Student Research Competition | Canada Hall 3 Poster Area | Chair(s): Md Tajmilur Rahman (Gannon University)
09:00 | 90m Talk | Consistent Graph Model Generation with Large Language Models | SRC - ACM Student Research Competition | Boqi Chen | McGill University
09:00 | 90m Talk | Enhancing OSS Remediation with Patch Backporting | SRC - ACM Student Research Competition | Lyuye Zhang | Nanyang Technological University
09:00 | 90m Talk | Improving Formal Methods Visualizations | Formal Methods; SRC - ACM Student Research Competition | Avinash Palliyil | Georgia Institute of Technology
09:00 | 90m Talk | MUARF: Leveraging Multi-Agent Workflows for Automated Code Refactoring | SRC - ACM Student Research Competition | Yisen Xu | Software PErformance, Analysis, and Reliability (SPEAR) lab, Concordia University, Montreal, Canada
09:00 | 90m Talk | Identifying Performance-Sensitive Configurations in Software Systems with LLM-Driven Agents | SRC - ACM Student Research Competition | Zehao Wang | Concordia University
09:00 | 90m Talk | Characterising Algorithm Debt in Machine and Deep Learning Systems | SRC - ACM Student Research Competition | Emmanuel Iko-Ojo Simon | Australian National University
09:00 | 90m Talk | Automatic Fuzz Drivers for JavaScript with Type Distributions | SRC - ACM Student Research Competition | Mayant Mukul | University of British Columbia
Thu 1 May. Displayed time zone: Eastern Time (US & Canada)
11:00 - 12:30 | ACM Student Research Presentations | SRC - ACM Student Research Competition | 204 | Chair(s): Md Tajmilur Rahman, Lola Burgueño (University of Malaga)
A subset of finalist ACM SRC students will give short presentations in this session. The decision about who will present will be made after the poster sessions, and this schedule will be updated, so don’t rely on the precise timing until just before the session. All presenters also have posters in the Canada Hall 3 Poster Area, with judging on Tuesday. Awards will be announced at the banquet on Thursday evening.
11:00 | 6m Talk | Automatic Fuzz Drivers for JavaScript with Type Distributions | SRC - ACM Student Research Competition | Mayant Mukul | University of British Columbia
11:06 | 6m Talk | CASS: Context-Aware Slice Summarization for Debugging Regression Failures | SRC - ACM Student Research Competition | Sahar Badihi | University of British Columbia, Canada | Pre-print
11:12 | 6m Talk | Characterising Algorithm Debt in Machine and Deep Learning Systems | SRC - ACM Student Research Competition | Emmanuel Iko-Ojo Simon | Australian National University
11:18 | 6m Talk | Consistent Graph Model Generation with Large Language Models | SRC - ACM Student Research Competition | Boqi Chen | McGill University
11:24 | 6m Talk | Enhancing OSS Remediation with Patch Backporting | SRC - ACM Student Research Competition | Lyuye Zhang | Nanyang Technological University
11:30 | 6m Talk | Identifying Performance-Sensitive Configurations in Software Systems with LLM-Driven Agents | SRC - ACM Student Research Competition | Zehao Wang | Concordia University
11:36 | 6m Talk | Improving Formal Methods Visualizations | Formal Methods; SRC - ACM Student Research Competition | Avinash Palliyil | Georgia Institute of Technology
11:42 | 6m Talk | MUARF: Leveraging Multi-Agent Workflows for Automated Code Refactoring | SRC - ACM Student Research Competition | Yisen Xu | Software PErformance, Analysis, and Reliability (SPEAR) lab, Concordia University, Montreal, Canada
11:48 | 6m Talk | On the Automation of Code Review Tasks Through Cross-Task Knowledge Distillation | SRC - ACM Student Research Competition | Oussama Ben Sghaier | DIRO, Université de Montréal
11:54 | 6m Talk | On the Fly Input Refinement for Code Language Models | SRC - ACM Student Research Competition | Ravishka Shemal Rathnasuriya | University of Texas at Dallas
12:00 | 6m Talk | Program Feature-based Fuzzing Benchmarking | SRC - ACM Student Research Competition | Miao Miao | The University of Texas at Dallas
12:06 | 6m Talk | Revisiting SWE-Bench: On the Importance of Data Quality for LLM-based Code Models | SRC - ACM Student Research Competition | Reem Aleithan | York University, Canada
12:12 | 6m Talk | The Balancing Act of Policies in Developing Machine Learning Explanations | SRC - ACM Student Research Competition | Jacob Tjaden | Colby College
12:18 | 6m Talk | To Mock or Not to Mock: Divergence in Mocking Practices Between LLM and Developers | SRC - ACM Student Research Competition | Hanbin Qin | Stevens Institute of Technology
12:24 | 6m Talk | Towards Compatibly Mitigating Technical Lag in Maven Projects | SRC - ACM Student Research Competition | Rui Lu | East China Normal University