Aligning Data Debt with AI-Integrated Software Project Lifecycle Processes: A Standard-Based Mapping Approach
Artificial Intelligence (AI) technologies have become increasingly central to software development, enhancing efficiency with tools such as intelligent code assistants and driving innovations in products like chatbots, recommendation engines, and predictive analytics. Despite these advancements, the inherent complexity of AI-integrated software projects often leads to the accumulation of technical debt (TD), which can compromise the long-term reliability and sustainability of systems. One way to manage TD effectively in these projects is to adapt international standards. Although these standards are not designed for TD management, they can be applied systematically to detect and address TD by aligning it with AI system lifecycle processes. The aim of this study is to demonstrate how AI-related TD relates to the various AI lifecycle processes, thereby enabling systematic detection and management of TD in AI-integrated software projects. To achieve this, we studied 73 unique cases of TD, each reflecting either an instance or a root cause of data-related TD. These cases were mapped to the processes and activities outlined in the ISO/IEC 5338 AI Systems Lifecycle Processes standard. The accuracy of these mappings was then validated bidirectionally by a large language model and two domain experts. Our findings reveal that data-related TD categories are associated with a diverse range of processes, such as design definition, quality management, and human resource management, and that this debt tends to accumulate more heavily in certain areas of the AI lifecycle. This study serves as a proof of concept for developing a management approach for AI-related TD, and it adds to the body of knowledge on managing TD in AI projects by detailing how TD interacts with and affects various AI lifecycle processes.