In software engineering processes for machine learning (ML)-enabled systems, integrating and verifying ML components is a major challenge. A prerequisite is the specification of requirements for ML components, including their models and data, an area where traditional requirements engineering (RE) processes face new obstacles. An underexplored source of RE-relevant information in this context is ML documentation such as ModelCards and DataSheets. However, it is unclear to what extent RE-relevant information can be extracted from these documents. This study first investigates the amount and nature of RE-relevant information in 20 publicly available ModelCards and DataSheets. We show that these documents contain a significant amount of potentially RE-relevant information. Next, we evaluate how effectively three established RE representations (EARS, Rupp’s template, and Volere) can structure this knowledge into requirements. Our results demonstrate that there is a pathway for transforming ML-specific knowledge into structured requirements, and thereby for incorporating ML documentation into software engineering processes for ML-enabled systems.
Yi Peng (University of Gothenburg and Chalmers University of Technology), Hans-Martin Heyn (University of Gothenburg and Chalmers University of Technology), Jennifer Horkoff (Chalmers and the University of Gothenburg)
Anne Hess (Technical University of Applied Sciences Würzburg-Schweinfurt), Gerald Heller (Consultant and Trainer), Hartmut Schmitt (HK Business Solutions GmbH), Cornelia Seraphin (msg systems AG, Ismaning), Oliver Karras (TIB - Leibniz Information Centre for Science and Technology)