Black-Box Models for Non-Functional Properties of AI Software SystemsResearch Paper
Non-functional properties (NFPs) such as latency, memory requirements, or hardware cost are an important characteristic of AI software systems, especially in the domain of resource-constrained embedded devices. Embedded AI products require sufficient resources for satisfactory latency and accuracy, but should also be cost-efficient and therefore not use more powerful hardware than strictly necessary. Traditionally, modeling and optimization efforts focus on the AI architecture, utilizing methods such as neural architecture search (NAS). However, before developers can start optimizing, they need to know which architectures are suitable candidates for their use case. To this end, architectures must be viewed in context: model post-processing (e.g. quantization), hardware platform, and run-time configuration such as batching all have significant effects on NFPs and therefore on AI architecture performance. Moreover, scalar parameters such as batch size cannot be benchmarked exhaustively. We argue that it is worthwhile to address this issue by means of black-box models before deciding on AI architectures for optimization and hardware/software platforms for inference. To support our claim, we present an AI product line with variable hardware and software components, perform benchmarks, and present notable results. Additionally, we evaluate both compactness and generalization capabilities of regression tree-based modeling approaches from the machine learning and product line engineering communities. We find that linear model trees perform best: they can capture NFPs of known AI configurations with a mean error of up to \perc{13}, and can predict unseen configurations with a mean error of 10 to \perc{26}. We find linear model trees to be more compact and interpretable than other tree-based approaches.