Machine Learning (ML)-enabled systems are now ubiquitous. These systems rely on ML models that require frequent retraining, which can lead to unsustainable costs. To avoid superfluous retraining and optimize system performance in the long run, we introduce Ripple, a system that leverages probabilistic model checking to automate the decision of when to retrain the ML models employed by an ML-enabled system. The main challenge tackled by Ripple is estimating how the predictive quality of an ML model evolves over long-term horizons, both when it is retrained and when it is not. Ripple introduces Look-Ahead Adaptation Impact Predictors (LA-AIPs), which are used in combination with a probabilistic model checker to determine whether to retrain the ML model. This allows Ripple to optimize system performance in the long term, contributing to more sustainable ML-enabled systems. We demonstrate Ripple’s feasibility on a fraud detection use case, showcasing its ability to plan for the long term and to account for retraining latency, improving over myopic adaptation approaches.
Maria Casimiro (INESC-ID, IST, University of Lisbon & S3D, Carnegie Mellon University); Valentim Romão (INESC-ID, Instituto Superior Técnico, Universidade de Lisboa); Paolo Romano (University of Lisbon, Portugal); Luis Rodrigues (INESC-ID, IST, ULisboa); David Garlan (Carnegie Mellon University)