Adaptive Performance Regression Detection via Semi-Supervised Siamese Learning
Timely detection of performance regressions is critical to the stability and user experience of software systems. Traditional methods often rely on high-quality annotated data or on assumptions about the data distribution, and therefore cannot adapt effectively to performance changes under dynamic workloads. To address this problem, we propose DynamicRegress, a performance regression detection method based on a Siamese network and semi-supervised learning. DynamicRegress integrates multi-dimensional key performance indicators (KPIs) with workload context to characterize system states accurately and detect performance regressions in real time. Its two LSTM branches share weights, which reduces training complexity while retaining strong feature extraction capability. Data augmentation and a weighted loss function strengthen the learning of minority regression cases, mitigating class imbalance. In addition, a semi-supervised learning strategy generates high-quality pseudo-labels to expand the training dataset, addressing the challenge of limited labeled data. Experiments on production data from a top-tier global cloud service provider show that DynamicRegress achieves an F1 score of 0.958, exceeding the best baseline method by 0.282, while maintaining a low detection latency of 0.006 seconds per KPI pair. DynamicRegress thus provides a robust, adaptive solution for performance regression detection in dynamic and complex software systems, and we have made the code publicly available to facilitate further research.
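To make two of the ingredients named in the abstract concrete, the following is a minimal illustrative sketch of a class-weighted loss and of confidence-based pseudo-label selection. It is not the DynamicRegress implementation: the abstract does not specify the loss form, the weighting, or the pseudo-label criterion, so the binary cross-entropy formulation, the `pos_weight` parameter, the 0.95 threshold, and the function names `weighted_bce` and `select_pseudo_labels` are all assumptions made for illustration.

```python
import math


def weighted_bce(probs, labels, pos_weight=1.0):
    """Mean binary cross-entropy with the positive class up-weighted.

    Assumption: label 1 marks a regression (the minority class) and
    pos_weight > 1 makes a missed regression cost more than a false alarm.
    """
    eps = 1e-12  # clamp probabilities to avoid log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def select_pseudo_labels(probs, threshold=0.95):
    """Return (index, label) pairs for confidently predicted unlabeled samples.

    Assumption: a sample is pseudo-labeled only when the predicted regression
    probability is at least `threshold` (label 1) or at most
    1 - threshold (label 0); everything in between stays unlabeled.
    """
    selected = []
    for i, p in enumerate(probs):
        if p >= threshold:
            selected.append((i, 1))
        elif p <= 1.0 - threshold:
            selected.append((i, 0))
    return selected


if __name__ == "__main__":
    # Hypothetical model outputs for four unlabeled KPI pairs.
    unlabeled_probs = [0.99, 0.50, 0.02, 0.90]
    print(select_pseudo_labels(unlabeled_probs))
    # One missed regression and one correct normal case, with and without weighting.
    print(weighted_bce([0.1, 0.1], [1, 0], pos_weight=1.0))
    print(weighted_bce([0.1, 0.1], [1, 0], pos_weight=5.0))
```

In a setup like this, the selected pairs would be appended to the labeled set before the next training round, and a stricter threshold trades fewer pseudo-labels for cleaner ones.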