Replay-Based Continual Learning for Test Case Prioritization
In large-scale Continuous Integration (CI) environments, ad hoc execution of regression tests can demand substantial time and resources, so Test Case Prioritization (TCP) is crucial for keeping regression testing efficient. TCP methods order test cases so that new code changes and their potential side effects are covered effectively and faults are detected as early as possible. Traditional prioritization processes use diverse data sources, including code coverage analysis, test execution history, and domain-specific features. However, heuristic-based or code coverage-driven prioritization techniques may not be accurate enough in a rapidly changing environment. For this reason, recent years have seen a significant shift towards Machine Learning (ML) techniques in TCP, which harness the vast and complex datasets generated by CI practices. ML-based TCP approaches integrate multifaceted test case features from various sources to improve prioritization accuracy. This trend reflects a broader movement towards data-driven decision-making in software testing: by tailoring test suites more effectively to the needs of each software build, it can significantly reduce the regression testing burden, saving time and resources while maintaining or improving software quality. Recent studies show that ML-based methods used in TCP can be categorized into four groups: Supervised Learning, Unsupervised Learning, Reinforcement Learning, and Natural Language Processing. Codebases of software projects can change rapidly, introducing new feature distributions into CI systems. We analyzed the CI and version control system (VCS) history data of a Java application, received from International Business Machines Corporation (IBM). The frequent inclusion of new test suites introduces new patterns into the dataset properties.
To keep up with these changes, ML models require frequent re-training on both old and new data to maintain high accuracy on new data. The dataset grows over time as more data comes in, so frequently re-training ML models on the whole dataset is computationally costly and requires extensive storage. Learning incrementally from non-stationary new data without requiring the old dataset can solve this TCP problem. Continual Learning (CL), also known as lifelong learning or incremental learning, adapts to changes without needing the old training samples. While CL has recently been studied in several works for different domains, we could not find effective research applying CL to the TCP domain. Given the dynamic environment of software testing, applying CL to industrial test case prioritization is critical for maintaining the efficiency and effectiveness of software testing processes. However, updating ML models on new datasets may introduce other problems, such as catastrophic forgetting, which can occur when the model is trained on a new distribution and its weights change drastically, degrading performance on previously learned data. Different strategies have been suggested to address catastrophic forgetting in CL. This abstract discusses integrating pre-training and replay-based continual learning methods to enhance test case prioritization. Pre-training-based continual learning leverages the strong representations obtained by pre-training models on a large dataset. This approach initializes the model with a broad understanding, and the model can then be trained incrementally to accommodate new tasks without significant performance loss on previous tasks. The dataset we obtained from IBM contains a few years of CI test execution and VCS data, so the model can be pre-trained on a large volume of data. Replay-based continual learning, on the other hand, involves retaining a small buffer of old training samples.
This strategy includes a small fraction of old samples with the new dataset while incrementally training the model, enabling it to maintain its performance on older tasks by reinforcing previous learnings. Integrating pre-training and replay-based methods was found to be most effective in the literature [1]. Pre-training provides a solid foundation of generic knowledge; replay-based methods complement this by continuously reinforcing past learnings, ensuring that adaptation to new tasks does not come at the expense of previously acquired knowledge. Several design choices determine how well this combined method works. The frequency of incremental training on new datasets is a significant design decision; it can be time-driven or property-driven, and experimental work will guide the choice. Next, in replay-based approaches, the memory buffer size, the number of old samples, and the criteria for old sample selection are some of the decision parameters. The small memory buffer also requires effective management in terms of data-retention strategies. Empirical evidence supports the effectiveness of this integrated approach. Hu et al. (2021) introduced prioritized experience replay in continual learning, emphasizing the selection of representative experiences to alleviate catastrophic forgetting [2]. Similarly, Merlin et al. (2022) provided practical recommendations for replay-based continual learning methods, highlighting the importance of memory size and data augmentation in enhancing performance [3]. We will conduct detailed investigations to find the best values for these decision parameters. For time-based frequency, we will experiment with different incremental training intervals, such as weekly, every ten or fifteen days, monthly, every three months, and every six months. Property-based triggers can include new test suite additions, significant changes in test suites, and an increase or decrease in the test case fail rate.
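As a rough illustration of the time-driven and property-driven choices, the following minimal Python sketch decides whether an incremental training run is due. The `WindowStats` fields, the function name `retrain_reasons`, and the default thresholds (15 days, a 0.05 change in fail rate) are all hypothetical placeholders chosen for illustration, not values from this work; the actual values are meant to be determined experimentally.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class WindowStats:
    """CI data summary since the last incremental training run (hypothetical fields)."""
    last_trained: date
    today: date
    new_test_suites: int       # test suites added since the last run
    fail_rate_before: float    # test case fail rate at the last run
    fail_rate_now: float       # test case fail rate in the current window


def retrain_reasons(stats: WindowStats,
                    interval_days: int = 15,
                    fail_rate_delta: float = 0.05) -> List[str]:
    """Return the triggers that fired; an empty list means no retraining is due.

    The default thresholds are assumptions for illustration only.
    """
    reasons = []
    # Time-driven trigger: a fixed interval has elapsed.
    if stats.today - stats.last_trained >= timedelta(days=interval_days):
        reasons.append("time")
    # Property-driven trigger: new test suites were added.
    if stats.new_test_suites > 0:
        reasons.append("new_suites")
    # Property-driven trigger: the fail rate rose or fell noticeably.
    if abs(stats.fail_rate_now - stats.fail_rate_before) >= fail_rate_delta:
        reasons.append("fail_rate_shift")
    return reasons
```

An incremental training run would then be started whenever the returned list is non-empty, and logging which trigger fired could help compare time-driven and property-driven schedules.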
Similarly, for the replay-based method, the samples can be selected from each incremental training dataset, and the selection can be random or property-based. For example, an even distribution of passed and failed samples can be selected to avoid overfitting. In conclusion, integrating pre-training and replay-based continual learning methods presents a promising research direction for enhancing large-scale test case prioritization in CI. Future research should explore different strategies to maximize the benefits of continual learning in test case prioritization.
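The replay sample selection described above could be sketched as follows. This is a minimal Python illustration under stated assumptions: each sample is a dictionary with a hypothetical `"verdict"` key holding `"pass"` or `"fail"`, and the function names `select_replay_samples` and `build_training_set` are invented for this example. It shows one property-based option (an even pass/fail split with random choice within each class), not the selection criteria of this work.

```python
import random
from typing import Dict, List

Sample = Dict[str, object]  # hypothetical record: features plus a "verdict" field


def select_replay_samples(old_samples: List[Sample],
                          buffer_size: int,
                          seed: int = 0) -> List[Sample]:
    """Pick an evenly split set of passed and failed old samples.

    The result may be smaller than buffer_size when one class is scarce,
    because the split is kept strictly even.
    """
    rng = random.Random(seed)
    passed = [s for s in old_samples if s["verdict"] == "pass"]
    failed = [s for s in old_samples if s["verdict"] == "fail"]
    per_class = min(buffer_size // 2, len(passed), len(failed))
    # Random selection within each class; the class balance is the property-based part.
    return rng.sample(passed, per_class) + rng.sample(failed, per_class)


def build_training_set(new_samples: List[Sample],
                       replay_buffer: List[Sample],
                       seed: int = 0) -> List[Sample]:
    """Mix the replayed old samples into the new data for one incremental run."""
    combined = list(new_samples) + list(replay_buffer)
    random.Random(seed).shuffle(combined)
    return combined
```

Keeping the split strictly even is one design choice among several; filling the remaining buffer slots from the majority class would use the memory budget fully at the cost of class balance.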
References
1. Liyuan Wang, Xingxing Zhang, Hang Su, and Jun Zhu. 2023. A Comprehensive Survey of Continual Learning: Theory, Method and Application.
2. Guannan Hu, Wu Zhang, and Wenhao Zhu. 2021. Prioritized Experience Replay for Continual Learning. https://doi.org/10.1109/ICCIA52886.2021.00011.
3. Gabriele Merlin, Vincenzo Lomonaco, Andrea Cossu, Antonio Carta, and D. Bacciu. 2022. Practical Recommendations for Replay-Based Continual Learning Methods. https://doi.org/10.1007/978-3-031-13324-4_47.
Tue 28 May. Displayed time zone: Eastern Time (US & Canada)
10:05, 25m, Talk: Replay-Based Continual Learning for Test Case Prioritization (CCIW). Asma Fariha, Akramul Azim, and Ramiro Liscano, Ontario Tech University.