Microservices Performance Testing with Causality-enhanced Large Language Models
This program is tentative and subject to change.
Efficient performance testing for microservices is essential for engineers to ensure that deviations of performance and resource usage metrics from expectations are promptly identified within the rapid release cycles of microservice development. To this end, engineers would need to explore the space of possible workload configurations and focus only on the most critical ones, namely those that unexpectedly cause performance issues, such as low-load configurations. This requires huge effort and can be infeasible within short release cycles. We present CALLMIT, a framework using Large Language Models (LLMs) enhanced by causal reasoning to automatically generate critical workloads for microservices performance testing. Engineers query CALLMIT to generate workload configurations expected to expose deviations from performance requirements, so that only tests triggering critical configurations are actually run. We present an experimental evaluation on three subjects, with a comparison against a conventional Retrieval-Augmented Generation technique. The results show that causal models improve the LLM's ability to correctly identify performance-critical workload configurations.
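As an illustration of the workflow the abstract describes, the sketch below shows how a causal model could be serialized into an LLM prompt that asks for workload configurations likely to violate a performance requirement. This is a minimal, hypothetical sketch: the service names, metrics, thresholds, and prompt wording are assumptions for illustration, not the actual CALLMIT interface or prompts.

```python
# Hypothetical sketch: prompting an LLM with a causal-model summary to propose
# workload configurations likely to violate a latency requirement.
# All names (parameters, metrics, thresholds) are illustrative, not from CALLMIT.

import json

# Illustrative causal relationships, e.g. mined from past test runs:
# each edge states how a workload parameter influences a performance metric.
CAUSAL_EDGES = [
    {"cause": "users_per_second", "effect": "p95_latency_ms", "direction": "positive"},
    {"cause": "payload_size_kb", "effect": "cpu_usage_pct", "direction": "positive"},
    {"cause": "cpu_usage_pct", "effect": "p95_latency_ms", "direction": "positive"},
]

REQUIREMENT = "p95 latency of the checkout service must stay below 300 ms"


def build_prompt(causal_edges, requirement, n_configs=5):
    """Assemble a prompt that grounds the LLM in the causal model before
    asking for candidate workload configurations to test."""
    edges_text = "\n".join(
        f"- {e['cause']} has a {e['direction']} effect on {e['effect']}"
        for e in causal_edges
    )
    return (
        "You are assisting with microservices performance testing.\n"
        f"Known causal relationships:\n{edges_text}\n"
        f"Performance requirement: {requirement}\n"
        f"Propose {n_configs} workload configurations (users_per_second, "
        "payload_size_kb) most likely to violate the requirement. "
        "Answer as a JSON list of objects."
    )


def parse_configs(llm_answer):
    """Parse the JSON list of candidate configurations returned by the model,
    ready to be passed to a load-testing tool."""
    return json.loads(llm_answer)


if __name__ == "__main__":
    print(build_prompt(CAUSAL_EDGES, REQUIREMENT))
```

The design point sketched here is that the prompt is grounded in causal edges rather than only in retrieved documents, which is what distinguishes the causality-enhanced approach from the conventional RAG baseline the paper compares against.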
Sun 27 Apr | Displayed time zone: Eastern Time (US & Canada)
16:00 - 17:30
16:00 | 12m | Long-paper | Augmenting Large Language Models with Static Code Analysis for Automated Code Quality Improvements | Research Papers
16:12 | 12m | Long-paper | Benchmarking Prompt Engineering Techniques for Secure Code Generation with GPT Models | Research Papers | Marc Bruni (University of Applied Sciences and Arts Northwestern Switzerland), Fabio Gabrielli (University of Applied Sciences and Arts Northwestern Switzerland), Mohammad Ghafari (TU Clausthal), Martin Kropp (University of Applied Sciences and Arts Northwestern Switzerland) | Pre-print
16:24 | 12m | Long-paper | ELDetector: An Automated Approach Detecting Endless-loop in Mini Programs | Research Papers | Nan Hu (Xi'an Jiaotong University), Ming Fan (Xi'an Jiaotong University), Jingyi Lei (Xi'an Jiaotong University), Jiaying He (Xi'an Jiaotong University), Zhe Hou (China Mobile System Integration Co.)
16:36 | 12m | Long-paper | Testing Android Third Party Libraries with LLMs to Detect Incompatible APIs | Research Papers | Tarek Mahmud (Texas State University), Bin Duan (University of Queensland), Meiru Che (Central Queensland University), Anne Ngu (Texas State University), Guowei Yang (University of Queensland)
16:48 | 12m | Long-paper | Vulnerability-Triggering Test Case Generation from Third-Party Libraries | Research Papers | Yi Gao (Zhejiang University), Xing Hu (Zhejiang University), Zirui Chen, Tongtong Xu (Nanjing University), Xiaohu Yang (Zhejiang University)
17:00 | 6m | Short-paper | Microservices Performance Testing with Causality-enhanced Large Language Models | Research Papers | Cristian Mascia (University of Naples Federico II), Roberto Pietrantuono (Università di Napoli Federico II), Antonio Guerriero (Università di Napoli Federico II), Luca Giamattei (Università di Napoli Federico II), Stefano Russo (Università di Napoli Federico II)
17:06 | 6m | Short-paper | MaRV: A Manually Validated Refactoring Dataset | Data and Benchmarking | Henrique Gomes Nunes (Universidade Federal de Minas Gerais), Tushar Sharma (Dalhousie University), Eduardo Figueiredo (Federal University of Minas Gerais)
17:12 | 6m | Short-paper | PyResBugs: A Dataset of Residual Python Bugs for Natural Language-Driven Fault Injection | Data and Benchmarking | Domenico Cotroneo (University of Naples Federico II), Giuseppe De Rosa (University of Naples Federico II), Pietro Liguori (University of Naples Federico II)
17:18 | 6m | Short-paper | The Heap: A Contamination-Free Multilingual Code Dataset for Evaluating Large Language Models | Data and Benchmarking | Jonathan Katzy (Delft University of Technology), Răzvan Mihai Popescu (Delft University of Technology), Arie van Deursen (TU Delft), Maliheh Izadi (Delft University of Technology)