CrossProbe: LLM-empowered Cross-Project Bug Detection for Deep Learning Frameworks
Deep Learning (DL) models can expose reliability issues in the underlying DL frameworks. These frameworks are prone to bugs that can lead to crashes or incorrect results, particularly when handling complex model architectures and substantial computational demands. Such framework bugs can disrupt DL applications, degrading customer experience and potentially causing financial losses. Traditional approaches to testing DL frameworks struggle to cover the vast search space of model structures, diverse APIs, and the complexity of hybrid programming and hardware environments. Recent advances using Large Language Models (LLMs) have improved DL framework fuzzing, but their efficacy depends heavily on the quality and diversity of input prompts, which are often constructed from single-framework data.
In this paper, we propose an innovative approach for enhancing test generation for DL frameworks by leveraging “mirroring issues”—analogous bugs identified across different frameworks with common functionalities. Our approach is inspired by the observation that DL frameworks, such as PyTorch and TensorFlow, often share common bugs due to shared dependencies, developer errors, or edge-case inputs. We develop CrossProbe, which uses LLMs to learn from existing issues of one framework and transfer the acquired knowledge to generate test cases that find mirroring issues in another framework, thus enabling cross-framework bug detection. To overcome the challenges of test case generation arising from incompatible functionalities and differing implementations between frameworks, we introduce three processes: alignment, screening, and distinction. These processes mitigate transfer errors by establishing API pair databases, filtering unsuitable cases, and highlighting cross-framework distinctions. Experiments demonstrate that CrossProbe is efficient, reducing generation iterations by 36.3%, and achieves a 25.0% higher success rate in issue transfer than existing state-of-the-art LLM-based testing techniques. CrossProbe detects 24 unique bugs using its transferred knowledge. Of these, 19 were previously unknown, and each requires cross-framework deep learning knowledge to identify.
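The three processes above can be illustrated with a minimal Python sketch. Everything here is hypothetical: the tiny API-pair database, the distinction notes, and the function names are invented for illustration and are not CrossProbe's actual data or implementation.

```python
# Illustrative sketch of the three transfer processes (alignment,
# screening, distinction). The API-pair database and notes below are
# hypothetical examples, not CrossProbe's real contents.

# Alignment: a database of functionally equivalent API pairs between
# the source framework (PyTorch) and the target (TensorFlow).
API_PAIRS = {
    "torch.clamp": "tf.clip_by_value",
    "torch.cat": "tf.concat",
    "torch.nn.functional.softmax": "tf.nn.softmax",
}

def align(source_api):
    """Map an API from a known source-framework issue to its
    target-framework counterpart, or None if no pair is known."""
    return API_PAIRS.get(source_api)

def screen(source_api):
    """Screening: filter out issues whose APIs have no counterpart,
    since they cannot yield a mirroring test case in the target."""
    return align(source_api) is not None

def distinguish(source_api):
    """Distinction: surface cross-framework differences that the
    generation prompt should highlight (hypothetical example)."""
    notes = {
        "torch.cat": ("tf.concat takes an axis keyword, whereas "
                      "torch.cat's corresponding argument is dim."),
    }
    return notes.get(source_api, "no known distinction")

def build_prompt(issue_api):
    """Assemble an LLM prompt for transferring one known issue."""
    if not screen(issue_api):
        return None  # unsuitable case, filtered out
    return (f"A known issue involves {issue_api}. Generate a "
            f"TensorFlow test case exercising {align(issue_api)}. "
            f"Note: {distinguish(issue_api)}")

print(build_prompt("torch.cat"))
```

A case that survives screening produces a prompt carrying both the aligned target API and the distinction note, while unpaired APIs are dropped before any LLM call is spent on them.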
Thu 26 Jun. Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.
11:00 - 12:15

11:00 (25m, Talk) Enhanced Prompting Framework for Code Summarization with Large Language Models. Research Papers. Minying Fang, Xing Yuan, Yuying Li, Haojie Li (Qingdao University of Science and Technology), Chunrong Fang (Nanjing University), Junwei Du (Qingdao University of Science and Technology)
11:25 (25m, Talk) CrossProbe: LLM-empowered Cross-Project Bug Detection for Deep Learning Frameworks. Research Papers. Hao Guan (University of Queensland; Southern University of Science and Technology), Guangdong Bai (University of Queensland), Yepang Liu (Southern University of Science and Technology)
11:50 (25m, Talk) Safe4U: Identifying Unsound Safe Encapsulations of Unsafe Calls in Rust using LLMs. Research Papers. Huan Li, Bei Wang (Zhejiang University, China), Xing Hu, Xin Xia (Zhejiang University)
This is the main event hall of Clarion Hotel, which will be used to host keynote talks and other plenary sessions. The FSE and ISSTA banquets will also happen in this room.
The room is just in front of the registration desk, on the other side of the main conference area. The two large doors with numbers “1” and “2” provide access to the Cosmos Hall.