Can Mamba Be Better? An Experimental Evaluation of Mamba in Code Intelligence
This program is tentative and subject to change.
The Transformer architecture and its core attention mechanism form the foundation of Code Language Models (code LMs) and have driven their remarkable progress across a wide range of code intelligence tasks. However, the quadratic complexity inherent in the attention mechanism poses scalability challenges. Recently, subquadratic architectures such as Mamba and Mamba-2 have emerged as compelling alternatives to the Transformer. While they have shown promising results and attracted increasing academic interest, their effectiveness in code intelligence tasks remains unexplored.
To fill this gap, we present the first experimental evaluation of Mamba-based models on two representative code intelligence tasks: line-level code completion and code generation, delving into their effectiveness and efficiency. We begin by evaluating Mamba and Mamba-2 under both full Fine-Tuning (FT) and Parameter-Efficient Fine-Tuning (PEFT) settings. We further pre-train the models on code corpora to boost their capacity in code comprehension. Our results consistently show that Mamba-based models outperform their Transformer-based counterparts. Furthermore, to explore whether the observed superiority stems from the Mamba architecture and to mitigate the influence of varying pre-training datasets, we pre-train CodeGPT, Mamba, and Mamba-2 from scratch on identical code corpora. Our findings reveal that the Mamba-2 block achieves the highest capability in code modeling. In addition, Mamba-based blocks exhibit advantages in terms of memory efficiency. We also evaluate performance in low-resource scenarios and at larger model scales, where Mamba-based models demonstrate consistent robustness. This work provides a comprehensive investigation into Mamba-based models in the context of code intelligence, uncovering their strengths and promising potential for future applications.
Mon 17 Nov, 11:00 - 12:30 (displayed time zone: Seoul)
11:00 | 10m Talk | TensorGuard: Gradient-Based Model Fingerprinting for LLM Similarity Detection and Family Classification | Research Papers | Zehao Wu Huazhong University of Science and Technology, Yanjie Zhao Huazhong University of Science and Technology, Haoyu Wang Huazhong University of Science and Technology
11:10 | 10m Talk | Root Cause Analysis of RISC-V Build Failures via LLM and MCTS Reasoning | Research Papers | Weipeng Shuai Institute of Software, Chinese Academy of Sciences, Jie Liu Institute of Software, Chinese Academy of Sciences, Zhirou Ma Institute of Software, Chinese Academy of Sciences, Liangyi Kang Institute of Software, Chinese Academy of Sciences, Zehua Wang Institute of Software, Chinese Academy of Sciences, Shuai Wang Institute of Software, Chinese Academy of Sciences, Dan Ye Institute of Software at Chinese Academy of Sciences, Hui Li, Wei Wang Institute of Software at Chinese Academy of Sciences, Jiaxin Zhu Institute of Software at Chinese Academy of Sciences
11:20 | 10m Talk | An Empirical Study of Knowledge Transfer in AI Pair Programming | Research Papers | Alisa Carla Welter Saarland University, Niklas Schneider Saarland University, Tobias Dick Saarland University, Kallistos Weis Saarland University, Christof Tinnes Saarland University, Marvin Wyrich Saarland University, Sven Apel Saarland University
11:30 | 10m Talk | Efficient Understanding of Machine Learning Model Mispredictions | Research Papers | Martin Eberlein Humboldt-Universität zu Berlin, Jürgen Cito TU Wien, Lars Grunske Humboldt-Universität zu Berlin
11:40 | 10m Talk | Can Mamba Be Better? An Experimental Evaluation of Mamba in Code Intelligence | Research Papers | Shuo Liu City University of Hong Kong, Jacky Keung City University of Hong Kong, Zhen Yang Shandong University, Zhenyu Mao City University of Hong Kong, Yicheng Sun City University of Hong Kong
11:50 | 10m Talk | "My productivity is boosted, but ..." Demystifying Users’ Perception on AI Coding Assistants | Research Papers
12:00 | 10m Talk | HFUZZER: Testing Large Language Models for Package Hallucinations via Phrase-based Fuzzing | Research Papers | Yukai Zhao, Menghan Wu Zhejiang University, Xing Hu Zhejiang University, Xin Xia Zhejiang University
12:10 | 10m Talk | Provable Fairness Repair for Deep Neural Networks | Research Papers | Jianan Ma Hangzhou Dianzi University, China; Zhejiang University, Hangzhou, China, Jingyi Wang Zhejiang University, Qi Xuan Zhejiang University of Technology; Binjiang Institute of Artificial Intelligence, Zhen Wang Hangzhou Dianzi University, China
12:20 | 10m Talk | AutoAdapt: On the Application of AutoML for Parameter-Efficient Fine-Tuning of Pre-Trained Code Models | Journal-First Track | Amal Akli University of Luxembourg, Maxime Cordy University of Luxembourg, Luxembourg, Mike Papadakis University of Luxembourg, Yves Le Traon University of Luxembourg, Luxembourg