MiSum: Multi-Modality Heterogeneous Code Graph Learning for Multi-Intent Binary Code Summarization
Current binary code summarization work mostly generates a single summary per function, which limits how much reverse engineers can learn from it and how useful it is to them. Existing approaches often fail to address users' varied needs, such as insight into usage patterns, implementation details, and design rationale, needs that have already been highlighted for source code summarization. This underscores the need for multi-intent binary code summarization as a way to make reverse engineering more effective. To address this gap, our basic observation is that the two types of information essential for binary code summarization (i.e., the assembly code and the pseudo code) complement each other well. Specifically, the low-level assembly code describes the execution logic in detail, whereas the higher-level pseudo code retains valuable contextual information. Based on this insight, we propose MiSum, a novel multi-modality heterogeneous code graph alignment and learning method that integrates information from both assembly code and pseudo code. MiSum introduces a unified multi-modality heterogeneous code graph (MM-HCG) that aligns the assembly code graph with the pseudo code graph and carries both low-level execution details and high-level structural information. To fuse the graph information, we propose MM-HCG heterogeneous graph learning with heterogeneous mutual attention and message passing, which focuses on important code blocks and discovers inter-dependencies between the different forms of code. We also propose an intent-aware summary generator with an intent-aware attention mechanism to produce customized summaries corresponding to multiple intents.
Extensive experiments, including evaluations across various architectures and optimization levels, demonstrate that MiSum outperforms state-of-the-art baselines in BLEU, METEOR, and ROUGE-L metrics. Human evaluations further validate its ability to effectively support reverse engineers in understanding diverse binary code intents, providing a significant advancement in the field of binary code analysis.
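As a rough illustration of the cross-modality attention and message passing idea mentioned in the abstract, the following minimal sketch lets each assembly node attend over the pseudo code nodes it is aligned with and adds the weighted message to its own embedding. This is not the authors' implementation: the function name `cross_modal_update`, the node identifiers, the toy embeddings, and the use of plain scaled dot-product attention without learned projections are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_update(asm_nodes, pseudo_nodes, edges):
    """One hypothetical round of assembly <- pseudo code message passing.

    asm_nodes / pseudo_nodes: dict mapping node id -> embedding (list of floats).
    edges: list of (asm_id, pseudo_id) alignment edges (assumed to be given).
    Returns new assembly embeddings; pseudo code embeddings are left unchanged.
    """
    updated = {}
    for asm_id, h in asm_nodes.items():
        neighbors = [p for a, p in edges if a == asm_id]
        if not neighbors:
            # No aligned pseudo code node: keep the embedding as is.
            updated[asm_id] = list(h)
            continue
        scale = math.sqrt(len(h))
        # Scaled dot-product score between this node and each aligned neighbor.
        scores = [
            sum(x * y for x, y in zip(h, pseudo_nodes[p])) / scale
            for p in neighbors
        ]
        weights = softmax(scores)
        # Attention-weighted sum of neighbor embeddings (the "message").
        message = [
            sum(w * pseudo_nodes[p][i] for w, p in zip(weights, neighbors))
            for i in range(len(h))
        ]
        # Residual update: original embedding plus the aggregated message.
        updated[asm_id] = [x + m for x, m in zip(h, message)]
    return updated


# Toy graph: two assembly basic blocks and two pseudo code statements.
asm = {"bb0": [1.0, 0.0], "bb1": [0.0, 1.0]}
pseudo = {"s0": [1.0, 0.0], "s1": [0.0, 1.0]}
alignment = [("bb0", "s0"), ("bb0", "s1"), ("bb1", "s1")]
new_asm = cross_modal_update(asm, pseudo, alignment)
```

In this toy graph, `bb0` places more weight on `s0` than on `s1` because their embeddings are more similar. A full heterogeneous graph model would additionally use learned, type-specific projections for each node and edge type, which this sketch omits.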
Mon 23 Jun
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
14:00 - 15:30 | Code Search | Research Papers / Journal First / Ideas, Visions and Reflections | at Aurora A | Chair(s): Xin Xia (Zhejiang University)
14:00 | 20m | Talk | 10 years later: revisiting how developers search for code | Research Papers | Kathryn Stolee (North Carolina State University), Tobias Welp (Google), Caitlin Sadowski, Sebastian Elbaum (University of Virginia)
14:20 | 20m | Talk | Approaching Code Search for Python as a Translation Retrieval Problem with Dual Encoders | Journal First
14:40 | 20m | Talk | Zero-Shot Cross-Domain Code Search without Fine-Tuning | Research Papers | Keyu Liang (Zhejiang University), Zhongxin Liu (Zhejiang University), Chao Liu (Chongqing University), Zhiyuan Wan (Zhejiang University), David Lo (Singapore Management University), Xiaohu Yang (Zhejiang University)
15:00 | 10m | Talk | Measuring What Matters: An Aggregate Metric for Assessing Enterprise Code Summaries | Ideas, Visions and Reflections | Ashita Saxena (IBM Research), Palanivel Kodeswaran (IBM Research India), Sayandeep Sen (IBM Research India), Srikanth Tamilselvam (IBM Research)
15:10 | 20m | Talk | MiSum: Multi-Modality Heterogeneous Code Graph Learning for Multi-Intent Binary Code Summarization | Research Papers | Kangchen Zhu (National University of Defense Technology), Zhiliang Tian (National University of Defense Technology), Shangwen Wang (National University of Defense Technology), Weiguo Chen (National University of Defense Technology), Zixuan Dong (National University of Defense Technology), Mingyue Leng (National University of Defense Technology), Xiaoguang Mao (National University of Defense Technology)
Aurora A is the first room in the Aurora wing.
When facing the main Cosmos Hall, access to the Aurora wing is on the right, close to the side entrance of the hotel.