A Multi-Modal Retrieval-Augmented Framework for Compiler Backend Generation with LLMs
Compiler backend development still relies heavily on manual effort, making it both time-consuming and labor-intensive. While large language models (LLMs) have shown strong capabilities in general code generation, their accuracy in generating backend functions remains limited. Directly using function descriptions as prompts often fails to bridge the gap between function semantics and implementation, resulting in low accuracy. Moreover, improving LLMs' accuracy on backend functions typically requires fine-tuning, which demands significant computational resources and is impractical for most backend developers. Although several AI-driven approaches for backend generation have emerged, their outputs still require extensive manual modification and remain dependent on fine-tuning LLMs.
In this paper, we propose MultiFork, a retrieval-augmented framework that integrates a multi-modal retriever with LLMs to enhance backend function generation. MultiFork encodes backend-specific attributes as graphs and combines them with function-level textual features to retrieve similar functions from existing backends. The retrieved functions are then used to construct few-shot prompts that guide LLMs in generating accurate target functions without requiring LLM fine-tuning. Experimental results show that MultiFork significantly improves function generation accuracy across six LLMs, all of which outperform a fine-tuned language model when combined with MultiFork. Moreover, MultiFork improves the accuracy of an existing AI-driven backend generation approach by up to 39.31% in terms of correct statements, further improving backend development efficiency. The artifact for this work is anonymously archived at Zenodo: https://zenodo.org/records/16876827.
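The abstract describes a retrieve-then-prompt pipeline: fuse a graph-based embedding of backend attributes with a textual embedding of the function, retrieve the most similar functions from existing backends, and assemble them into a few-shot prompt. The following is a minimal illustrative sketch of that flow, not the authors' implementation; the fusion weight `alpha`, the toy two-dimensional embeddings, and the prompt template are all hypothetical stand-ins.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def fuse(graph_vec, text_vec, alpha=0.5):
    """Combine the graph-modality and text-modality embeddings.
    A weighted sum is one simple choice; alpha is a hypothetical knob."""
    return [alpha * g + (1 - alpha) * t for g, t in zip(graph_vec, text_vec)]

def retrieve_top_k(query_vec, corpus, k=2):
    """corpus: list of (function_source, fused_embedding) pairs
    drawn from existing backends. Returns the k most similar sources."""
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [src for src, _ in ranked[:k]]

def build_few_shot_prompt(examples, target_description):
    """Assemble retrieved functions into a few-shot prompt for the LLM."""
    shots = "\n\n".join(f"### Reference implementation\n{e}" for e in examples)
    return (f"{shots}\n\n### Target function\n{target_description}\n"
            "### Implementation\n")
```

For example, with toy embeddings a query fused from a graph vector `[1, 0]` and a text vector `[0, 1]` retrieves the corpus function whose fused embedding points in the same direction, and that function becomes the in-context example for the LLM.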
Thu 19 Mar (displayed time zone: Athens)
11:00 - 12:30 | Session 4A - Code Representation and Analysis (Research Track / Tool Demo Track) at Panorama. Chair(s): Stefan Grintz (SAP)
12:00, 15m Talk | A Multi-Modal Retrieval-Augmented Framework for Compiler Backend Generation with LLMs (Research Track). Ming Zhong (SKLP, Institute of Computing Technology, CAS), Fang Lv (Institute of Computing Technology, Chinese Academy of Sciences), Hongna Geng, Xin Sun, Lulin Wang, Huimin Cui (Institute of Computing Technology, Chinese Academy of Sciences), Xiaobing Feng (ICT, CAS)