ISSTA 2025
Wed 25 - Sat 28 June 2025 Trondheim, Norway
co-located with FSE 2025
Fri 27 Jun 2025 11:00 - 11:25 at Cosmos 3C - LLM-based Testing 2 Chair(s): Jie M. Zhang

Binary code similarity analysis (BCSA) is a crucial research area in many fields, such as cybersecurity. Function-level diffing tools are the most widely used in BCSA: they match similar functions one by one to evaluate the similarity between binary programs (binaries). However, such methods have high time complexity, which makes them unscalable in large-scale scenarios (e.g., 1/n-to-n searching). Towards effective and efficient program-level BCSA, we propose KEENHash, a novel hashing approach that hashes binaries into program-level representations through large language model (LLM)-generated function embeddings. KEENHash condenses a binary into one compact, fixed-length program embedding using K-Means and Feature Hashing, enabling effective and efficient large-scale program-level BCSA that surpasses previous state-of-the-art methods. The experimental results show that KEENHash is 215 times faster than the state-of-the-art function matching tool while maintaining effectiveness. Furthermore, in a large-scale scenario with 5.3 billion similarity evaluations, KEENHash takes only 395.83 seconds, while the tool would take 56 days. We also evaluate KEENHash on program clone search for large-scale BCSA across extensive datasets comprising 202,305 binaries. Compared with 4 state-of-the-art methods, KEENHash outperforms all of them by at least 23.16% and shows remarkable superiority in the large-scale BCSA security scenario of malware detection.
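The abstract does not spell out how K-Means and Feature Hashing are combined, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each function embedding is assigned to its nearest pre-fitted K-Means centroid, and that the resulting bag of cluster IDs is folded into a fixed-length vector with the signed hashing trick. All names (`nearest_centroid`, `feature_hash`, `program_embedding`), the toy 2-D vectors, and the dimension of 16 are hypothetical.

```python
import hashlib
import math


def nearest_centroid(vec, centroids):
    """Return the index of the centroid closest to vec (squared Euclidean)."""
    best, best_dist = 0, float("inf")
    for i, c in enumerate(centroids):
        dist = sum((a - b) ** 2 for a, b in zip(vec, c))
        if dist < best_dist:
            best, best_dist = i, dist
    return best


def feature_hash(tokens, dim):
    """Signed hashing trick: fold a bag of string tokens into a dim-length vector."""
    out = [0.0] * dim
    for tok in tokens:
        h = int.from_bytes(hashlib.md5(tok.encode()).digest()[:8], "big")
        sign = 1.0 if (h >> 63) & 1 == 0 else -1.0  # top bit picks the sign
        out[h % dim] += sign                        # remainder picks the bucket
    return out


def program_embedding(function_embeddings, centroids, dim=16):
    """Assumed pipeline: cluster IDs of all functions, hashed to a fixed length."""
    tokens = [f"cluster:{nearest_centroid(v, centroids)}" for v in function_embeddings]
    return feature_hash(tokens, dim)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


# Toy data: three hypothetical centroids and two "binaries" whose functions
# are the same but appear in a different order.
centroids = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
binary_a = [[0.1, 0.0], [0.9, 1.1], [5.2, 4.8]]
binary_b = [[5.2, 4.8], [0.1, 0.0], [0.9, 1.1]]

emb_a = program_embedding(binary_a, centroids)
emb_b = program_embedding(binary_b, centroids)
similarity = cosine(emb_a, emb_b)
```

Under these assumptions, comparing two binaries costs a single fixed-length vector comparison regardless of how many functions each contains, which is the property that makes program-level search cheaper than matching functions one by one.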

Fri 27 Jun

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

11:00 - 12:15
LLM-based Testing 2 (Research Papers) at Cosmos 3C
Chair(s): Jie M. Zhang (King's College London)
11:00
25m
Talk
KEENHash: Hashing Programs into Function-aware Embeddings for Large-scale Binary Code Similarity Analysis
Research Papers
Zhijie Liu (ShanghaiTech University, China), Qiyi Tang (Tencent Security Keen Lab), Sen Nie (Tencent Security Keen Lab), Shi Wu (Tencent Security Keen Lab), Liangfeng Zhang (School of Information Science and Technology, ShanghaiTech University), Yutian Tang (University of Glasgow, United Kingdom)
11:25
25m
Talk
Porting Software Libraries to OpenHarmony: Transitioning from TypeScript or JavaScript to ArkTS
Research Papers
Bo Zhou (Northeastern University), Jiaqi Shi (Northeastern University), Ying Wang (Northeastern University), Li Li (Beihang University), Li Tsz On (The Hong Kong University of Science and Technology), Hai Yu (Northeastern University, China), Zhiliang Zhu (Northeastern University, China)
11:50
25m
Talk
STRUT: Structured Seed Case Guided Unit Test Generation for C Programs using LLMs
Research Papers
Jinwei Liu (Xidian University), Chao Li (Beijing Institute of Control Engineering; Beijing Sunwise Information Technology), Rui Chen (Beijing Institute of Control Engineering; Beijing Sunwise Information Technology), Shaofeng Li (Xidian University), Bin Gu (Beijing Institute of Control Engineering), Mengfei Yang (China Academy of Space Technology)

Information for Participants
Fri 27 Jun 2025 11:00 - 12:15 at Cosmos 3C - LLM-based Testing 2 Chair(s): Jie M. Zhang
Info for room Cosmos 3C:

Cosmos 3C is the third room in the Cosmos 3 wing.

When facing the main Cosmos Hall, access to the Cosmos 3 wing is on the left, close to the stairs. The area is accessed through a large door with the number “3”, which will stay open during the event.