ICPC 2026
Sun 12 - Mon 13 April 2026, Rio de Janeiro, Brazil
co-located with ICSE 2026

Complex software systems are typically built from multiple functional modules, which serve both as the structural units that implement specific functionality and as the actual granularity of third-party library (TPL) reuse. Recovering high-quality modular structure from binaries is therefore essential for program comprehension and TPL reuse detection. However, this structure is lost during compilation and cannot be extracted directly from a binary. Existing binary modularization methods based on community detection can recover modules, but they scale poorly to large binaries and produce inconsistently sized modules across binaries of different scales. To address these limitations, we propose Modubin, an efficient binary modularization approach based on the locality of homologous functions. Modubin exploits the call cohesion of these functions through a three-phase pipeline: Local Aggregation segments function intervals based on address ordering and locally aggregates homologous functions into function clusters; Module Growth progressively merges adjacent fragmented edges into stable module backbones to form initial module prototypes; and Module Merging refines the results based on inter-module call coupling strength to produce the final modules. Experimental results show that Modubin reduces average processing time by 96.9%, to 5.18 seconds per binary, while exhibiting near-O(n) time complexity. It also improves modularization quality by 6.89% across six evaluation metrics and shows stronger stability in practical TPL reuse detection scenarios.
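
To make the shape of the three-phase pipeline concrete, the following is a minimal toy sketch in Python, not Modubin's actual implementation. Everything in it is an assumption made for illustration: the input format (address/name pairs plus a set of caller/callee pairs), the use of a direct call between address-adjacent functions as a stand-in for "homologous", the `min_size` fragment rule, and the `threshold` coupling rule. The abstract does not specify any of these details.

```python
def merge_adjacent(groups, should_merge):
    """Single left-to-right pass that merges address-adjacent groups."""
    if not groups:
        return []
    merged = [list(groups[0])]
    for group in groups[1:]:
        if should_merge(merged[-1], group):
            merged[-1].extend(group)
        else:
            merged.append(list(group))
    return merged


def coupling(a, b, calls):
    """Count call edges that cross between groups a and b."""
    sa, sb = set(a), set(b)
    return sum(1 for u, v in calls
               if (u in sa and v in sb) or (u in sb and v in sa))


def local_aggregation(funcs, calls):
    """Phase 1 (assumed rule): order functions by address and keep two
    neighbours in the same cluster only if one calls the other."""
    ordered = [name for _, name in sorted(funcs)]
    if not ordered:
        return []
    clusters = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if (prev, cur) in calls or (cur, prev) in calls:
            clusters[-1].append(cur)
        else:
            clusters.append([cur])
    return clusters


def module_growth(clusters, calls, min_size=2):
    """Phase 2 (assumed rule): absorb a fragment (fewer than min_size
    functions) into its adjacent cluster when a call connects them."""
    def rule(a, b):
        is_fragment = len(a) < min_size or len(b) < min_size
        return is_fragment and coupling(a, b, calls) > 0
    return merge_adjacent(clusters, rule)


def module_merging(modules, calls, threshold=0.5):
    """Phase 3 (assumed rule): merge adjacent modules whose crossing calls,
    normalised by the smaller module's size, reach the threshold."""
    def rule(a, b):
        return coupling(a, b, calls) / min(len(a), len(b)) >= threshold
    return merge_adjacent(modules, rule)


def modularize(funcs, calls):
    clusters = local_aggregation(funcs, calls)
    prototypes = module_growth(clusters, calls)
    return module_merging(prototypes, calls)


# Hypothetical input: six functions laid out contiguously, four call edges.
funcs = [(0x1000, "a"), (0x1010, "b"), (0x1020, "c"),
         (0x1030, "d"), (0x1040, "e"), (0x1050, "f")]
calls = {("a", "b"), ("a", "c"), ("d", "e"), ("e", "f")}

modules = modularize(funcs, calls)
print(modules)  # [['a', 'b', 'c'], ['d', 'e', 'f']]
```

In this toy input, Local Aggregation yields the clusters [a, b], [c], and [d, e, f]; Module Growth absorbs the fragment [c] into [a, b] because a calls c; Module Merging leaves the two modules apart because no call crosses between them. Each phase here is a single linear scan over address-ordered groups, which is only meant to illustrate how address locality can avoid global graph clustering, not to reproduce the paper's complexity analysis.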