Yongpan Wang (Institute of Information Engineering Chinese Academy of Sciences & University of Chinese Academy of Sciences, China), Hong Li (Institute of Information Engineering Chinese Academy of Sciences & University of Chinese Academy of Sciences, China), Xiaojie Zhu (King Abdullah University of Science and Technology, Thuwal, Saudi Arabia), Siyuan Li (Institute of Information Engineering Chinese…

Binary code search plays a crucial role in applications like software reuse detection, and vulnerability identification. Currently, existing models are typically based on either internal code semantics or a combination of function call graphs (CG) and internal code semantics. However, these models have limitations. Internal code semantic models only consider the semantics within the function, ignoring the inter-function semantics, making it difficult to handle situations such as function inlining. The combination of CG and internal code semantics is insufficient for addressing complex real-world scenarios. To address these limitations, we propose BINENHANCE, a novel framework designed to leverage the inter-function semantics to enhance the expression of internal code semantics for binary code search. Specifically, BINENHANCE constructs an External Environment Semantic Graph (EESG), which establishes a stable and analogous external environment for homologous functions by using different inter-function semantic relation (textit{e.g.}, textit{call}, textit{location}, textit{data-co-use}). After the construction of EESG, we utilize the embeddings generated by existing internal code semantic models to initialize EESG nodes. Finally, we design a Semantic Enhancement Model (SEM) that uses Relational Graph Convolutional Networks (RGCNs) and a residual block to learn valuable external semantics on the EESG for generating the enhanced semantics embedding. In addition, BinEnhance utilizes data feature similarity to refine the cosine similarity of semantic embeddings. We conduct experiments under six different tasks (textit{e.g.}, under textit{function inlining} scenario) and the results illustrate the performance and robustness of BINENHANCE. The application of BinEnhance to HermesSim, Asm2vec, TREX, Gemini, and Asteria on two public datasets results in an improvement of Mean Average Precision (MAP) from 53.6% to 69.7%. Moreover, the efficiency increases fourfold.

View More Papers

The Guardians of Name Street: Studying the Defensive Registration...

Boladji Vinny Adjibi (Georgia Tech), Athanasios Avgetidis (Georgia Tech), Manos Antonakakis (Georgia Tech), Michael Bailey (Georgia Tech), Fabian Monrose (Georgia Tech)

Read More

Transparency or Information Overload? Evaluating Users’ Comprehension and Perceptions...

Xiaoyuan Wu (Carnegie Mellon University), Lydia Hu (Carnegie Mellon University), Eric Zeng (Carnegie Mellon University), Hana Habib (Carnegie Mellon University), Lujo Bauer (Carnegie Mellon University)

Read More

GAP-Diff: Protecting JPEG-Compressed Images from Diffusion-based Facial Customization

Haotian Zhu (Nanjing University of Science and Technology), Shuchao Pang (Nanjing University of Science and Technology), Zhigang Lu (Western Sydney University), Yongbin Zhou (Nanjing University of Science and Technology), Minhui Xue (CSIRO's Data61)

Read More

LeoCommon – A Ground Station Observatory Network for LEO...

Eric Jedermann, Martin Böh (University of Kaiserslautern), Martin Strohmeier (armasuisse Science & Technology), Vincent Lenders (Cyber-Defence Campus, armasuisse Science & Technology), Jens Schmitt (University of Kaiserslautern)

Read More