MARX: Uncovering Class Hierarchies in C++ Programs
Author(s): Andre Pawlowski, Moritz Contag, Victor van der Veen, Chris Ouwehand, Thorsten Holz, Herbert Bos, Elias Athanasopoulos, Cristiano Giuffrida
Download: Paper (PDF)
Date: 27 Feb 2017
Document Type: Reports
Additional Documents: Slides Video
Associated Event: NDSS Symposium 2017
Abstract:
Reverse engineering of binary executables is a difficult task which gets more involved by the way compilers translate high-level concepts used in paradigms such as objectoriented programming into native code, as it is the case for C++. Such code is harder to grasp than, e. g., traditional procedural code, since it is generally more verbose and adds complexity through features such as polymorphism or inheritance. Hence, a deep understanding of interactions between instantiated objects, their corresponding classes, and the connection between classes would vastly reduce the time it takes an analyst to understand the application. The growth in complexity in contemporary C++ applications only amplifies the effect.
In this paper, we introduce Marx, an analysis framework to reconstruct class hierarchies of C++ programs and resolve virtual callsites. We have evaluated the results on a diverse set of large, real-world applications. Our experimental results show that our approach achieves a high precision (93.2% of the hierarchies reconstructed accurately for Node.js, 88.4% for MySQL Server) while keeping analysis times practical. Furthermore, we show that, despite any imprecision in the analysis, the derived information can be reliably used in classic software security hardening applications without breaking programs. We showcase this property for two applications built on top of the output of our framework: vtable protection and type-safe object reuse. This demonstrates that, in addition to traditional reverse engineering applications, Marx can aid in implementing concrete, valuable tools e. g., in the domain of exploit mitigations.