Jan Friebertshauser, Florian Kosterhon, Jiska Classen, Matthias Hollick (Secure Mobile Networking Lab, TU Darmstad)
Embedded systems, IoT devices, and systems on a chip such as wireless network cards often run raw firmware binaries. Raw binaries miss metadata such as the target architecture and an entry point. Thus, their analysis is challenging. Nonetheless, chip firmware analysis is vital to the security of modern devices. We find that state-of-the-art disassemblers fail to identify function starts and signatures in raw binaries. In our case, these issues originate from the dense, variable-length ARM Thumb2 instruction set. Binary differs such as BinDiff and Diaphora perform poor on raw ARM binaries, since they depend on correctly identified functions. Moreover, binary patchers like NexMon require function signatures to pass arguments. As a solution for fast diffing and function identification, we design and implement Polypyus. This firmware historian learns from binaries with known functions, generalizes this knowledge, and applies it to raw binaries. Polypyus is independent from architecture and disassembler. However, the results can be imported as disassembler entry points, thereby improving function identification and follow-up results by other binary differs. Additionally, we partially reconstruct function signatures and custom types from Eclipse PDOM files. Each Eclipse project contains a PDOM file, which caches selected project information for compiler optimization. We showcase the capabilities of Polypyus on a set of 20 firmware binaries.