[Tech] Binary Analysis
Protections Against Exploitation
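ASLR: Address Space Layout Randomization (randomizes where code, stack, heap, and libraries are loaded so that attackers cannot rely on fixed addresses)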
DEP: Data Execution Prevention (writable memory such as the stack and heap is marked non-executable)
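Stack Canaries: a secret value placed between local buffers and the saved return address, checked before the function returns to detect stack buffer overflows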
CFI: Control-Flow Integrity (restricts indirect control transfers to targets allowed by the program's control-flow graph)
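ROP: Return-Oriented Programming (an attack technique rather than a protection: it chains existing instruction sequences ending in ret to bypass DEP)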
Closed-Source Memory Errors (existing approaches to finding them):
- Resort to blackbox fuzzing, which yields shallow coverage close to the provided test cases
- Rely on dynamic binary translation, which instruments the binary at prohibitively high runtime cost (e.g., 10x to 100x for AFL fuzzing in QEMU mode on LAVA-M)
- Use unsound static rewriting based on heuristics
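Question: The fundamental difficulty for static rewriting techniques is disambiguating references from scalar constants, i.e., deciding whether a constant in the binary is an address that must be updated when code moves or a plain number that must stay unchanged.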
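A minimal sketch of why this disambiguation is hard, assuming a purely range-based heuristic of the kind an unsound rewriter might use. The section layout, the constant values, and the names Section, SECTIONS, and classify are all hypothetical and chosen only for illustration; no real tool is being reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    name: str
    start: int
    size: int

    def contains(self, value: int) -> bool:
        return self.start <= value < self.start + self.size


# Hypothetical layout of a small, non-position-independent binary.
SECTIONS = [
    Section(".text", 0x401000, 0x2000),
    Section(".data", 0x404000, 0x1000),
]


def classify(value: int) -> str:
    """Heuristic guess: a constant that lands inside a section is a reference."""
    for section in SECTIONS:
        if section.contains(value):
            return f"reference into {section.name}"
    return "scalar"


# What each constant really is (the label) versus what the heuristic guesses.
constants = {
    "pointer to a global": 0x404010,
    "loop bound": 100,
    "hash seed that looks like an address": 0x401234,
}

results = {label: classify(value) for label, value in constants.items()}
for label, guess in results.items():
    print(f"{label}: {guess}")
```

The third constant is a scalar, yet the heuristic reports it as a reference into .text. A rewriter that trusted this guess would adjust the value when code moves and silently change program behavior, which is what makes heuristic-based rewriting unsound.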
There are three fundamental techniques to rewrite binaries:
- recompilation [14], which attempts to lift the code to an intermediate representation (IR); lifting code to IR for recompilation requires correctly recovering type information from binaries, which remains an open problem.
- trampolines [15], [16], which rely on indirection to insert new code segments without changing the size of basic blocks; trampolines may significantly increase code size, and the extra level of indirection increases performance overhead.
- reassembleable assembly [12], [13], which creates an assembly file equivalent to what a compiler would emit, i.e., with relocation symbols for the linker to resolve. Consequently, we believe that resymbolizing binaries for reassembleable assembly is one of the most promising techniques for static binary rewriting.
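The idea behind reassembleable assembly can be illustrated with a toy model, offered as a sketch under stated assumptions rather than as how any real rewriter works. Instructions are plain records with a size; the Insn class, the helper functions, the instruction sizes, and the instrumentation call __check_access are all hypothetical. Absolute branch targets are first replaced by labels, so that inserting new code and re-running layout (the job of the assembler and linker) keeps every reference pointing at the right instruction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Insn:
    text: str                     # mnemonic and non-address operands
    size: int                     # encoded size in bytes (illustrative)
    target: Optional[int] = None  # absolute address operand, before symbolization
    ref: Optional[str] = None     # label operand, after symbolization
    label: Optional[str] = None   # label defined at this instruction


def layout(insns, base):
    """Assign addresses sequentially, as an assembler would."""
    addrs, addr = [], base
    for insn in insns:
        addrs.append(addr)
        addr += insn.size
    return addrs


def symbolize(insns, base):
    """Replace each absolute target with a label on the destination instruction."""
    by_addr = dict(zip(layout(insns, base), insns))
    for insn in insns:
        if insn.target is not None:
            dest = by_addr[insn.target]
            dest.label = dest.label or f".L{insn.target:x}"
            insn.ref, insn.target = dest.label, None


def resolve(insns, base):
    """Play the linker: map every label to its address in the current layout."""
    addrs = layout(insns, base)
    label_addr = {i.label: a for i, a in zip(insns, addrs) if i.label}
    return [(a, i.text, label_addr.get(i.ref)) for i, a in zip(insns, addrs)]


BASE = 0x1000
program = [
    Insn("cmp rdi, 0", 4),          # 0x1000
    Insn("je", 2, target=0x100A),   # 0x1004, jumps to the xor below
    Insn("mov rax, [rdi]", 3),      # 0x1006
    Insn("ret", 1),                 # 0x1009
    Insn("xor eax, eax", 2),        # 0x100a
    Insn("ret", 1),                 # 0x100c
]

symbolize(program, BASE)
# Insert hypothetical instrumentation before the memory access.
program.insert(2, Insn("call __check_access", 5))

resolved = resolve(program, BASE)
for addr, text, dest in resolved:
    suffix = f" -> {dest:#x}" if dest is not None else ""
    print(f"{addr:#06x}: {text}{suffix}")
```

After the insertion, the xor instruction moves from 0x100a to 0x100f and the je follows it because it refers to a label. Had the operand stayed as the absolute address 0x100a, it would now point into the middle of the inserted call.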