Software Binary Analysis

Software binaries are nothing but the compiled executables of a program written in a programming language. When you pass the program as a text file with a certain extension (for example, .cpp or .c) to its respective compiler, you get the executable, which is the binary. Simply put, these are files you cannot read directly unless they are disassembled. The term binary means a sequence of 0s and 1s, which represent low and high voltages respectively. In practice, though, it is more useful to think of a software binary as a series of CPU-readable numbers rather than as raw 0s and 1s. These numbers encode the operations, or opcodes, to be performed by the CPU.
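To make the idea of "CPU-readable numbers" concrete, here is a minimal sketch in Python. The byte sequence is a hand-picked x86-64 example chosen for illustration (it is not taken from any particular program), and the `decode` helper is hypothetical: it recognizes only the five encodings that appear in that sequence and is in no way a real disassembler.

```python
# Toy illustration: machine code is a sequence of bytes, and each
# instruction begins with an opcode that the CPU knows how to decode.
# CODE is a hand-picked x86-64 byte sequence (an assumption for this sketch).
CODE = bytes.fromhex("554889e5b82a0000005dc3")


def decode(code):
    """Decode ONLY the five encodings used in CODE; anything else is rejected."""
    out = []
    i = 0
    while i < len(code):
        if code[i] == 0x55:                      # 55 -> push rbp
            out.append("push rbp")
            i += 1
        elif code[i:i + 3] == b"\x48\x89\xe5":   # 48 89 e5 -> mov rbp, rsp
            out.append("mov rbp, rsp")
            i += 3
        elif code[i] == 0xB8:                    # b8 imm32 -> mov eax, imm32
            imm = int.from_bytes(code[i + 1:i + 5], "little")
            out.append(f"mov eax, {imm}")
            i += 5
        elif code[i] == 0x5D:                    # 5d -> pop rbp
            out.append("pop rbp")
            i += 1
        elif code[i] == 0xC3:                    # c3 -> ret
            out.append("ret")
            i += 1
        else:
            raise ValueError(f"byte {code[i]:#04x} is outside this toy table")
    return out


if __name__ == "__main__":
    print(CODE.hex(" "))        # the raw numbers stored in the file
    for text in decode(CODE):   # the operations those numbers stand for
        print(text)
```

The same eleven bytes are unreadable as text, yet they map one-to-one onto a short run of instructions once the opcodes are known.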

Software engineers often analyze these binaries to locate, and possibly patch, security vulnerabilities that for some reason cannot be located in the source code. The analysis performed may be static analysis, dynamic analysis, or hybrid analysis (a combination of both).

The same techniques can be used by attackers looking for weaknesses to exploit. To protect binaries against this, developers often obfuscate them, which modifies the binary so that it becomes much harder for an exploiter to interpret.

Static Binary Analysis

For static analysis, we do not execute the binary. If the binary is not a text file but a sequence of computer-readable numbers, how do we analyze it? Confusing, right? Well, it's not. We disassemble the binary using certain libraries or tools, such as objdump from binutils. Different disassemblers can produce different assembly for the same binary. The disassembled code can in turn be converted back toward source code using decompilers. However, decompiling usually does not reproduce the original source code and may lose precision, because information is lost during compilation and optimization. Coming back to static analysis, the obtained assembly is a sequence of instructions organized into sections. In static analysis, you analyze this textual, readable assembly without actually running it.
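As a rough sketch of how this might be scripted, the Python below shells out to objdump and splits its instruction lines into fields. The helper names (`objdump_command`, `parse_instruction_line`, `disassemble`) are hypothetical, and the parser assumes GNU objdump's usual tab-separated layout (address, raw bytes, instruction text), which may differ on other tools or versions.

```python
import shutil
import subprocess


def objdump_command(path, intel_syntax=False):
    """Build the objdump invocation; -d disassembles the executable sections."""
    cmd = ["objdump", "-d"]
    if intel_syntax:
        cmd += ["-M", "intel"]  # x86 only: print Intel syntax instead of AT&T
    cmd.append(path)
    return cmd


def parse_instruction_line(line):
    """Return (address, raw_bytes, text) for an instruction line, else None.

    Assumes the layout '  <addr>:<TAB><hex bytes><TAB><instruction>'.
    """
    parts = line.split("\t")
    if len(parts) < 3 or not parts[0].strip().endswith(":"):
        return None  # section headers, symbol labels, continuation lines
    try:
        address = int(parts[0].strip().rstrip(":"), 16)
        raw = bytes.fromhex(parts[1].replace(" ", ""))
    except ValueError:
        return None
    return address, raw, parts[2].strip()


def disassemble(path):
    """Statically disassemble the file at `path`; the binary is never executed."""
    if shutil.which("objdump") is None:
        raise RuntimeError("objdump (binutils) was not found on PATH")
    result = subprocess.run(objdump_command(path), capture_output=True,
                            text=True, check=True)
    parsed = (parse_instruction_line(line) for line in result.stdout.splitlines())
    return [entry for entry in parsed if entry is not None]
```

Calling `disassemble("./a.out")` on a compiled program (the path is only a placeholder) would return a list of tuples that further analysis can walk over.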

Given below is a snippet of assembly code disassembled using objdump:

Static binary analysis has its own limitations. For example, it cannot reliably determine the addresses of shared libraries, since most libraries are linked dynamically at run time. Some of these problems can be tackled using dynamic binary analysis.

Dynamic Binary Analysis

In dynamic analysis, the binary is executed, and each instruction is analyzed in the order in which it executes. Depending on your tool, you may analyze one instruction, basic block, section, or routine at a time. Program traces can also be analyzed. The binary is instrumented, both to analyze it and to modify its behavior. Dynamic binary translation tools usually run in the client program's address space.
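The idea of instrumenting execution can be sketched with a toy interpreter in Python. Everything here is an assumption made for illustration: the four-instruction "machine", the `run` function, and the `on_instruction` hook are hypothetical and do not mirror the API of any real instrumentation tool. The point is only that the hook sees each instruction in the order it actually executes, and that it runs inside the same process as the program being observed.

```python
from collections import Counter

# Hypothetical toy program: each instruction is (op, arg).
PROGRAM = [
    ("load", 3),     # 0: counter = 3
    ("dec", None),   # 1: counter -= 1
    ("jnz", 1),      # 2: jump back to index 1 while counter != 0
    ("halt", None),  # 3: stop
]


def run(program, on_instruction):
    """Execute the toy program, calling the hook before every instruction."""
    pc = 0
    counter = 0
    while True:
        op, arg = program[pc]
        on_instruction(pc, op, arg)   # instrumentation point
        if op == "load":
            counter = arg
            pc += 1
        elif op == "dec":
            counter -= 1
            pc += 1
        elif op == "jnz":
            pc = arg if counter != 0 else pc + 1
        elif op == "halt":
            return counter
        else:
            raise ValueError(f"unknown op {op!r}")


executed = []      # instruction indices in execution order
hits = Counter()   # how many times each instruction ran


def record(pc, op, arg):
    executed.append(pc)
    hits[pc] += 1


final_counter = run(PROGRAM, record)

if __name__ == "__main__":
    print(executed)
    print(dict(hits))
```

Unlike a static listing, the recorded sequence shows the loop body once per iteration, which is information that only exists at run time.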

Basic Block:

A basic block is a sequence of instructions in the program/assembly code with a single entry and a single exit. Control can enter only at the first instruction, so no branch can target any other instruction in the block, and it leaves only at the last instruction.
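One common way to find basic blocks is leader-based partitioning: the first instruction, every branch target, and every instruction that follows a branch each start a new block. The Python sketch below applies that rule to a hypothetical toy instruction list in which branch arguments are instruction indices. The mnemonics and helper names are assumptions for illustration, and real binaries add complications (indirect jumps, calls) that are ignored here.

```python
# Hypothetical toy branch mnemonics; for these, arg is the target index.
BRANCH_OPS = {"jmp", "jnz"}

PROGRAM = [
    ("load", 3),     # 0
    ("dec", None),   # 1  <- target of the jnz at index 2
    ("jnz", 1),      # 2
    ("halt", None),  # 3
]


def find_leaders(program):
    """Return the sorted indices of instructions that start a basic block."""
    leaders = {0} if program else set()
    for index, (op, arg) in enumerate(program):
        if op in BRANCH_OPS:
            leaders.add(arg)                 # a branch target starts a block
            if index + 1 < len(program):
                leaders.add(index + 1)       # so does the instruction after a branch
    return sorted(leaders)


def basic_blocks(program):
    """Split the program into lists of instruction indices, one per block."""
    bounds = find_leaders(program) + [len(program)]
    return [list(range(start, end)) for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    for block in basic_blocks(PROGRAM):
        print(block)
```

Because every branch target becomes a leader, no branch can land in the middle of a block, which is exactly the entry rule described above.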

Routine:
A routine is basically a function or procedure of the program, as it appears in assembly. A routine can have multiple basic blocks.

Trace:
A trace is a sequence of instructions with a single entry and possibly multiple exits. Its instructions can be executed consecutively. A trace can consist of multiple basic blocks.
