Reverse Engineering

 

Introduction

 

Reverse engineering, sometimes called reversing, is the process of unraveling the complexities of software.  It recreates the internal rules of a program when only its external behavior, or its compiled form, is available, and converts the program into a more readable and understandable form.  For instance, given a machine-language program such as a binary executable, the reverse engineer forms hypotheses about its internal logic and tests them using tools such as debuggers and decompilers.  The reverse engineer must understand not only software abstractions but also hardware architectures, and must devise an input plan that traverses as many logical pathways as possible.  Reverse engineering can reveal which system functions a program uses, which files it accesses, which protocols it speaks, and how it communicates with other entities on a network.

 

 

 

Figure: Target hardware code representation, e.g., binary code.

 
Ultimately, the goal of a reverse engineer is not to understand the entire program but to understand enough of it to be effective.  For example, some parts of a program may consist of menial "bread and butter" code, while other parts may contain proprietary algorithms.  Having identified those distinctions, the reverse engineer can narrow the focus to the nontrivial parts.  A common goal is to change the logical flow of a program by modifying its code, which is known as patching: new code is placed seamlessly over the original code.  Patching allows you to add, delete, or change existing code.
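As a minimal sketch of patching, assume a short, made-up x86 byte sequence rather than a real program.  On x86, the opcode 0x74 encodes a short conditional jump (JE) and 0xEB a short unconditional jump (JMP), so overwriting that one byte changes the branch logic without changing the size of the code.  The `original` bytes and the `patch_byte` helper are illustrative names, not part of any tool.

```python
# Hypothetical x86 fragment: cmp eax, 1 ; je +5 ; five nops ; ret
# The bytes are illustrative and not taken from a real program.
JE_SHORT = 0x74   # x86 opcode: jump short if equal
JMP_SHORT = 0xEB  # x86 opcode: jump short unconditionally


def patch_byte(image: bytes, offset: int, expected: int, new: int) -> bytes:
    """Return a copy of image with one byte replaced, checking the old value first."""
    if image[offset] != expected:
        raise ValueError(f"unexpected byte {image[offset]:#04x} at offset {offset}")
    patched = bytearray(image)
    patched[offset] = new
    return bytes(patched)


original = bytes([0x83, 0xF8, 0x01, 0x74, 0x05, 0x90, 0x90, 0x90, 0x90, 0x90, 0xC3])
patched = patch_byte(original, 3, JE_SHORT, JMP_SHORT)

print(original.hex(" "))
print(patched.hex(" "))
```

Checking the expected byte before overwriting it guards against patching the wrong offset, which would silently corrupt the program.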

 

 

Figure: Intermediate code representation, e.g., assembly code.

 
 


Legalities

 

As of this writing, reverse engineering is not in itself illegal.  The main reason is that it gives people the ability to examine the quality of the software they use and purchase.  A related reason is that critical errors can be spotted and reported for correction.  This is in line with Kerckhoffs's principle, under which cryptographic algorithms are disclosed to the public so that a wider audience can evaluate their strength.  What is illegal is changing software to bypass copyright protections and digital rights management schemes.

 

Conceptual View

 

Normally, a program is compiled from a high-level source representation to an intermediate representation, usually some form of assembly language.  This intermediate representation is then converted to the hardware's binary target format.  During reverse engineering, the binary executable is converted back to the intermediate language and then, if necessary, to high-level source code.
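The forward and reverse directions can be illustrated by analogy with Python's own toolchain.  This is only a sketch: Python bytecode stands in for machine code and the `dis` listing stands in for assembly, whereas a native executable would need a real disassembler.  The `add` function is a made-up example.

```python
import dis
import io

source = "def add(a, b):\n    return a + b\n"

# Forward direction: source text -> code object holding raw bytecode.
namespace = {}
exec(compile(source, "<example>", "exec"), namespace)
add = namespace["add"]
raw_bytes = add.__code__.co_code  # the low-level, hard-to-read form

# Reverse direction: raw bytecode -> readable instruction listing.
buffer = io.StringIO()
dis.dis(add, file=buffer)
listing = buffer.getvalue()

print(raw_bytes.hex())
print(listing)
```

The exact bytes and instruction names differ between Python versions, which mirrors how the same source compiles differently for different target hardware.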


Tools, Approaches & Methods

 

There are four types of tools a reverse engineer can use:

 

Debugger

 

A debugger is a program that is used to step through the code of another program.  Debuggers allow you to set breakpoints and watch variables throughout an execution trace.
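The core of a debugger, stopping at each line and inspecting variables, can be sketched with Python's `sys.settrace` hook.  This is a toy under stated assumptions, not how a native debugger is built: `target` is a made-up function, and the "watch" is simply a snapshot of local variables taken before each line runs.

```python
import sys


def target(n):
    total = 0
    for i in range(n):
        total += i
    return total


watched = []  # (line offset from the def line, snapshot of local variables)


def tracer(frame, event, arg):
    # Only follow frames that execute the target function.
    if frame.f_code is not target.__code__:
        return None
    if event == "line":
        offset = frame.f_lineno - frame.f_code.co_firstlineno
        watched.append((offset, dict(frame.f_locals)))
    return tracer


sys.settrace(tracer)
try:
    result = target(3)
finally:
    sys.settrace(None)  # always remove the hook again

for offset, local_vars in watched:
    print(offset, local_vars)
```

Each recorded entry plays the role of a breakpoint hit: the line about to execute plus the variable values at that moment.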

 

Fault injection

 

A fault injection tool inserts defects into code to observe anomalous behavior, which may disclose information helpful for reverse engineering.  It is a software testing technique along the lines of error seeding.
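One flavor of fault injection makes a dependency fail on purpose and watches how the program reacts.  The sketch below assumes a hypothetical `read_config` function as the code under study and uses the standard library's `unittest.mock.patch` to make every file open fail.

```python
from unittest import mock


def read_config(path):
    """Code under study: returns the file text, or a default when reading fails."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError:
        return "default"


def failing_open(*args, **kwargs):
    # Injected fault: every attempt to open a file fails.
    raise OSError("injected fault")


# Replace the built-in open() only for the duration of the call.
with mock.patch("builtins.open", failing_open):
    observed = read_config("settings.ini")

print(observed)
```

Here the injected fault reveals that the function silently falls back to a default, which is the kind of hidden behavior fault injection is meant to expose.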

 

Disassembler

 

A disassembler converts hardware machine code into intermediate assembly code.
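The table-driven idea behind a disassembler can be sketched for a handful of one-byte 32-bit x86 instructions (`nop`, `ret`, `int3`, and `push`/`pop` of a register).  This is a toy: real x86 instructions have variable length, so a real disassembler cannot decode byte by byte as this sketch does.  The `disassemble` function name is illustrative.

```python
REGISTERS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
SINGLE_BYTE = {0x90: "nop", 0xC3: "ret", 0xCC: "int3"}


def disassemble(code: bytes) -> list[str]:
    """Decode a few one-byte 32-bit x86 instructions; anything else becomes 'db'."""
    lines = []
    for offset, byte in enumerate(code):
        if byte in SINGLE_BYTE:
            text = SINGLE_BYTE[byte]
        elif 0x50 <= byte <= 0x57:
            text = f"push {REGISTERS[byte - 0x50]}"
        elif 0x58 <= byte <= 0x5F:
            text = f"pop {REGISTERS[byte - 0x58]}"
        else:
            text = f"db {byte:#04x}"  # unknown byte, emitted as raw data
        lines.append(f"{offset:04x}: {text}")
    return lines


for line in disassemble(bytes([0x55, 0x90, 0x5D, 0xC3])):
    print(line)
```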

 

Decompiler

 

A decompiler converts hardware machine code or intermediate assembly code into high-level source code.

 

There are three kinds of reverse engineering analyses:

 

White-box analysis

 

White-box analysis consists of analyzing and understanding the program code without running the program.  Static analyzers take the program file(s) as input and output not only a reconstruction of the program but also statistical data on characteristics of the code.
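A small static analyzer can be sketched with Python's `ast` module, which parses source text without executing it.  The `SOURCE` snippet is a made-up program under study; the sketch lists the functions it defines and counts the calls it makes.

```python
import ast
from collections import Counter

SOURCE = '''
import os

def cleanup(path):
    if os.path.exists(path):
        os.remove(path)
    print("done")
'''

tree = ast.parse(SOURCE)  # parses only; SOURCE is never executed

# Count every call site by the name of the thing being called.
calls = Counter()
for node in ast.walk(tree):
    if isinstance(node, ast.Call):
        calls[ast.unparse(node.func)] += 1

functions = [n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]

print(functions)
print(dict(calls))
```

Even this simple count shows that the program deletes files, which is learned without running it.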

 

Black-box analysis

 

Black-box analysis consists of probing the external behavior of a program with inputs.  It helps identify areas worth exploring with white-box analysis, so it is usually done first.
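Black-box probing can be sketched as feeding chosen inputs to a program and recording only what comes back.  In this sketch `opaque` is a made-up stand-in for the program; the analyst is assumed to see only the `observations`, not the function body.

```python
def opaque(text):
    """Stand-in for a program whose internals are treated as unknown."""
    if text.startswith("ADMIN:"):
        return "privileged"
    if len(text) > 8:
        return "rejected"
    return "ok"


# Probes chosen to vary length, case, and prefix.
probes = ["", "hello", "x" * 9, "ADMIN:1", "admin:1"]
observations = {probe: opaque(probe) for probe in probes}

for probe, outcome in observations.items():
    print(repr(probe), "->", outcome)
```

The differing outcomes for `"ADMIN:1"` and `"admin:1"` hint at a case-sensitive prefix check, which is the kind of lead that a later white-box pass would follow up.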

 

Gray-box analysis

 

Gray-box analysis uses black-box analysis in conjunction with white-box analysis.  For instance, a nested code segment can first be treated as a black box; then, upon diving further into the segment, white-box analysis can be conducted.

 

There are several methods a reverse engineer can use:

 

Using inputs to trace through code.

 

Input tracing is a very tedious way of stepping through code, starting from the initial input.  The idea is to uncover end-to-end pathways in order to understand the code.

 

Exploiting differences in software versions.

 

Comparing different versions of the same software can surface potential vulnerabilities.  Software updates are usually bug fixes, so the code that changed points to where the flaws were.
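Version comparison can be sketched with the standard library's `difflib`.  The two versions below are made-up source listings; in practice the comparison is often done on disassembled binaries, but the principle is the same: the added lines show what the update fixed.

```python
import difflib

version_1 = [
    "def read(buf, n):",
    "    return buf[:n]",
]
version_2 = [
    "def read(buf, n):",
    "    if n > len(buf):",
    "        raise ValueError('n too large')",
    "    return buf[:n]",
]

diff = list(difflib.unified_diff(version_1, version_2, "v1", "v2", lineterm=""))

# Keep only the lines added in the newer version (skip the "+++" file header).
added = [line[1:] for line in diff if line.startswith("+") and not line.startswith("+++")]

print("\n".join(diff))
```

The added bounds check suggests that the older version did not validate `n`, so the older version is where an analyst would look for a flaw.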

 

Code coverage.

 

Code coverage helps determine which logical pathways have and have not been traversed, and so helps construct a map of code pathways.
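Line-level coverage can be sketched with `sys.settrace`, recording which lines of a function execute for a given set of inputs.  The `classify` function and the `run_with_coverage` helper are made up for illustration; dedicated coverage tools do considerably more.

```python
import sys


def classify(n):
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"


def run_with_coverage(func, *args):
    """Run func and return the set of line offsets (from its def line) that executed."""
    executed = set()

    def tracer(frame, event, arg):
        if frame.f_code is not func.__code__:
            return None
        if event == "line":
            executed.add(frame.f_lineno - frame.f_code.co_firstlineno)
        return tracer

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return executed


covered = run_with_coverage(classify, 5) | run_with_coverage(classify, 0)
all_lines = {1, 2, 3, 4, 5}  # offsets of the five body lines of classify
print("not yet covered:", sorted(all_lines - covered))
```

The uncovered offset corresponds to the `return "negative"` branch, which tells the analyst that a negative input is still needed to complete the map.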

 

Kernel access.

           

Kernel access allows running commands on device drivers and inserting addresses into kernel memory.

 

Data leakage from shared buffers.

 

Additional data can be found in shared buffers.  If a buffer is not cleared before it is used by another entity, the old data may still exist within it.  This leftover data may disclose potential vulnerabilities.
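The leak can be sketched with a single `bytearray` reused by two writers.  The buffer size, the messages, and the `write_message` helper are all made up for illustration.

```python
buffer = bytearray(16)  # one buffer reused by two "users"


def write_message(buf: bytearray, message: bytes, clear_first: bool) -> None:
    if clear_first:
        buf[:] = bytes(len(buf))  # zero the whole buffer before reuse
    buf[: len(message)] = message


# First user stores a secret; second user writes a shorter message on top.
write_message(buffer, b"secret-password", clear_first=False)
write_message(buffer, b"hi", clear_first=False)
leaked = bytes(buffer)

# Same second write, but with the buffer cleared first.
write_message(buffer, b"hi", clear_first=True)
clean = bytes(buffer)

print(leaked)
print(clean)
```

Without clearing, most of the first user's secret survives after the short message and is visible to whoever reads the whole buffer.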

 

API resources.

 

API resources expose certain functions that are known to be problematic.  These functions can be located within the program and then exploited.
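Locating such functions can be sketched as a comparison between a program's imported symbols and a watch list.  Both lists are assumptions for illustration: `RISKY` names a few C library functions commonly cited as unsafe because they do not bound how much data they write, and `imports` is a hypothetical symbol list of the kind a disassembler or symbol dumper might report.

```python
# Illustrative watch list of C library functions that do not bound their output.
RISKY = {"gets", "strcpy", "strcat", "sprintf"}


def flag_risky(imported_symbols):
    """Return the imported names that appear on the watch list, sorted."""
    return sorted(set(imported_symbols) & RISKY)


# Hypothetical import list for the program under study.
imports = ["malloc", "strcpy", "printf", "gets", "free"]
print(flag_risky(imports))
```

Each flagged name marks a call site worth examining in the disassembly.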

 
