Dynamic Analysis of Android Apps
Ralph Meier
Find Vulnerabilities thanks to Disassembly and Decompilation
Reverse engineering is the inverse of normal product development: the process begins with a finished product, which is dismantled into its individual parts. The product does not necessarily have to be completed. Reverse engineering is not limited to software, but can also be applied to hardware.
When taking apart a finished product, the thought of copying or imitating a product quickly arises. This is certainly a valid point, but there are many other reasons for using reverse engineering.
Analysing one’s own product makes sense, for example, when many know-how carriers have left the company and the existing documentation is incomplete. In that case, reverse engineering is used to regain lost knowledge and to carry out so-called redocumentation and design recovery.
In the hardware sector, reverse engineering can be used to find alternatives to chips in older products or designs in order to work around the current chip shortage to some extent. In some cases, this can also prevent obsolescence if a chip manufacturer goes bankrupt or required components are no longer produced.
Interfacing means establishing compatibility with third-party systems through reverse engineering. In doing so, one finds out how the third-party system works and how it is best addressed. With enough knowledge, an interface can be created between one’s own system and the third-party system.
In security analysis, software and hardware are taken apart to determine possible points of attack and vulnerabilities. These are then reported to the manufacturer to prevent future exploitation. In black-box testing, reverse engineering is often used to extend the understanding of the test object.
Reverse engineering is also used in the analysis of malware, for example to find out how the key of a ransomware is created so that encrypted data can be restored as quickly as possible, or to find out which hacker collective is behind an attack.
There are two different ways of translating software, or compiled program code in general, back into a readable form. One is disassembly, where machine code is converted into human-readable assembly code. The other is decompilation, where bytecode is converted back into source code in the original programming language.
Disassembly is used for software artefacts that exist as binary machine code, usually displayed in hexadecimal. Such code is created by compiling high-level languages such as C or C++. During compilation, the compiler applies many optimisations for the target platform. This makes the resulting program faster, but meta-information from the original source code, such as the names of variables and functions, is usually lost. An exact conversion back to the original source code is therefore not possible; instead, the machine code can be converted to assembly, so that the individual operations and memory accesses can be retraced. Professional disassembly tools can trace the path of execution and display jumps within the code in an understandable form.
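To illustrate the principle of disassembly, the following is a minimal sketch of a disassembler for a made-up instruction set. The opcodes, mnemonics and operand sizes are purely hypothetical and do not correspond to any real processor; real disassemblers additionally handle variable instruction encodings, symbols and control-flow analysis.

```python
# Hypothetical one-byte opcodes for an invented machine (not a real instruction set).
# Each entry maps an opcode to (mnemonic, number of one-byte operands).
OPCODES = {
    0x01: ("PUSH", 1),
    0x02: ("ADD", 0),
    0x03: ("JMP", 1),
    0x04: ("HALT", 0),
}


def disassemble(code: bytes) -> list[str]:
    """Translate raw bytes into one line of readable text per instruction."""
    lines = []
    pc = 0  # offset of the instruction currently being decoded
    while pc < len(code):
        opcode = code[pc]
        mnemonic, n_operands = OPCODES.get(opcode, (None, 0))
        if mnemonic is None:
            # Unknown byte: emit it as raw data instead of guessing.
            lines.append(f"{pc:04x}: .byte 0x{opcode:02x}")
            pc += 1
            continue
        operands = code[pc + 1 : pc + 1 + n_operands]
        operand_text = ", ".join(f"0x{b:02x}" for b in operands)
        lines.append(f"{pc:04x}: {mnemonic} {operand_text}".rstrip())
        pc += 1 + n_operands
    return lines


machine_code = bytes([0x01, 0x02, 0x01, 0x03, 0x02, 0x04])
for line in disassemble(machine_code):
    print(line)
```

The output lists each instruction with its offset, but no variable or function names appear, because that information was never part of the byte sequence.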
In languages such as Java or C#, the program code is not compiled directly into machine code but into bytecode. The bytecode, in turn, is interpreted or compiled just in time at runtime by a virtual machine, for example the Java Virtual Machine, and executed on the target system. The bytecode is already partly optimised, but it still contains meta-information, which makes it possible to convert it back to the original programming language. Depending on how the compilation was done, the bytecode can be translated back largely or even completely.
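The retained meta-information can be made visible with a small experiment. The sketch below uses Python rather than Java, on the assumption that CPython bytecode is a reasonable analogue for illustration: it compiles a hypothetical function and then inspects the resulting code object with the standard `dis` and `types` modules. The function name and its variables are invented for this example.

```python
import dis
import types

# Hypothetical example function, given as source text.
source = """
def compute_discount(price, rate):
    discounted = price * (1 - rate)
    return discounted
"""

# Compile the source to bytecode without executing it.
module_code = compile(source, "<example>", "exec")

# The function body is stored as a nested code object among the module constants.
function_code = next(
    const for const in module_code.co_consts if isinstance(const, types.CodeType)
)

# Names survive compilation and can be read directly from the code object.
print(function_code.co_name)
print(function_code.co_varnames)

# Human-readable listing of the bytecode instructions (format varies by Python version).
dis.dis(function_code)
```

Because the function name, the parameter names and the local variable name are all still present in the compiled form, a decompiler has much more to work with than a disassembler for native machine code.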
The result of disassembling or decompiling is mainly influenced by the configuration of the compiler and by other tools in the build process, such as an obfuscator. Obfuscator tools use a wide variety of techniques to make reverse engineering more difficult. They are available for most common programming languages and many are open source, for example on GitHub. Depending on the tool, an obfuscator is applied to the source code before compilation or to the compiled bytecode. Among other things, obfuscators rename variables and methods to randomly generated character strings and insert additional loops, nesting and conditional checks. In some cases, the contents of variables are stored in encrypted form, and common programming design patterns are converted into complex, incomprehensible sequences in order to confuse decompilers and make them produce an incorrect result. Code and meta-information that is not needed at runtime is removed to leave as little information as possible for future attackers. This list of techniques is by no means exhaustive; custom creations or variations are often used to obfuscate the code.
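The renaming technique mentioned above can be sketched in a few lines. The following is a simplified illustration using Python's standard `ast` module, not a description of how any particular obfuscator works. The class `RenameIdentifiers`, the helper `obfuscate` and the sample function are hypothetical, and the sketch uses sequential names such as `v0` instead of random strings so that the result is reproducible. It also assumes a self-contained snippet without imports or attribute access.

```python
import ast
import builtins


class RenameIdentifiers(ast.NodeTransformer):
    """Replace function, parameter and variable names with meaningless tokens."""

    def __init__(self):
        self.mapping = {}  # original name -> replacement name

    def _alias(self, name):
        if name not in self.mapping:
            self.mapping[name] = f"v{len(self.mapping)}"
        return self.mapping[name]

    def visit_FunctionDef(self, node):
        node.name = self._alias(node.name)
        self.generic_visit(node)  # continue with parameters and body
        return node

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        return node

    def visit_Name(self, node):
        # Leave built-in names such as print or len untouched.
        if not hasattr(builtins, node.id):
            node.id = self._alias(node.id)
        return node


def obfuscate(source: str) -> str:
    """Return the source with identifiers renamed (requires Python 3.9+ for ast.unparse)."""
    tree = RenameIdentifiers().visit(ast.parse(source))
    return ast.unparse(tree)


ORIGINAL = """
def compute_discount(price, rate):
    discounted = price * (1 - rate)
    return discounted
"""

print(obfuscate(ORIGINAL))
```

The behaviour of the function is unchanged, but the names that told a reader what the code is for are gone. Real obfuscators combine this with further steps such as control-flow changes and string encryption.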
An obfuscator tool is used when developers want to make access to the source code of their own product, their intellectual property, difficult or impossible. Obfuscator tools are also often used in the development of malware: various character encodings and encryption are applied to the code in order to raise the skill level required of malware analysts and to increase the time needed for analysis. In addition, anti-debugging techniques are often used, but these will not be discussed further in this article.
By using the obfuscation techniques described above, malware developers also try to prevent their malware from being detected by the automatic analyses of anti-virus solutions.
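Why simple encoding already defeats naive pattern matching can be shown with a harmless toy example. The sketch below assumes a scanner that only searches for a fixed byte sequence; the signature, the key and the function names are invented for illustration, and real anti-virus products use far more than static byte signatures.

```python
# Hypothetical byte pattern that a very simple scanner might look for.
SIGNATURE = b"EXAMPLE-SIGNATURE"


def naive_scan(blob: bytes) -> bool:
    """Report a match only if the exact signature bytes occur in the data."""
    return SIGNATURE in blob


def xor_encode(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


plain = b"header " + SIGNATURE + b" trailer"
encoded = xor_encode(plain, 0x5A)

print(naive_scan(plain))    # the unmodified data contains the signature
print(naive_scan(encoded))  # the encoded data no longer contains it
print(xor_encode(encoded, 0x5A) == plain)  # decoding restores the original
```

The content is fully recoverable at runtime, yet the static pattern is no longer present in the stored bytes. This is one reason why analysts and detection engines rely on reverse engineering and behavioural analysis in addition to signatures.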
Reverse engineering is one of the most important techniques in the analysis of malware and is generally very helpful in discovering vulnerabilities in different products. It is also valuable in many other areas and should therefore not be forgotten. Disassembly and decompilation can be helpful in black-box testing of software to gain further information and identify possible points of attack. Obfuscator tools make reverse engineering more difficult, and they are widely available.