Enhancing Data Understanding
Rocco Gagliardi
How to Analyze and Categorize Executables
Tomaso Vasella, for example, explained how to analyze an Android application using reverse engineering techniques in Analysis of mobile applications (for a basic introduction to reverse engineering, see Reverse Engineering by Ralph Meier). The article demonstrates how to utilize some very complex procedures and technologies that the ordinary IT user is unlikely to be able to use or would not have time to use if they were in a hurry.
Just as tcpdump is not the first tool you should reach for when you have a network problem, there are some basic steps you can take before launching Ghidra and jumping back and forth through the code.
When you receive an executable, your first thought is probably “Where do I start?” In our opinion, if you merely want to know whether a danger exists, an antivirus scan is a great place to start. However, more can be done.
The analysis may be loosely separated into static and dynamic, as we saw in Reverse Engineering. In contrast to dynamic analysis, which focuses on the object’s behavior, static analysis looks at the object’s appearance. Since static analysis doesn’t require running the program, it is much simpler and safer. However, this type of analysis is not an exact science. Each of us develops personal skills and methods over time, but the job primarily consists of searching for, accumulating, and correlating the information scattered through the code. There is no checklist to follow, but there are well-known patterns and tools available to use.
The general steps to follow are logical and simple:
This is the stage of the procedure that is the most difficult. We will undoubtedly uncover many hints if we start “searching for something,” but they either won’t be helpful or we’ll have spent a lot of time gathering unneeded clues.
Therefore, we must at least generally define the area in which we are interested: network, filesystem, or registry access, monitoring, obfuscation, or evasion methods, etc. By doing so, we may scan the object for references to specific functions and quickly determine whether to examine it in more depth or move on to the next item.
Things are not always so straightforward; the programmers of these objects employ certain tactics to impede investigation and conceal the activities that the object is meant to carry out for as long as feasible. For instance, they employ obfuscation methods (for example, packing), or in the case of a dynamic analysis, they try to determine whether a user is truly engaging with the object, whether it is operating in a sandbox, or whether someone is attempting to monitor the object’s operations. These are only a few of the techniques used to avoid being identified as a danger by modern protection systems.
Now let’s establish the scope of our investigation. We want to determine whether the object uses obfuscation, contacts external servers, or attempts to inject code into other processes.
To get a quick first response—ideally within minutes—we will just employ static analysis before deciding whether to go on to dynamic analysis or even disassembly techniques.
The toolkit will vary depending on the situation. Select your tools, study their options in detail, and use them to analyze the object in an isolated environment, normally a virtual machine without a permanent network connection. With the exception of examining certain objects, such as .NET, the static portion may be carried out on any operating system. We prefer to use REMnux, a specialized Linux distribution that includes most of our favorite tools. On Windows, we have the corresponding version of most of the tools ready to use, and as Mark Russinovich likes to say: “When in doubt, run procmon!”
The following object sample_01.exe was used for the experiments. It is advised that you visit the URL after reading the article, if you are not familiar with MalwareBazaar (or other websites of a similar nature), in order to better appreciate the significance of the data presented.
    remnux@remnux:~/Desktop/malware/Injecting$ 7z e 24797d733e9fdd39bcf5b3910438bb449c84428082325f27652330019be683b3.zip
    7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
    p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,2 CPUs Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz (806EA),ASM,AES-NI)
    Scanning the drive for archives:
    1 file, 204544 bytes (200 KiB)
    Extracting archive: 24797d733e9fdd39bcf5b3910438bb449c84428082325f27652330019be683b3.zip
    --
    Path = 24797d733e9fdd39bcf5b3910438bb449c84428082325f27652330019be683b3.zip
    Type = zip
    Physical Size = 204544
    Enter password (will not be echoed): >infected<
    Everything is Ok
    Size: 208384
    Compressed: 204544
    remnux@remnux:~/Desktop/malware/Injecting$ mv 24797d733e9fdd39bcf5b3910438bb449c84428082325f27652330019be683b3.exe sample_01.exe
Understanding what we have in front of us is the first step. We tend to believe that the exe extension is connected to Windows; however, there are actually multiple executable formats that use the exe extension. We can use different tools, but the simplest are file or TrID:
    remnux@remnux:~/Desktop/malware/Injecting$ file sample_01.exe
    sample_01.exe: PE32 executable (GUI) Intel 80386, for MS Windows, UPX compressed
    remnux@remnux:~/Desktop/malware/Injecting$ trid sample_01.exe
    TrID/32 - File Identifier v2.24 - (C) 2003-16 By M.Pontello
    Definitions found: 14978
    Analyzing...
    Collecting data from file: sample_01.exe
    63.4% (.EXE) UPX compressed Win32 Executable (27066/9/6)
    11.8% (.EXE) Win16 NE executable (generic) (5038/12/1)
    10.5% (.EXE) Win32 Executable (generic) (4505/5/1)
     4.7% (.EXE) OS/2 Executable (generic) (2029/13)
     4.6% (.EXE) Generic Win/DOS Executable (2002/3)
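To make concrete what file is reporting here, the following minimal Python sketch reads the same header fields by hand. It is an illustration under stated assumptions, not how file or TrID are implemented: the function name describe_executable and the small machine-type table are hypothetical, and treating section names that start with UPX as a packing hint is a simplification.

```python
import struct

# Subset of machine values from the PE/COFF header (illustrative, not complete).
MACHINES = {0x014C: "Intel 80386", 0x8664: "x86-64", 0xAA64: "ARM64"}


def describe_executable(data: bytes) -> dict:
    """Return a rough, file-like description of a byte string (hypothetical helper)."""
    info = {"format": "unknown", "machine": None, "sections": [], "upx_hint": False}
    # A DOS/PE file starts with "MZ"; offset 0x3C of the DOS header stores
    # the offset of the PE header (e_lfanew).
    if len(data) < 0x40 or data[:2] != b"MZ":
        return info
    info["format"] = "MZ (DOS) executable"
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00" or len(data) < pe_offset + 24:
        return info
    # COFF header (20 bytes, right after the signature): machine and number of
    # sections come first, the optional-header size sits at byte 16.
    machine, n_sections = struct.unpack_from("<HH", data, pe_offset + 4)
    (opt_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    info["format"] = "PE executable"
    info["machine"] = MACHINES.get(machine, hex(machine))
    # The section table follows the optional header; each entry is 40 bytes
    # and begins with an 8-byte, zero-padded section name.
    table = pe_offset + 24 + opt_size
    for i in range(n_sections):
        entry = table + 40 * i
        if entry + 8 > len(data):
            break
        name = data[entry:entry + 8].rstrip(b"\x00")
        info["sections"].append(name.decode("ascii", "replace"))
    # Assumption: UPX-packed files usually keep sections named UPX0, UPX1, ...
    info["upx_hint"] = any(s.startswith("UPX") for s in info["sections"])
    return info
```

Called on the bytes of a file, for example describe_executable(open("sample_01.exe", "rb").read()), a helper like this returns the format, the CPU type, and the section names, which is roughly the information behind the first line of the file output above.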
The object is a Portable Executable for MS Windows, and we can immediately note a detail: it contains compressed code. The compression approach, in this case, is well known (UPX), and we can simply unpack the compressed part:
    remnux@remnux:~/Desktop/malware/Injecting$ upx -d sample_01.exe -o sample_01.exe_upx_unpacked
    Ultimate Packer for eXecutables
    Copyright (C) 1996 - 2020
    UPX 3.96 Markus Oberhumer, Laszlo Molnar & John Reiser Jan 23rd 2020
    File size Ratio Format Name
    -------------------- ------ ----------- -----------
    upx: sample_01.exe: FileAlreadyExistsException: sample_01.exe_upx_unpacked: File exists
    Unpacked 0 files.
    remnux@remnux:~/Desktop/malware/Injecting$ file sample_01.exe_upx_unpacked
    sample_01.exe_upx_unpacked: PE32 executable (GUI) Intel 80386, for MS Windows
Once decompressed, we may examine the content:
    remnux@remnux:~/Desktop/malware/Injecting$ binwalk -B sample_01.exe_upx_unpacked
    DECIMAL       HEXADECIMAL     DESCRIPTION
    --------------------------------------------------------------------------------
    0             0x0             Microsoft executable, portable (PE)
    146968        0x23E18         CRC32 polynomial table, little endian
    151840        0x25120         Microsoft executable, portable (PE)
    462541        0x70ECD         mcrypt 2.5 encrypted data, algorithm: "sProcessorFeaturePresent", keysize: 1069 bytes, mode: "Q",
    463648        0x71320         CRC32 polynomial table, little endian
    470260        0x72CF4         XML document, version: "1.0"
    483616        0x76120         Microsoft executable, portable (PE)
    562449        0x89511         mcrypt 2.5 encrypted data, algorithm: "sProcessorFeaturePresent", keysize: 1069 bytes, mode: "Q",
    566944        0x8A6A0         XML document, version: "1.0"
    572512        0x8BC60         XML document, version: "1.0"
As we already know, it is a Microsoft executable, portable (PE), yet this string appears three times! The object most likely contains further executables. If required, the different executables can be extracted and analyzed independently (there are two DLL files):
    remnux@remnux:~/Desktop/malware/Injecting$ ll
    total 980
    drwxrwxr-x 2 remnux remnux 4096 Aug 29 03:28 ./
    drwxrwxr-x 8 remnux remnux 4096 Aug 27 05:22 ../
    -rw-rw-r-- 1 remnux remnux 204544 Aug 27 05:20 24797d733e9fdd39bcf5b3910438bb449c84428082325f27652330019be683b3.zip
    -rw-r--r-- 1 remnux remnux 208384 Aug 27 09:20 sample_01.exe
    -rw-r--r-- 1 remnux remnux 579072 Aug 27 09:20 sample_01.exe_upx_unpacked
    remnux@remnux:~/Desktop/malware/Injecting$ pecheck -lP -go -D sample_01.exe_upx_unpacked
    1: 0x00000000 EXE 32-bit 0x0008d5ff ae78a0d8bee1c8eb4d78e60edd1cc607 0x0008d5ff (EOF) b'' b''
    2: 0x00025120 DLL 32-bit 0x0007611f 18c6220175e258be5af701ee26a57c72 0x0008d5ff (EOF) b'' b''
    3: 0x00076120 DLL 32-bit 0x0008bb1f e026b2666d2ae5583a934b0f9d4b5d03 0x0008d5ff (EOF) b'' b'ReflectiveLoader32.dll'
    remnux@remnux:~/Desktop/malware/Injecting$ ll
    total 1960
    drwxrwxr-x 2 remnux remnux 4096 Aug 29 03:29 ./
    drwxrwxr-x 8 remnux remnux 4096 Aug 27 05:22 ../
    -rw-rw-r-- 1 remnux remnux 331776 Aug 29 03:29 18c6220175e258be5af701ee26a57c72.vir
    -rw-rw-r-- 1 remnux remnux 204544 Aug 27 05:20 24797d733e9fdd39bcf5b3910438bb449c84428082325f27652330019be683b3.zip
    -rw-rw-r-- 1 remnux remnux 579072 Aug 29 03:29 ae78a0d8bee1c8eb4d78e60edd1cc607.vir
    -rw-rw-r-- 1 remnux remnux 88576 Aug 29 03:29 e026b2666d2ae5583a934b0f9d4b5d03.vir
    -rw-r--r-- 1 remnux remnux 208384 Aug 27 09:20 sample_01.exe
    -rw-r--r-- 1 remnux remnux 579072 Aug 27 09:20 sample_01.exe_upx_unpacked
First answer: The object uses obfuscation techniques, even if not very complex ones.
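The search for embedded executables can be approximated with a short Python sketch. It only illustrates the idea and is not how binwalk or pecheck work internally; the function name find_embedded_pe is hypothetical, and the only validation performed is that an MZ marker is followed, at the offset stored in its DOS header, by a PE signature.

```python
import struct


def find_embedded_pe(data: bytes) -> list[int]:
    """Return offsets at which a plausible PE image starts (offset 0 included)."""
    offsets = []
    pos = data.find(b"MZ")
    while pos != -1:
        # A candidate needs a complete 64-byte DOS header.
        if pos + 0x40 <= len(data):
            # e_lfanew (offset 0x3C) is relative to the start of the image
            # and should point at the "PE\0\0" signature.
            (e_lfanew,) = struct.unpack_from("<I", data, pos + 0x3C)
            sig = pos + e_lfanew
            if data[sig:sig + 4] == b"PE\x00\x00":
                offsets.append(pos)
        # Random "MZ" byte pairs are common, so keep scanning byte by byte.
        pos = data.find(b"MZ", pos + 1)
    return offsets
```

Run against the unpacked sample, a scan of this kind should report offsets comparable to the three PE entries listed by binwalk and pecheck above (0x0, 0x25120, and 0x76120); carving and hashing the embedded files is best left to the dedicated tools.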
Let’s now look for some network-related indicators, such as IP addresses or FQDNs. We simply search for static strings in the code using strings, which quickly yields a result: malaji.top, which a brief web search identifies as a domain used to host command and control tools.
    remnux@remnux:~/Desktop/malware/Injecting$ strings -7 --encoding=l sample_01.exe_upx_unpacked | grep -E -o "([0-9]{1,3}[\.]){3}[0-9]{1,3}"
    0.0.0.0
    1.0.0.1
    1.0.0.1
    remnux@remnux:~/Desktop/malware/Injecting$ strings -7 --encoding=l sample_01.exe_upx_unpacked | grep -E -o "\w+\.\w+\.\w+"
    poilcy.itosha.top
    reserve.itosha.top
    poilcy.malaji.top
    reserve.malaji.top
    www.itosha.top
    site.itosha.top
    www.malaji.top
    q5y2qclsk18.malaji.top
    0.0.0
    1.0.0
    1.0.0
Second answer: It is very likely that external servers related to malware are contacted.
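The strings and grep pipeline used for this step can be sketched in Python as follows. This is a rough equivalent, not a reimplementation of strings: the function names are hypothetical, only printable ASCII characters stored as 16-bit little-endian text are considered (as with --encoding=l), and the domain pattern is our own assumption, slightly stricter than the grep expression so that fragments such as 0.0.0 are not reported as host names.

```python
import re

# Assumed patterns: dotted-quad IPv4 addresses and host names ending in an
# alphabetic top-level domain.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
DOMAIN = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b", re.IGNORECASE)


def utf16le_strings(data: bytes, min_len: int = 7) -> list[str]:
    """Extract runs of printable ASCII characters stored as UTF-16LE."""
    # One character = a printable byte followed by a zero byte.
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]


def network_indicators(strings: list[str]) -> tuple[list[str], list[str]]:
    """Return sorted, de-duplicated IP addresses and host names."""
    ips, domains = set(), set()
    for s in strings:
        ips.update(IPV4.findall(s))
        domains.update(DOMAIN.findall(s))
    return sorted(ips), sorted(domains)
```

A pattern this loose also matches file names such as kernel32.dll, so the resulting list still has to be reviewed by hand, exactly as with the grep output above.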
We may also search for references to certain functions using strings. Everyone develops their own list of suspicious functions as they gain experience; in this case, we are interested in functions that inject code, and one of the most commonly used is CreateRemoteThread.
    remnux@remnux:~/Desktop/malware/Injecting$ strings sample_01.exe_upx_unpacked | egrep -Ha "Create|Debug"
    (standard input):CreateThread
    (standard input):CreateProcessW
    (standard input):CreateFileW
    (standard input):OutputDebugStringA
    -> (standard input):CreateRemoteThread
    (standard input):CreateFileMappingW
    (standard input):CreateEventW
    (standard input):IsDebuggerPresent
    (standard input):CTcpServer::CreateListenSocket
    (standard input):CreateFileTransactedW
    (standard input):WSACreateEvent
    (standard input):CreateEventW
    (standard input):CreateMutexW
    (standard input):CreateThread
    (standard input):CreateFileW
    (standard input):CreateFileMappingW
    (standard input):OutputDebugStringA
    (standard input):HeapCreate
    (standard input):CreateIoCompletionPort
    (standard input):CreateSemaphoreW
    (standard input):IsDebuggerPresent
    (standard input):OutputDebugStringW
    (standard input):CreateThread
    (standard input):IsDebuggerPresent
    (standard input):CreateFileW
Third answer: The object is quite likely to attempt to inject code into another process.
All of this evidence leads us to believe that the object employs several tactics commonly used in malicious programming, and we may proceed to a more in-depth analysis.
It didn’t take us long, with a little effort, to classify our item as potentially malicious. However, we still needed to employ several tools and filter the quantity of information that each tool returned to us. There is nothing miraculous about what we have done; these are straightforward patterns that are normally used.
The objective of capa, a capability detection program for executable files, is to automate these search operations. Simply run it against an executable and it will tell you what it believes the program is capable of.
    remnux@remnux:~/Desktop/malware/Injecting$ capa sample_01.exe
    loading : 100%|████████████████████| 661/661 [00:00<00:00, 2281.69 rules/s]
    WARNING:capa:--------------------------------------------------------------------------------
    WARNING:capa: This sample appears to be packed.
    WARNING:capa:
    WARNING:capa: Packed samples have often been obfuscated to hide their logic.
    WARNING:capa: capa cannot handle obfuscation well. This means the results may be misleading or incomplete.
    WARNING:capa: If possible, you should try to unpack this input file before analyzing it with capa.
    WARNING:capa:
    WARNING:capa: Identified via rule: (internal) packer file limitation
    WARNING:capa:
    WARNING:capa: Use -v or -vv if you really want to see the capabilities identified by capa.
    WARNING:capa:--------------------------------------------------------------------------------
After the suggested decompression:
    remnux@remnux:~/Desktop/malware/Injecting$ capa sample_01.exe_upx_unpacked
    loading : 100%|████████████████████| 661/661 [00:00<00:00, 2189.72 rules/s]
    matching: 100%|████████████████████| 749/749 [00:05<00:00, 144.79 functions/s, skipped 443 library functions (59%)]
    +------------------------+------------------------------------------------------------------------------------+
    | md5 | ae78a0d8bee1c8eb4d78e60edd1cc607 |
    | sha1 | 89ae94027b21be1e58ad87794102cda58e48c369 |
    | sha256 | 6a0fee390b022dabe88985bdeca5a46b55ae647ccb646288fcae3a6017eecbe4 |
    | os | windows |
    | format | pe |
    | arch | i386 |
    | path | sample_01.exe_upx_unpacked |
    +------------------------+------------------------------------------------------------------------------------+
    +------------------------+------------------------------------------------------------------------------------+
    | ATT&CK Tactic | ATT&CK Technique |
    |------------------------+------------------------------------------------------------------------------------|
    | DEFENSE EVASION | Indicator Removal on Host::File Deletion T1070.004 |
    | | Obfuscated Files or Information:: T1027 |
    | | Process Injection::Thread Execution Hijacking T1055.003 |
    | | Reflective Code Loading:: T1620 |
    | DISCOVERY | File and Directory Discovery:: T1083 |
    | | System Information Discovery:: T1082 |
    | EXECUTION | Shared Modules:: T1129 |
    | PRIVILEGE ESCALATION | Access Token Manipulation:: T1134 |
    +------------------------+------------------------------------------------------------------------------------+
    +-----------------------------+-------------------------------------------------------------------------------+
    | MBC Objective | MBC Behavior |
    |-----------------------------+-------------------------------------------------------------------------------|
    | DATA | Encode Data::XOR [C0026.002] |
    | DEFENSE EVASION | Obfuscated Files or Information::Encoding-Standard Algorithm [E1027.m02] |
    | | Self Deletion::COMSPEC Environment Variable [F0007.001] |
    | EXECUTION | Install Additional Program:: [B0023] |
    | FILE SYSTEM | Delete File:: [C0047] |
    | | Get File Attributes:: [C0049] |
    | | Read File:: [C0051] |
    | | Writes File:: [C0052] |
    | MEMORY | Allocate Memory:: [C0007] |
    | OPERATING SYSTEM | Environment Variable::Get Variable [C0034.002] |
    | PROCESS | Check Mutex:: [C0043] |
    | | Create Process:: [C0017] |
    | | Create Process::Create Suspended Process [C0017.003] |
    | | Create Thread:: [C0038] |
    | | Resume Thread:: [C0054] |
    +-----------------------------+-------------------------------------------------------------------------------+
    +------------------------------------------------------+------------------------------------------------------+
    | CAPABILITY | NAMESPACE |
    |------------------------------------------------------+------------------------------------------------------|
    | self delete | anti-analysis/anti-forensic/self-deletion |
    | get MAC address on Windows | collection/network |
    | encode data using XOR | data-manipulation/encoding/xor |
    | contains PDB path | executable/pe/pdb |
    | contain a resource (.rsrc) section | executable/pe/section/rsrc |
    | extract resource via kernel32 functions | executable/resource |
    | contain an embedded PE file | executable/subfile/pe |
    | get common file path (2 matches) | host-interaction/file-system |
    | delete file | host-interaction/file-system/delete |
    | check if file exists | host-interaction/file-system/exists |
    | enumerate files via kernel32 functions | host-interaction/file-system/files/list |
    | get file attributes | host-interaction/file-system/meta |
    | read file via mapping | host-interaction/file-system/read |
    | write file on Windows (3 matches) | host-interaction/file-system/write |
    | get disk information | host-interaction/hardware/storage |
    | print debug messages | host-interaction/log/debug/write-event |
    | check mutex | host-interaction/mutex |
    | get system information on Windows (3 matches) | host-interaction/os/info |
    | check OS version | host-interaction/os/version |
    | create process on Windows (2 matches) | host-interaction/process/create |
    | create process suspended | host-interaction/process/create |
    | inject thread | host-interaction/process/inject |
    | acquire debug privileges | host-interaction/process/modify |
    | modify access privileges | host-interaction/process/modify |
    | resume thread | host-interaction/thread/resume |
    | link function at runtime on Windows (5 matches) | linking/runtime-linking |
    | parse PE header (2 matches) | load-code/pe |
    | spawn thread to RWX shellcode | load-code/shellcode |
    +------------------------------------------------------+------------------------------------------------------+
Looking at the second table (ATT&CK Tactic), you can read the known techniques that are used by this object. Note that no network activity is reported; such activity is usually observed during a dynamic analysis.
capa executes hundreds of checks that would normally be done by hand and puts the results into summary tables. It does this by building on the work of many experts who have looked at thousands of objects and encoded, as detection rules, the clues that let us identify patterns or strings in the object. By running capa, we can find out in a few seconds whether we need to look at the object more closely or whether it can be considered clean.
However, deciphering the output of tools like capa requires some reverse engineering skill. We therefore recommend a few tools and search patterns with which to begin exploring manually what capa performs automatically.
The tools are important, but they are not the most important part of the whole process. It’s better to focus on a few tools and learn how to use them well than to try to figure out how to use all of them, especially since many of them have overlapping functions. In any case, the following are worth taking a look at:
Intent | Tool | Comment |
---|---|---|
Object categorization | file, TrID | These tools recognize different file types based on their binary signatures. The utilities differ in what they detect, so pick the one that best fits the circumstances or simply use both. |
Search | strings, bbcrack, floss | bbcrack (Balbucrack) is a tool that cracks typical malware obfuscation such as XOR, ROL, and ADD (and many combinations) and checks for specific patterns (IP addresses, domain names, etc.). The FLARE Obfuscated String Solver (FLOSS) uses advanced static analysis techniques to automatically deobfuscate strings from malware binaries. |
Analysis | pecheck, capa | Tools used to search for well-known malicious patterns. |
Extraction | pecheck | This tool looks inside a PE executable for embedded items such as DLLs, EXEs, and overlays. |
Common checks:
Regarding objects intended for Microsoft Windows, we can start by searching for specific functions:
Risk | Function |
---|---|
Keylogging | GetAsyncKeyState, SetWindowsHookEx |
Injection | CreateRemoteThread, WriteProcessMemory, VirtualAllocEx |
Defense | GetTickCount, GetCursorPos, GetForegroundWindow |
Execution | WinExec, ShellExecute, CreateProcess |
Data Manipulation | GetClipboardData, GetWindowText |
Network Interaction | InternetOpen, HttpOpenRequest, HttpSendRequest, InternetReadFile |
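The table above can be turned into a simple lookup, sketched here in Python. This is a minimal illustration, not a detection engine: the names RISK_FUNCTIONS and flag_risks are hypothetical, matching is a plain substring search over the raw bytes (so CreateProcess also matches CreateProcessW), and a hit only means that the name occurs in the file, not that the function is ever called.

```python
# Risk categories and function names taken from the table above
# (GetCursorPos is the Win32 name of the cursor-position call).
RISK_FUNCTIONS = {
    "Keylogging": ["GetAsyncKeyState", "SetWindowsHookEx"],
    "Injection": ["CreateRemoteThread", "WriteProcessMemory", "VirtualAllocEx"],
    "Defense": ["GetTickCount", "GetCursorPos", "GetForegroundWindow"],
    "Execution": ["WinExec", "ShellExecute", "CreateProcess"],
    "Data Manipulation": ["GetClipboardData", "GetWindowText"],
    "Network Interaction": [
        "InternetOpen",
        "HttpOpenRequest",
        "HttpSendRequest",
        "InternetReadFile",
    ],
}


def flag_risks(data: bytes) -> dict[str, list[str]]:
    """Map each risk category to the listed function names found in data."""
    hits = {}
    for risk, names in RISK_FUNCTIONS.items():
        # Import names are stored as ASCII, so a byte-level search is enough
        # for a first pass over an unpacked file.
        found = [name for name in names if name.encode("ascii") in data]
        if found:
            hits[risk] = found
    return hits
```

Such a lookup should be run on the unpacked file; on a packed sample most of these names are hidden, which is the same limitation capa warns about.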
We’ve demonstrated how, by concentrating on important details and employing a few straightforward methods, we can swiftly decide whether a suspicious object needs more attention. By choosing a few tools and defining what to look for, we can get very good results from a static analysis of the suspect object alone. Tools such as capa make use of the experience of professionals who have codified their research patterns and made them public. In any case, to comprehend the information that these tools make accessible to us, one needs knowledge of how systems work as well as programming skills. The tools mentioned in this article are a good place to start; the rest will come with experience.