When analyzing malware, one often has to deal with a lot of tricks and obfuscation techniques. In this post we will look at several obfuscation and anti-analysis techniques used by the malware Trickbot, based on the sample 8F590AC32A7C7C0DDFBFA7A70E33EC0EE6EB8D88846DEFBDA6144FADCC23663A from mid-December 2018.
After analyzing and understanding the obfuscation techniques, we will take care of deobfuscating the malware with IDA Python in order to make the code easier to analyze in Hexrays’ decompiler.
With a malware as widespread and publicly known as Trickbot, there is already a lot of research. Some overlap with this article can be found in the work of Michał Praszmo at https://www.cert.pl/en/news/single/detricking-trickbot-loader/, where some of the obfuscation features are touched on. A similar, but more in-depth analysis from Hasherezade can be found at https://blog.malwarebytes.com/threat-analysis/malware-threat-analysis/2018/11/whats-new-trickbot-deobfuscating-elements/ in conjunction with the tutorial on https://www.youtube.com/watch?v=KMcSAlS9zGE. Vitali Kremez also explained the string obfuscation of a Trickbot sample at https://www.vkremez.com/2018/07/lets-learn-trickbot-new-tor-plugin.html.
Obfuscated Import Address Table
If you put the unpacked binary in IDA, you can see that Trickbot has several imported functions:
Yet, the first line of the decompiled wWinMain() shows lots of function calls relative to the address stored at dword_42A648.
Looking at the x-refs of this address, we can find out in which context it is written to:
Decompiling the function sub_402D30() shows that dword_42A648 points to a buffer of 0x208 bytes (or 130 DWORDs). The buffer is modified in the same function with a call to sub_40C8C0().
Note that stru_42A058 holds a pointer to a structure which we will get to know in the following function, as it is an argument for the function call to sub_40C8C0(). This call is made eight times in a loop, as you can see in lines 21 to 27.
Within sub_40C8C0() Hexrays’ decompiler shows the following picture:
We can see the following things:
- The argument called “hModule” by IDA is a pointer to a structure. Its first DWORD contains a hint to a string used in LoadLibraryW() in lines 12 and 13.
- The second DWORD of the structure is used in lines 14, 16, 18 and 21 and points to a list of hints to function names used in GetProcAddress().
- The third DWORD of the structure marks the end of the list referenced by the second DWORD and is used in the for-loop in line 16.
- The fourth DWORD of the structure is used in lines 15, 16 and 20 and points to a list of offsets. Each offset is added to the base of our previously allocated 0x208 byte array to calculate the address where an imported function is stored.
Putting it all together, our structure is defined as follows:
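Based on the four DWORDs described above, the layout can be sketched with Python's ctypes. This is only a minimal sketch: the field names are my own choice and not taken from the binary, the pointers are modeled as plain 32-bit values, and the sample values at the end are made up.

```python
import ctypes
import struct


class IATstruct(ctypes.LittleEndianStructure):
    """One descriptor passed to sub_40C8C0(). Field names are hypothetical."""
    _fields_ = [
        # 1st DWORD: hint to the (encrypted) DLL name used for LoadLibraryW()
        ("dll_name_hint", ctypes.c_uint32),
        # 2nd DWORD: start of the list of hints to function names
        ("func_hints_begin", ctypes.c_uint32),
        # 3rd DWORD: end marker of that list
        ("func_hints_end", ctypes.c_uint32),
        # 4th DWORD: pointer to the list of offsets into the heap IAT
        ("iat_offsets", ctypes.c_uint32),
    ]


# sub_40C8C0() is called eight times, so we expect eight descriptors.
IATstructTable = IATstruct * 8

# Made-up values, only to show how raw bytes map onto the fields.
raw = struct.pack("<4I", 0x10, 0x42B000, 0x42B010, 0x42B100)
entry = IATstruct.from_buffer_copy(raw)
print(hex(entry.func_hints_end - entry.func_hints_begin))
```

Each descriptor is 16 bytes, so the whole table of eight descriptors occupies 128 bytes.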
Now our function looks much nicer:
It is now obvious that our mysterious buffer of 0x208 bytes is actually an IAT which is stored on the heap. The pointer to the IAT is located at dword_42A648 and the 383 x-refs to this address which we saw in the beginning are mostly calls to this IAT.
Decrypt All Strings
Now the question remains what the functions sub_407110() and sub_405210() do to yield library and function names. Disassembling them shows that both call sub_40E970(). Only the first one, sub_407110(), makes an additional call afterwards, but that call only transforms a string into a wide string:
So the actual magic happens in sub_40E970():
We see a single call to sub_404080(). But most important is the first function argument, which adds a1 to the base address of the label which IDA called “Src”. Looking at Src, we can see it is a table with offsets to some scrambled strings:
So the argument a1 is simply an offset to the table pointed to by Src and decides which of the strings is provided to sub_404080().
When looking at sub_404080(), we can see a function which has over 100 lines of disassembled code. I just chose the most relevant part to display in a screenshot:
Without going too much into details, you can see that from line 44 to 63 a substitution takes place based on the first function argument (copied to “Dst”) and the string pointer named off_42A050 by IDA in line 44. The string looks like this:
From line 64 to 69 the previously substituted bytes are then mangled by some bit operations, where four input bytes are mapped to three output bytes. According to the blog post by Vitali Kremez mentioned above, this was once a base64 algorithm with a custom alphabet. It is still similar to that, but it seems to have been extended by the bit manipulation operations.
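To illustrate the general idea of a base64 variant with a custom alphabet, here is a minimal Python 3 sketch. It is not the decoder for this sample: the alphabet below is a hypothetical one (the standard alphabet rotated by 13 positions) and not the string at off_42A050, and the extra bit manipulation mentioned above is not modeled. The function names are my own.

```python
import base64

STANDARD = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
# Hypothetical alphabet for illustration only, NOT the one used by Trickbot.
CUSTOM = STANDARD[13:] + STANDARD[:13]

_TO_STANDARD = bytes.maketrans(CUSTOM, STANDARD)
_TO_CUSTOM = bytes.maketrans(STANDARD, CUSTOM)


def decode_custom_b64(scrambled):
    """Substitute each symbol back to the standard alphabet, then decode.

    The decode step maps four input symbols to three output bytes.
    """
    translated = scrambled.translate(_TO_STANDARD)
    # Restore the padding that the encoder below strips.
    translated += b"=" * (-len(translated) % 4)
    return base64.b64decode(translated)


def encode_custom_b64(plain):
    """Inverse of decode_custom_b64(), only used to build test input."""
    return base64.b64encode(plain).rstrip(b"=").translate(_TO_CUSTOM)


scrambled = encode_custom_b64(b"GetProcAddress")
print(scrambled, decode_custom_b64(scrambled))
```

Separating the substitution step from the actual decoding keeps the alphabet easy to swap once the real table has been extracted from the binary.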
Putting it all together, we now know that each string of the IAT is scrambled by a substitution cipher and a bit manipulation algorithm. The function arguments provided in the functions sub_407110() and sub_405210() from the IAT algorithm described previously are offsets to string pointers to the scrambled strings stored at 0x00427C1C, called “Src” by IDA.
We also know that sub_407110() returns a wide string, while sub_405210() returns an ANSI string.
When cross referencing those two functions, we can see 159 and 52 calls to them:
Looking at the calls, we can see that the argument which describes the string offset, in our case 73h, is pushed on the stack as the second function argument. The pointer to the output string is the first argument:
Looking a bit further, we can find a third function sub_4019F0(), which calls sub_40E970() for decrypting strings. Again, the argument is provided via a push of a constant number.
So we can write a simple IDA Python script to decrypt all strings and print them. The algorithm is quite simple:
- Manually identify all three functions which call sub_40E970()
- For each xref to one of those three functions:
  - Disassemble backwards until we find the first push which is a number
  - Add the base address of the crypted string table to find the referenced string
  - Decrypt the string based on the reversed algorithm
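The backwards search for the pushed constant can be sketched without IDA by modeling the disassembly as a list of (mnemonic, operand) tuples. This is an assumption made for illustration: in IDA Python the same information would come from calls such as idautils.CodeRefsTo(), idc.PrevHead(), idc.GetMnem() and idc.GetOpType() (IDA 6.x names). The function name, the listing and the search limit are hypothetical.

```python
def find_pushed_constant(instructions, call_index, max_back=10):
    """Return the immediate of the nearest `push <number>` before a call.

    `instructions` is a list of (mnemonic, operand) tuples. Immediate
    operands are modeled as int, everything else as str.
    """
    stop = max(call_index - 1 - max_back, -1)
    for i in range(call_index - 1, stop, -1):
        mnemonic, operand = instructions[i]
        if mnemonic == "push" and isinstance(operand, int):
            return operand
    return None  # no constant found within the search window


# Hypothetical listing modeled on the call described above.
listing = [
    ("push", 0x73),              # second argument: string offset (immediate)
    ("lea", "ecx, [esp+18h]"),
    ("push", "ecx"),             # first argument: output buffer (register)
    ("call", "sub_407110"),
]

print(hex(find_pushed_constant(listing, 3)))
```

Pushes of registers are skipped on purpose, because the output pointer is pushed after the constant and would otherwise be picked up first.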
The output looks like this (note that line breaks are not encoded, but do actually break the lines):
We can also adapt our algorithm to print us the import address table, since we know the structure used in sub_40C8C0() to build the IAT:
- Take the pointer at stru_42A058
- Convert the values stored there into an array of the structure “IATstruct” (described previously) with eight array elements
- For each of those eight elements:
  - Decrypt the first DWORD as DLL name
  - Iterate from the second to the third DWORD and decrypt the entries to get all imported functions
  - Take the fourth DWORD as an offset where the function is placed in our IAT on the heap
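The steps above can be sketched in plain Python on a byte buffer instead of an IDA database. Several things are assumptions here: the hints are modeled as DWORDs, the third DWORD is treated as an end pointer, the offset list is assumed to run in parallel to the hint list, and the decryption is passed in as a callback. The memory image and the names in the demo are made up.

```python
import struct


def read_dwords(memory, base, address, count):
    """Read `count` little-endian DWORDs at a virtual address."""
    return list(struct.unpack_from("<%dI" % count, memory, address - base))


def dump_iat(memory, base, table_address, decrypt, entries=8):
    """Return {iat_offset: (dll_name, function_name)}."""
    iat = {}
    for n in range(entries):
        dll_hint, hints_begin, hints_end, offsets = read_dwords(
            memory, base, table_address + 16 * n, 4)
        dll_name = decrypt(dll_hint)
        count = (hints_end - hints_begin) // 4
        hints = read_dwords(memory, base, hints_begin, count)
        slots = read_dwords(memory, base, offsets, count)
        for hint, slot in zip(hints, slots):
            iat[slot] = (dll_name, decrypt(hint))
    return iat


# Made-up memory image with a single descriptor, for demonstration only.
BASE = 0x1000
memory = (struct.pack("<4I", 5, 0x1010, 0x1018, 0x1018)  # descriptor
          + struct.pack("<2I", 7, 9)                      # function hints
          + struct.pack("<2I", 0x4, 0x8))                 # IAT offsets
names = {5: "kernel32.dll", 7: "CreateThread", 9: "Sleep"}

iat = dump_iat(memory, BASE, 0x1000, names.get, entries=1)
print(iat)
```

Keying the dictionary by IAT offset makes it directly usable later, when a call through the IAT only tells us the offset.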
Printing the IAT as a dictionary looks like this:
Setting Comments in the Decompilation
One thing that always bugged me is that it is trivial to add comments in the disassembly in IDA. But since I use the decompiler a lot, I wanted to add my decrypted strings as comments in the decompiler view.
After rather unsatisfying Google searches, I spent hours in IDA’s API documentation, read a bunch of existing IDA plugins to look for hints and tried out a lot. Turns out my IDA 6.9 is very crashy when working with IDA Python and also the documentation is not always as helpful as one would like it to be.
But I finally succeeded with a lot of trial and error and a little bit of brute forcing:
First you need to translate the address of the disassembly to the function line of the decompiled code. Then by using a ctree object, you can place a comment there. Unfortunately, the ctree object needs to have the correct “item preciser” (ITP). An ITP specifies if a comment is e.g. placed in a line of an “else”, “do”, “opening curly brace”, and so on.
If you set the incorrect ITP to your ctree object, your comment is “orphaned” and won’t be placed correctly.
I still do not understand how to determine which ITP to use, so I developed a little brute force algorithm:
- Delete all orphaned comments from current function
- For each possible ITP:
  - Set comment with current ITP
  - If no orphaned comment exists, break loop
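The brute force loop can be sketched as follows. The method names set_user_cmt(), save_user_cmts(), has_orphan_cmts() and del_orphan_cmts() are, to my knowledge, methods of the decompiler's cfunc_t object, and in IDA the location object would be an idaapi.treeloc_t with the idaapi.ITP_* constants as candidates. The two Fake classes are hypothetical stand-ins that only exist so the sketch runs outside of IDA.

```python
def place_comment(cfunc, tl, comment, itp_candidates):
    """Try every ITP until the comment is no longer orphaned."""
    cfunc.del_orphan_cmts()              # start from a clean state
    for itp in itp_candidates:
        tl.itp = itp
        cfunc.set_user_cmt(tl, comment)
        cfunc.save_user_cmts()
        # Assumption: the pseudocode must be re-rendered before the
        # orphan check reflects the new comment.
        str(cfunc)
        if not cfunc.has_orphan_cmts():
            return itp                   # comment was placed correctly
        cfunc.del_orphan_cmts()          # wrong ITP, remove and retry
    return None


class FakeTreeLoc(object):
    """Stand-in for a tree location with an address and an ITP."""
    ea = 0
    itp = 0


class FakeCfunc(object):
    """Stand-in for a decompiled function that accepts exactly one ITP."""

    def __init__(self, accepted_itp):
        self.accepted_itp = accepted_itp
        self.comments = {}

    def set_user_cmt(self, tl, text):
        self.comments[(tl.ea, tl.itp)] = text

    def save_user_cmts(self):
        pass

    def has_orphan_cmts(self):
        return any(itp != self.accepted_itp for (_, itp) in self.comments)

    def del_orphan_cmts(self):
        self.comments = dict((key, text) for key, text in self.comments.items()
                             if key[1] == self.accepted_itp)

    def __str__(self):
        return "<pseudocode>"


tl = FakeTreeLoc()
tl.ea = 0x401000
fake = FakeCfunc(accepted_itp=3)
chosen = place_comment(fake, tl, "kernel32.dll", range(10))
print(chosen, fake.comments)
```

Deleting the orphaned comment after each failed attempt matters, because otherwise a leftover from an earlier ITP would make every later check fail as well.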
This algorithm is rather stupid. But after spending too much time on this issue, I was finally happy to have something that works.
The result looks like this:
Setting Function Information
Being able to decrypt all strings and set them as comments in the decompiled code helps a lot when reversing the binary. What is missing is a properly usable IAT. We already know that the IAT is constructed during runtime on the heap.
Function calls to the IAT look like this:
The first two lines of the decompilation look as follows in the disassembly:
You can see in the first line that dword_42A648 is copied to eax. Eight lines later the offset 0xBC is added, and finally a call to register ecx executes the WinAPI call. The last five lines show a second WinAPI call in a simpler fashion, with only one function argument.
The mov instructions in lines five to eight are irrelevant for the function call, but the compiler decided to put them between the three pushes for the function arguments of the first call anyway.
The idea of how to fix this is quite simple. Yet, the implementation of the idea turned out to be way more complex:
We write a lightweight implementation of a taint tracker and track the usage of dword_42A648, which holds a pointer to the IAT, to find all WinAPI calls. For each call, our taint tracking provides us with the offset within the IAT, so we know which WinAPI is called. In our previous example, we would start with eax, which gets a copy of dword_42A648. Then we track eax until it is copied to ecx with the offset 0xBC. Then we track ecx until we see a call to ecx. Thus we know that the IAT offset 0xBC is used at this specific call.
In order to tell IDA what kind of return value and parameters each IAT call has, we need to do some more magic. First, we need to import all function definitions we need. E.g. for “SetCurrentDirectoryW” we need to define a function like this “typedef BOOL __stdcall SetCurrentDirectoryW(LPCWSTR lpPathName);”. We import those function definitions as local types in IDA.
In the second step, we create a local structure which reflects our IAT. So instead of only naming the pointer e.g. “CreateThread”, we also use the type CreateThread, which we imported as local type.
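Generating the declaration of such an IAT structure from the decrypted IAT could look like the following sketch. It only builds the C declaration as text. The padding member names are hypothetical, and whether IDA's type parser accepts a member that has the same name as its type is an assumption you may need to adapt. It also assumes every API appears only once in the IAT.

```python
def build_iat_declaration(iat, size=0x208, name="IAT"):
    """Build a C struct declaration with one pointer per 4-byte slot.

    `iat` maps IAT offsets to API names. Slots without a known API get
    a void pointer as placeholder so that the offsets stay correct.
    """
    lines = ["struct %s {" % name]
    for offset in range(0, size, 4):
        if offset in iat:
            api = iat[offset]
            # Pointer to the function type that was imported as local type.
            lines.append("  %s *%s;" % (api, api))
        else:
            lines.append("  void *unused_%X;" % offset)
    lines.append("};")
    return "\n".join(lines)


# Made-up offsets and a tiny size, for demonstration only.
demo = build_iat_declaration(
    {0x4: "CreateThread", 0xC: "SetCurrentDirectoryW"}, size=0x10)
print(demo)
```

Emitting a placeholder for every unused slot keeps each member at exactly its IAT offset, which is what makes the structure usable for resolving calls.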
This IAT structure is then applied to the address dword_42A648, so we can see which function is called when dword_42A648 is referenced. The decompilation of e.g. sub_402B00 then looks like this:
We can see three calls to WinAPIs and their corresponding names in lines 18, 31 and 34. Yet, neither is the number of function arguments correct, nor are their types identified properly. For example, in line 18 IDA shows five function arguments where there should be four and in line 31 there is one where there should be three.
Additionally, the structure PSECURITY_DESCRIPTOR is not set as the third argument in line 18; instead, IDA set it to void*. And instead of LPSECURITY_ATTRIBUTES, IDA uses an int* in line 31.
In order to fix this, we can now leverage our taint tracking information and define each call with its corresponding function by using the IDA Python functions apply_callee_tinfo() and set_op_tinfo2() of the idaapi. This triggers IDA’s magic and propagates the added information to the disassembly, so that even stack variables are redefined and renamed.
We can now see that the function calls have the correct number of arguments as well as the correct types of arguments. Also the stack variables got redefined and renamed correctly.
You always know you are going down a very dark trail when the IDA Python functions you are using have fewer than 10 hits on Google and most of those hits are just copies of the same text.
Yet, I found the existing IDA Python script “apply_callee_type.py” from Jay Smith on https://github.com/fireeye/flare-ida/blob/master/python/flare/apply_callee_type.py extremely helpful in understanding how to do such magic in IDA.
The final pseudo algorithm looks like this:
- Iterate over the decrypted IAT and for each imported function:
  - Look up function definition in IDA's database
  - Import function definition to local types for later use
- Create IAT structure and import it as local type called “IAT”
- Set dword_42A648 as type “IAT”
- For each read-reference to dword_42A648:
  - Get the register which holds dword_42A648
  - Disassemble forward until the register is copied with an offset to a new register, remember the used offset
  - Disassemble forward until the new register is called, remember this address
  - Depending on the used offset, look up the function definition of the IAT function
  - Apply function definition to current address
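The register tracking part of this algorithm can be sketched in plain Python on a simplified instruction listing. This is not the script itself: the listing format, the function name and the demo addresses are made up, only mov and call are interpreted, and treating eax, ecx and edx as overwritten after every call is a simplification based on the usual 32-bit calling conventions.

```python
import re

IAT_POINTER = "dword_42A648"
# Matches operands such as [eax], [eax+8] or [eax+0BCh].
MEM = re.compile(r"^\[(\w+)(?:\+([0-9A-Fa-f]+)h?)?\]$")


def track_iat_calls(instructions):
    """Return a list of (call_address, iat_offset) tuples.

    `instructions` is a list of (address, mnemonic, operands) tuples,
    with `operands` being a list of operand strings.
    """
    holders = set()  # registers holding the pointer to the IAT
    slots = {}       # register -> IAT offset of the loaded function pointer
    calls = []
    for address, mnemonic, ops in instructions:
        if mnemonic == "mov":
            dst, src = ops
            mem = MEM.match(src)
            new_holder = src == IAT_POINTER or src in holders
            new_slot = slots.get(src)
            if mem and mem.group(1) in holders:
                new_slot = int(mem.group(2) or "0", 16)
            # The destination is overwritten, so forget its old state.
            holders.discard(dst)
            slots.pop(dst, None)
            if new_holder:
                holders.add(dst)
            elif new_slot is not None:
                slots[dst] = new_slot
        elif mnemonic == "call":
            target = ops[0]
            mem = MEM.match(target)
            if target in slots:
                calls.append((address, slots[target]))
            elif mem and mem.group(1) in holders:
                calls.append((address, int(mem.group(2) or "0", 16)))
            # Caller-saved registers do not survive the call.
            for reg in ("eax", "ecx", "edx"):
                holders.discard(reg)
                slots.pop(reg, None)
    return calls


# Hypothetical listing with an indirect call via ecx and a direct one.
listing = [
    (0x401000, "mov", ["eax", "dword_42A648"]),
    (0x401005, "push", ["0"]),
    (0x401007, "mov", ["esi", "[esp+10h]"]),
    (0x40100B, "mov", ["ecx", "[eax+0BCh]"]),
    (0x401011, "push", ["esi"]),
    (0x401012, "call", ["ecx"]),
    (0x401014, "mov", ["edx", "dword_42A648"]),
    (0x40101A, "push", ["esi"]),
    (0x40101B, "call", ["[edx+10h]"]),
]

print([(hex(a), hex(o)) for a, o in track_iat_calls(listing)])
```

The resulting pairs of call address and IAT offset are exactly what is needed to look up the function definition and apply it to the call.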
In the first part we have learned how Trickbot obfuscates its strings and how we can leverage static code analysis in order to deobfuscate the strings and put them in a useable format in IDA.
In the second part we analyzed how the dynamically created import address table of Trickbot can be restored and how IDA can be instrumented to process the data types of the imported functions to get a nice and clean decompilation result.
Finally, I would like to thank my colleagues from G DATA Advanced Analytics for proofreading this article.
Additionally, I would like to thank the Trickbot authors for the interesting and partially challenging malware.
You can find the IDA Python scripts on https://github.com/GDATAAdvancedAnalytics/IDA-Python