Running your own HashDB lookup
What’s all the fuss about?
Imagine you are just starting the analysis of your next sample and stumble across some logic that seems like the implementation of API hashing. But it doesn’t worry you, because you just heard about this tool that is able to automagically resolve the hashes to the original API strings. Your lord and savior: HashDB.
But what if you don’t want to depend on some service provider because you are working with confidential files or unpublished malware? How can you be sure the service provider will not at some point become interested in your analysis behavior? Or, most importantly, what if you simply cannot reach the internet from your analysis system?
No worries, we got you!
Run HashDB locally
To solve the problems above (and more) we created the HashDB database for local lookups. Our implementation can be found on GitHub and consists of three components:
- The IDA plugin
- The HashDB database
- The HashDB hook
In the following, we will explain the functionality / benefits of each component.
TL;DR
- Download the latest release and unzip it into your IDA plugins directory (usually `%PROGRAMFILES%/IDA/plugins/`). The release contains a prebuilt lookup database; have a look at the HashDB builder on how to build your own.
- Create the directory `%LOCALAPPDATA%/hashdb/`
- Move the file `hashdb.sqlite3` to `%LOCALAPPDATA%/hashdb/`
- Done
IDA plugin
The plugin is a fork of the original implementation made by OALabs. For an exhaustive explanation, have a look at the README. In short, the plugin can identify the hash algorithm in use, apply a static XOR value, create an enum to replace hash constants with their API names, and more.
Originally, the plugin used the `requests` module to communicate with the remote HashDB lookup service, so we had to make some changes here. We wanted to change the original code as little as possible for the sake of forward compatibility. The most straightforward solution was to implement our own “requests” methods that serve as a drop-in replacement in the plugin; the logic is implemented in our hook. To make sure the plugin is adapted accordingly, we created a patch file that simply replaces the `requests` import with our own implementation.
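To illustrate the idea (all names and payloads here are hypothetical sketches, not the actual hook code), such a drop-in replacement only needs to mimic the small slice of the `requests` API that the plugin uses: `get`/`post` functions returning an object with a `status_code` and a `json()` method:

```python
# Hypothetical sketch of a requests-compatible shim; the real hook
# answers these calls from the local SQLite database instead.
from urllib.parse import urlparse


class Response:
    """Minimal stand-in for requests.Response."""

    def __init__(self, status_code, payload):
        self.status_code = status_code
        self._payload = payload

    def json(self):
        return self._payload


# Toy in-memory "database" standing in for hashdb.sqlite3
ALGORITHMS = {"crc32": {0x414141: "LoadLibraryA"}}


def get(url, **kwargs):
    """Answer GET requests locally instead of contacting the remote service."""
    path = urlparse(url).path.strip("/").split("/")
    if path == ["hash"]:  # e.g. GET /hash -> list available algorithms
        return Response(200, {"algorithms": [{"algorithm": n} for n in ALGORITHMS]})
    return Response(404, {})


def post(url, json=None, **kwargs):
    """Answer POST requests (e.g. /hunt) locally; stubbed out here."""
    return Response(404, {})
```

With a shim shaped like this, patching the plugin boils down to swapping the `import requests` line for an import of the hook module.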
Local HashDB database
The core component of the local HashDB implementation is the database with all needed information to make local hash lookups. This database is filled with information from three sources:
- Hash algorithm implementations
- SQLite database with information about relevant Windows APIs and module names
- A file with strings that should be added to the lookup database
Hash algorithms
To be able to fill the database with the needed string-to-hash mappings, we need implementations of all algorithms used for API hashing (or at least as many as possible). Luckily, there is a public repo by OALabs that contains plenty of them. If you want to add your own algorithms, you can always do so by adding their implementation to the algorithms folder. The implementation has to be in the following form:
```python
#!/usr/bin/env python

DESCRIPTION = "your hash description here"
EXTENDED_PERMUTATION = True  # set to True only if an extended permutation is needed
# Type can be either 'unsigned_int' (32 bit) or 'unsigned_long' (64 bit)
TYPE = 'unsigned_int'
# Test must match the exact hash of the string
# 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'
TEST_1 = hash_of_string_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789


def hash(data):
    # your implementation here
```
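For instance, a complete (if deliberately trivial) algorithm module following this template could look like the sketch below. The hash itself is just a byte sum and only meant to show the expected structure; real entries in the repo implement actual malware hashing routines:

```python
#!/usr/bin/env python
# Toy example module following the template above: a plain byte-sum "hash".

DESCRIPTION = "toy additive hash (sum of all bytes, 32 bit)"
EXTENDED_PERMUTATION = False  # no extended permutation needed here
TYPE = 'unsigned_int'  # 32-bit result
# Hash of 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'
TEST_1 = 0x150B  # decimal 5387


def hash(data):
    # Sum all bytes of the input, truncated to 32 bit
    return sum(data) & 0xFFFFFFFF
```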
Also, if you add a new algorithm to your local database and it’s not top secret: make sure to add it to the OALabs repo as well to help the community!
Database of Windows APIs / modules and list of strings
Furthermore, we added a database with relevant Windows modules and APIs. Additionally, a strings file is used to add custom strings to the HashDB lookup database. You can use both files to add information that you want HashDB to be able to look up. This is easy for random strings that you have found hashed in your samples: just put them into the strings file. If you want to add a module that is not already in the prepared database, you can use the `make_function_db.py` script.
To use it, follow these two steps:

```shell
mkdir <fancy_name>
python ./make_function_db.py <fancy_name> ./functions_with_forwards.sqlite3
```
After you have prepared these two files with all the information you want to be able to look up (or just leave the files as they are - this should be enough for most use cases), you are now ready to build the main lookup database.
Main database for the local HashDB lookup
Both of the mentioned files (API database and strings file) are used during database creation (i.e., when running the `hashdb_builder.py` script) to populate the HashDB lookup database.
The script builds a database containing all given modules, APIs, strings, and permutations and calculates hashes for all combinations of APIs / strings and permutations.
This database will be the source for lookups done by the hook described below.
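Conceptually, the builder boils down to a nested loop over algorithms and strings. The following sketch shows that core idea with an in-memory SQLite database; the table layout, names, and the sample ROR13 routine are illustrative, not the actual schema or code of `hashdb_builder.py`:

```python
import sqlite3


def ror13_add(data):
    """Classic ROR13-additive API hash, a common real-world algorithm."""
    h = 0
    for b in data:
        h = ((h >> 13) | (h << 19)) & 0xFFFFFFFF  # rotate right by 13 (32 bit)
        h = (h + b) & 0xFFFFFFFF                  # add the next byte
    return h


# Illustrative stand-ins for the loaded algorithm modules and string list
ALGORITHMS = {"ror13_add": ror13_add}
STRINGS = ["LoadLibraryA", "GetProcAddress", "VirtualAlloc"]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE hashes (algorithm TEXT, hash INTEGER, string TEXT)")

# Hash every string with every algorithm and store the mapping
for name, algo in ALGORITHMS.items():
    for s in STRINGS:
        con.execute("INSERT INTO hashes VALUES (?, ?, ?)",
                    (name, algo(s.encode()), s))
con.commit()

# A later lookup is then a simple query against the prebuilt table
h = ror13_add(b"VirtualAlloc")
row = con.execute("SELECT string FROM hashes WHERE algorithm = ? AND hash = ?",
                  ("ror13_add", h)).fetchone()
# row[0] == "VirtualAlloc"
```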
Hook
The hook is used to connect the IDA plugin with our local HashDB database. As mentioned above, in order to keep the plugin mostly unchanged, we had to implement the needed `requests` methods (`get` and `post`) to run queries against the local database.
The remote HashDB lookup service provides API functions to query their database for specific information.
We implemented four of these endpoints in the hooking logic to query our local lookup database:
- `GET /hash`: List all available hash algorithms in the database
- `GET /hash/<algorithm>/<hash>`: Get the corresponding API string for the given algorithm + hash combination
- `GET /module/<module>/<permutation>`: Get a list of API string hashes for a module and algorithm using a specific permutation
- `POST /hunt`: Hunt for the right hash algorithm (i.e., search the database for the hash and return the corresponding algorithm)
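The hunt endpoint is essentially a reverse query: given only hash values, look them up across all algorithms and report which ones produced a match. A hypothetical sketch of that logic (table layout, names, and hash values are illustrative, not the real hook code or real hashes):

```python
import sqlite3

# Minimal lookup table standing in for the prebuilt hashdb.sqlite3
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE hashes (algorithm TEXT, hash INTEGER, string TEXT)")
con.executemany("INSERT INTO hashes VALUES (?, ?, ?)", [
    ("crc32", 0x11111111, "LoadLibraryA"),    # made-up example values
    ("ror13_add", 0x22222222, "kernel32.dll"),
])


def hunt(hashes):
    """Sketch of POST /hunt: return every algorithm that knows a given hash."""
    hits = []
    for h in hashes:
        rows = con.execute(
            "SELECT DISTINCT algorithm FROM hashes WHERE hash = ?", (h,))
        hits.extend(algorithm for (algorithm,) in rows)
    return {"hits": [{"algorithm": a} for a in hits]}
```

Once an algorithm is identified this way, the plugin can fall back to the per-hash lookup endpoint for the actual API names.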
Conclusion
With the provided adaptation of the IDA plugin and the implementation of a local HashDB lookup using a custom database, you can leverage the full potential of HashDB without the downsides of a remote service provider. Additionally, you can add any hashed strings from your analysis to the lookup database, so you can adapt it to your needs in the most flexible way.
Have fun :)