Static Malware Analysis: Identifying Malware
Introduction
When you first get hold of a malware sample, it is tempting to jump right in and tear it apart with every technique you know. However, there are a few things you should do before any deeper analysis, as they will speed up and guide everything that follows.
The first of these is file identification.
Malware authors constantly try to trick users into opening up their malware, and one of the ways they do this is by disguising a file to look like it’s safe.
So, as malware analysts, one of the first things we have to do is figure out what type of file we are dealing with, applying various techniques to identify exactly what we are examining.
Static Analysis
Static analysis techniques examine the characteristics of malware without executing it. In this phase, we gather various pieces of information about the malware that can be used in the dynamic analysis phase, when we execute it and see what it really does.
In other words, the data gathered during static analysis gives glimpses of what the malware can do, and these can be confirmed during the dynamic analysis phase.
Focus Your Analysis
In static analysis, we need to focus on various things like:
1. You may not initially know the type of the file, but it is the first thing you must figure out.
2. Has anyone already analyzed this malware? Knowing this can save you a lot of time.
3. The strings embedded in the malware can reveal a lot about what it can do.
4. The Windows PE header reveals a lot of information about the malware. We can use it to our advantage to figure out how the malware will behave.
5. Packers compress a program and hide its internals from us. If we can determine whether the malware is packed and which packer was used, we can unpack it and get at the underlying information.
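Items 2 and 3 can be sketched in a few lines. One common way to check whether a sample has already been analyzed is to compute a cryptographic hash and search for it in public malware databases; the hash-based lookup is an assumption here, since the text does not name a method. The helper names and the sample bytes below are hypothetical, and the string extraction is a simplified take on what dedicated string-dumping tools do.

```python
import hashlib
import re


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def ascii_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII characters at least min_len long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


# Hypothetical sample bytes, used only for illustration.
sample = (
    b"MZ\x90\x00\x03http://example.invalid/gate.php"
    b"\x00\x01ab\x00cmd.exe /c del\x00"
)

print(sha256_of(sample))      # value to search for in malware databases
print(ascii_strings(sample))  # short runs such as "MZ" and "ab" are skipped
```

Both functions only read bytes and never execute the sample, but the sandbox precautions described in the note that follows still apply.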
NOTE:
Although static analysis gets its information without executing the malware, that does not make the malware any less dangerous. We still need to be very careful during static analysis and only analyze the malware in a sandbox. This is because many tools, including static analysis tools, may open or execute a file in order to obtain the information you want.
We must also be very aware of the tools we use. Many tools are available for analyzing malware, but it is recommended to treat every tool like malware and analyze it first. There are two reasons for this:
- This lets you see how the tool works. It may modify registry entries, drop files, or set up services. Knowing this will help you exclude that activity later when you run the tool, especially during dynamic analysis.
- Unfortunately, some tools are hosted in unscrupulous places on the internet, and some malware analysis tools are backdoors or malware themselves. Treating these tools like malware lets you spot any odd modifications to your sandbox or strange connections to websites that they cause. If any such suspicious activity is detected, avoid using that tool.
File Identification
At the start of an analysis, we may not know what type of file we are dealing with, because attackers do everything they can to trick users into opening malware by disguising their files.
Of course, some of the things they do to accomplish this also make analysis a little more difficult.
One of the most common ways attackers do this is by using “double extensions.” Just as it sounds, the attacker adds a harmless-looking extension such as “.doc” before the real extension. This attempts to trick the user into thinking that the file is not executable when, in fact, it is.
Another common way attackers get around signature-based analysis is by packaging the malware in self-extracting archives. There are many archiving utilities, and some of them, such as NSIS and WinRAR, have their own scripting languages. Attackers package their malware in these archives together with a script to install it, then send it off. When the archive is executed, the internal script extracts the malware and executes it.
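A double extension can be flagged from the file name alone. The following is a minimal sketch; the two extension sets and the function name are illustrative assumptions and are far from exhaustive.

```python
from pathlib import PurePath

# Illustrative lists only; neither is exhaustive.
EXECUTABLE_EXTS = {".exe", ".scr", ".com", ".bat", ".cmd", ".pif", ".js", ".vbs"}
DECOY_EXTS = {".doc", ".docx", ".pdf", ".xls", ".xlsx", ".jpg", ".png", ".txt"}


def has_double_extension(filename: str) -> bool:
    """True when a document-like extension sits right before an executable one."""
    suffixes = [s.lower() for s in PurePath(filename).suffixes]
    if len(suffixes) < 2:
        return False
    return suffixes[-1] in EXECUTABLE_EXTS and suffixes[-2] in DECOY_EXTS


for name in ("openme.doc.exe", "report.doc", "setup.exe"):
    print(name, has_double_extension(name))
```

A check like this only looks at the name, so it complements content-based identification rather than replacing it.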
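One rough way to spot a self-extracting archive is to search an executable for archive or installer markers that appear after its start. The sketch below is a heuristic under stated assumptions: the marker table is a small illustrative selection, the function name is hypothetical, and a hit can be a false positive because the same bytes may occur in ordinary data.

```python
# Illustrative markers only; a match is a hint, not proof.
ARCHIVE_MARKERS = {
    b"Rar!\x1a\x07": "RAR archive",
    b"7z\xbc\xaf\x27\x1c": "7-Zip archive",
    b"PK\x03\x04": "ZIP archive",
    b"NullsoftInst": "NSIS installer data",
}


def embedded_archives(data: bytes) -> list[tuple[int, str]]:
    """Return (offset, label) for markers found after the start of the file."""
    hits = []
    for marker, label in ARCHIVE_MARKERS.items():
        offset = data.find(marker)
        if offset > 0:  # offset 0 would mean the file itself is the archive
            hits.append((offset, label))
    return sorted(hits)


# Synthetic example: a fake 64-byte executable stub followed by RAR data.
stub = b"MZ" + b"\x00" * 62
fake_sfx = stub + b"Rar!\x1a\x07\x00" + b"payload"
print(embedded_archives(fake_sfx))
```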
You can think of this as a poor man’s packer.
Speaking of packers, many file identification tools also examine executables for indications of packers. Some of them also give hints on how to unpack the malware.
File Signatures
Many file types, such as executables and documents, begin or end with specific bytes known as magic bytes. File identification tools use these bytes as signatures in order to identify the type of file they are examining.
For example, this is a screenshot of a Windows PE program, the format most Windows executables are in. There are a number of signatures within the file that can be used to identify it as such.
Here, the first two bytes of the file are MZ. All Windows executables have MZ as their first two bytes.
In addition, most PE programs contain a message such as “This program cannot be run in DOS mode” shortly after the start of the file. However, this is not required, and some packers change or remove it.
Finally, at offset 0xE0 in this example is the start of the PE header, marked by the two bytes PE (followed by two null bytes). This offset varies from file to file; it is stored as a 4-byte value at offset 0x3C.
You probably won’t have to open up a hex editor to identify files manually. Still, it is not a bad idea to know the signatures of a number of common file types that you might come across.
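The same checks can be done programmatically. This sketch verifies the MZ bytes, reads the 4-byte little-endian PE header offset stored at 0x3C, and confirms the PE signature at that offset. The function name and the synthetic header are assumptions made for illustration; the header is not a real executable.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Check the MZ magic, follow the stored offset, and confirm the PE signature."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # 4-byte little-endian offset of the PE header, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


# Synthetic header built for illustration: PE signature placed at 0xE0.
header = bytearray(0x100)
header[0:2] = b"MZ"
struct.pack_into("<I", header, 0x3C, 0xE0)
header[0xE0:0xE4] = b"PE\x00\x00"

print(looks_like_pe(bytes(header)))
print(looks_like_pe(b"%PDF-1.7"))
```

Following the stored offset instead of hard-coding 0xE0 matters because the header position differs between files.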
Here are some common file signatures you should know:
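The sketch below checks a file's leading bytes against a handful of widely documented signatures. The selection and the `identify` helper are illustrative assumptions, not a complete list, and real identification tools consult far larger databases.

```python
# Widely documented leading signatures; an illustrative selection only.
SIGNATURES = [
    (b"MZ", "DOS/Windows executable"),
    (b"\x7fELF", "ELF executable"),
    (b"%PDF", "PDF document"),
    (b"PK\x03\x04", "ZIP archive (also DOCX, XLSX, JAR, APK)"),
    (b"Rar!\x1a\x07", "RAR archive"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (legacy DOC, XLS, MSI)"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"GIF8", "GIF image"),
    (b"\x1f\x8b", "gzip data"),
]


def identify(data: bytes) -> str:
    """Return a description for the first signature the data starts with."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            return description
    return "unknown"


print(identify(b"MZ\x90\x00\x03\x00"))
print(identify(b"%PDF-1.7\n"))
print(identify(b"hello world"))
```

Note that a file named openme.doc that starts with MZ would be reported as an executable, which is exactly the mismatch an analyst is looking for.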
File Identification Tools
There are a number of tools available to provide file detection.
File:
· One of the most commonly used file identification tools.
· It works by comparing the file header with known signatures of different file types, which are stored in a plain text file called the magic file.
· This tool is fast, simple, and effective, but it does not provide much in-depth information on the file you are looking at.
· It usually comes standard on UNIX-like systems, including Linux and macOS. A Windows version is also available.
Here is an example of File being run on an executable named openme.doc.exe. Notice the double extension used by the attacker to try to disguise the file type. File tells us that it is a PE32 GUI executable, simple and easy.
Exeinfo PE:
· This tool works only on Windows executables, but it returns a lot of information about them, possibly including which compiler was used to create them.
· This tool is great at detecting packers. If the file you give it is packed, there’s a good chance Exeinfo PE will be able to tell you what packer it is.
· Even better, Exeinfo will give you hints on how to unpack it. This is really great as we don’t have to hunt all over the internet to figure out how to get at the underlying information in the file.
· It is a GUI application but also comes with a command-line version in case you want to use it in automation scripts.
Here we are running Exeinfo PE on openme.doc.exe.
There is a lot of information packed into this tiny interface, but the part we want to look at is at the bottom. The section in dark blue describes what it found in the executable.
Whereas File only told us that it is a Windows executable, Exeinfo PE tells us we have a 64-bit executable that has been packed with a packer called UPX.
Just below the type information is the unpacking hint (in the light blue section). This tells us how we can unpack the UPX-packed program.
UPX is actually a very commonly used packer. It’s open-source and available for most systems, so it’s not uncommon to find malware packed with it. Fortunately, unlike most packers, the same UPX program that packs executables can also unpack them. UPX can be downloaded at http://upx.sf.net
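One hint of UPX packing that can be checked by hand is the PE section table: files packed with an unmodified UPX typically carry sections named UPX0 and UPX1. The sketch below reads the section names and looks for that prefix. It is a heuristic under stated assumptions, not how Exeinfo PE works internally; the function names are hypothetical, the demo file is a synthetic, non-runnable skeleton, and an attacker can rename sections to defeat the check.

```python
import struct


def section_names(data: bytes) -> list[str]:
    """Return the section names from a PE file's section table."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file")
    coff = pe_offset + 4                      # 20-byte COFF header
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    table = coff + 20 + opt_size              # section table follows optional header
    names = []
    for i in range(num_sections):
        raw = data[table + i * 40: table + i * 40 + 8]  # 40-byte entries, 8-byte name
        names.append(raw.rstrip(b"\x00").decode("ascii", errors="replace"))
    return names


def looks_upx_packed(data: bytes) -> bool:
    """Heuristic: any section name starting with 'UPX'."""
    return any(name.startswith("UPX") for name in section_names(data))


def build_demo_pe(names: list[str]) -> bytes:
    """Build a minimal, non-runnable PE skeleton for demonstration only."""
    pe_offset = 0x80
    buf = bytearray(pe_offset + 24 + 40 * len(names))
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, pe_offset)
    buf[pe_offset:pe_offset + 4] = b"PE\x00\x00"
    struct.pack_into("<H", buf, pe_offset + 6, len(names))  # NumberOfSections
    # SizeOfOptionalHeader is left at 0 in this skeleton.
    for i, name in enumerate(names):
        start = pe_offset + 24 + i * 40
        buf[start:start + 8] = name.encode("ascii").ljust(8, b"\x00")
    return bytes(buf)


demo = build_demo_pe(["UPX0", "UPX1", ".rsrc"])
print(section_names(demo))
print(looks_upx_packed(demo))
```

When the check is positive, UPX's own `-d` option can be tried on a copy of the file to decompress it.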
TrID:
· Available for both Windows and Linux.
· Unlike the last two programs, which use signatures for file type detection, TrID uses a pattern database to determine the file type.
· If the file contains any of the patterns TrID has, it displays them.
· TrID also shows how likely each detected pattern is to match the file. This is very helpful, as the choices TrID gives are sometimes more accurate than those of the signature-based tools.
· The pattern database is updated fairly regularly, so it can keep up with new types of files.
· TrID is a command-line tool, and the only required option you have to give it is the path to the pattern database. You can do this through the -d: option.
Finally, TrID is run against openme.doc.exe.
In this case, TrID tells us that there is an 87.1% chance we have a UPX-packed program and a 6.4% chance we just have a regular Windows or DOS executable.