Ethical hacking: Basic malware analysis tools
Introduction to malware analysis
Malware analysis is a common component in the incident response process. Once malware has been identified on a system, it is often useful to investigate and learn more about its specific functionality.
Malware analysis can have many possible goals. A high-level analysis may be intended to extract a few indicators of compromise to add to a security tool’s signature list. A more in-depth investigation may be required to determine the functionality of a particular sample in order to identify behavior and persistence mechanisms to help with removing it. Finally, an organization may want to perform a comprehensive analysis of a particular sample in order to understand the specifics of an APT’s operations and share information about a new threat with the community.
Basic malware analysis tools
When starting out in malware analysis, there are a variety of useful tools available. Depending on the goals of the analysis, the malware analyst may need to collect different pieces of information. Different tools are suited to different purposes, so it’s helpful to be familiar with as many as possible.
Hex editors are among the simplest malware analysis tools, but they can also be extremely useful. A hex editor like HxD shows both the raw hexadecimal representation of a file and its ASCII interpretation side by side.
Looking at a potential malware sample in a hex editor can be useful for extracting basic features from the file. Reading a file’s magic number may help in identifying a particular filetype, and examination of the raw hex of the file can help with identification of obfuscation methods like the use of weak XOR encoding. A malware analyst can also manually extract printable strings from a file by looking at its ASCII representation.
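The two checks described above, reading a magic number and spotting weak XOR encoding, can also be scripted. The following is a minimal sketch rather than a complete tool: the signature table is a small illustrative subset, the function names are made up for this example, and the XOR search assumes a single-byte key and a marker string that the analyst expects to find in the decoded data.

```python
# Illustrative subset of well-known file signatures (magic numbers).
# Real file-identification tools recognise far more formats.
MAGIC_NUMBERS = {
    b"MZ": "Windows PE/DOS executable",
    b"\x7fELF": "ELF executable",
    b"%PDF": "PDF document",
    b"PK\x03\x04": "ZIP archive (also DOCX, JAR, APK)",
}


def identify_magic(data: bytes) -> str:
    """Guess a file type from the leading bytes of the file."""
    for magic, description in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return description
    return "unknown"


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the same single-byte key."""
    return bytes(b ^ key for b in data)


def find_single_byte_xor_key(data: bytes, marker: bytes):
    """Try every non-zero single-byte key; return the first one that
    makes the marker appear in the decoded data, or None."""
    for key in range(1, 256):
        if marker in xor_bytes(data, key):
            return key
    return None


if __name__ == "__main__":
    # A fake PE stub, used here only as demonstration input.
    stub = b"MZ\x90\x00" + b"This program cannot be run in DOS mode."
    print(identify_magic(stub))

    # The same bytes hidden behind a one-byte XOR key no longer match
    # any signature until the key is recovered.
    hidden = xor_bytes(stub, 0x5A)
    print(identify_magic(hidden))
    print(find_single_byte_xor_key(hidden, b"This program"))
```

The marker used here is the message found in the DOS stub of most Windows executables, which makes it a convenient thing to search for when an embedded, XOR-encoded executable is suspected. Multi-byte or rolling keys need a different approach.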
Extracting strings from a file can be extremely useful in learning about what the malware does, its origin, and other embedded information (like IP addresses or domain names). A Windows executable file includes the names of imported libraries in plaintext, which can be useful for determining the purpose of a file based on the functions that it tries to access.
The strings command is available as a terminal program on Linux, and an equivalent Strings utility for Windows is distributed as part of Sysinternals. Both are designed to pull out any ASCII or Unicode strings within a file. However, the output can contain a large amount of garbage, since by default any sequence of at least three (Windows) or four (Linux) printable characters will be printed.
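To make the idea concrete, here is a rough Python approximation of what a strings-style tool does. It is a sketch, not a reimplementation of either utility: it only looks for printable ASCII runs and their UTF-16LE ("wide") equivalents, and the function name and the min_length parameter are chosen for this example.

```python
import re


def extract_strings(data: bytes, min_length: int = 4):
    """Return printable strings found in data.

    Looks for runs of at least min_length printable ASCII bytes, then
    for the same kind of run encoded as UTF-16LE (each character
    followed by a zero byte), as is common in Windows binaries.
    """
    ascii_pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    wide_pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_length)

    found = [m.group().decode("ascii") for m in ascii_pattern.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_pattern.finditer(data)]
    return found


if __name__ == "__main__":
    # Made-up sample bytes: two ASCII strings, one too-short run ("ab"),
    # and one wide string.
    sample = (
        b"\x00\x01kernel32.dll\x00ab\x00\xff"
        b"http://example.test\x02"
        + "cmd.exe".encode("utf-16-le")
        + b"\x00\x00"
    )
    for s in extract_strings(sample):
        print(s)
```

Lowering min_length to 3 mirrors the Windows default mentioned above and produces more output, much of it noise.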
FireEye has open-sourced StringSifter, a tool for making strings output more useful for malware analysts. The tool uses machine learning to rank strings based on their probable utility to the analyst, decreasing the time spent searching through garbage strings.
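A much cruder, rule-based way to cut through garbage strings is to keep only those that look like the IP addresses and domain names mentioned earlier. The sketch below is not how StringSifter works; it is a simple pattern-matching filter with hypothetical names, and its regular expressions and file-extension exclusion list are deliberately small and will miss or misclassify some values.

```python
import re

# Dotted-quad IPv4 addresses with each octet in the range 0-255.
IPV4_RE = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)

# Dot-separated labels ending in an alphabetic suffix of 2+ letters.
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}\b",
    re.IGNORECASE,
)

# File names such as kernel32.dll also match DOMAIN_RE, so suffixes
# that are usually file extensions are excluded. Illustrative only.
FILE_EXTENSIONS = {"dll", "exe", "sys", "txt", "dat", "ini"}


def extract_network_indicators(strings):
    """Return (sorted IPv4 addresses, sorted domain names) found in strings."""
    ips, domains = set(), set()
    for s in strings:
        ips.update(IPV4_RE.findall(s))
        for match in DOMAIN_RE.findall(s):
            suffix = match.rsplit(".", 1)[1].lower()
            if suffix not in FILE_EXTENSIONS:
                domains.add(match.lower())
    return sorted(ips), sorted(domains)


if __name__ == "__main__":
    # Made-up strings of the kind a strings tool might print.
    candidates = [
        "http://update.example.com/",
        "connect 192.0.2.10:8080",
        "kernel32.dll",
        "hello world",
    ]
    print(extract_network_indicators(candidates))
```

Anything this filter returns is only a candidate indicator; an analyst still needs to confirm that the sample actually uses it.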
When most people think of malware analysis tools, they think of disassemblers. These tools are designed to help with static code analysis by translating machine code back into assembly instructions, which are more human-readable. Some disassemblers also come with decompilers, which attempt to reconstruct higher-level, source-like code from the binary; however, these are often more expensive and their output is less reliable.
A variety of different disassemblers are available on the market. The Interactive Disassembler (IDA) is the most famous, providing an old build as a free version or the most updated version for a yearly fee. Another paid disassembler is Hopper, which is available for the Mac and Linux operating systems.
A variety of free and open-source disassemblers are also available. In 2019 the NSA released Ghidra, a tool that it developed in-house. Another popular open-source disassembler is radare2, which has a wide range of functionality and enjoys frequent updates.
Disassemblers are useful for static analysis, but sometimes it is necessary to run the code in order to understand how it works. Debuggers run the code within an environment that the malware analyst controls, allowing them to run instructions step by step, set execution breakpoints and inspect the process’s memory and other runtime features.
OllyDbg is a commonly-used debugger for the Windows operating system with a wide range of features. WinDbg is another Windows-based debugger. Its main selling point is the fact that it can be used for kernel-mode debugging. On Linux, the most popular debugger for malware analysis is the GNU debugger (gdb).
While debuggers are useful for performing dynamic malware analysis, they run the malware directly on the machine where the debugger is running. If the analyst is using a disposable virtual machine, this may be fine, but otherwise the analysis machine itself can become infected.
Sandboxes are designed to run malware in an isolated environment to prevent it from breaking free and infecting the host machine or other devices. Sandboxes also commonly include a great deal of instrumentation designed to observe the execution of the malware and draw conclusions from it. Running malware in a sandbox is often a good starting point for malware analysis, as it requires minimal hands-on interaction from the analyst and provides a great deal of information about the sample.
A variety of different malware analysis sandboxes exist, including Cuckoo Sandbox, Falcon Sandbox, Joe Sandbox and many others. Each one has its own benefits that balance cost with the set of available features.
Malware analysis isn’t limited to the desktop. Many online tools are designed to provide a user with a great deal of information about a sample with little or no work. Examples of these are Hybrid Analysis and VirusTotal, which automatically run many of the kinds of analysis described above and collect the results into an easily readable (and scrapable) format. These tools also allow searching for malware based on hashes and exploring relationships between uploaded files.
However, these tools should not be used lightly. They are commonly used to check if a particular file is malicious, and they work by making data about files uploaded by anyone available to anyone.
This can be a security problem for an enterprise if a potential malware sample could contain sensitive internal data. Companies can leak their own internal data by uploading malware to VirusTotal or Hybrid Analysis, and third parties can find this data using the search and correlation functionality available on the sites.
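Because these services support searching by hash, one way to reduce this risk is to compute a sample’s hashes locally and search for those first, since a hash lookup does not send the file’s contents anywhere. The sketch below only computes the hashes; it makes no network requests, and the function name and chunk size are arbitrary choices for this example.

```python
import hashlib


def file_hashes(path, chunk_size=65536):
    """Return the MD5, SHA-1 and SHA-256 hex digests of a file.

    The file is read in chunks so that large samples do not have to
    fit in memory.
    """
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}


if __name__ == "__main__":
    import sys

    # Usage: python file_hashes.py <path-to-sample>
    for name, value in file_hashes(sys.argv[1]).items():
        print(f"{name}: {value}")
```

If a hash search returns nothing, the decision about whether to upload the file itself remains, and the concerns about sensitive internal data described above still apply.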
Conclusion: Starting malware analysis
Malware analysis can seem like a daunting task. Trying to figure out what a compiled executable does is very different from reading through source code, and many malware samples are specifically designed to resist easy analysis.
However, in many cases, the goal of malware analysis isn’t understanding every line of code in the piece of malware in front of you. Malware analysis for incident response aims to understand what a particular sample may be doing on a machine and to extract indicators of compromise that can be used to detect it.
The tools described here can be used with a minimal amount of knowledge. Many are just point and click. However, they often provide enough information to enable an analyst to achieve their goals even with limited knowledge of malware reverse-engineering.
- HxD – Freeware Hex Editor and Disk Editor, mh-nexus
- stringsifter, GitHub
- About IDA, Hex-Rays
- Hopper Disassembler v4, Hopperv4
- ghidra, GitHub
- radare2, GitHub
- OllyDbg, OllyDbg
- Download Debugging Tools for Windows, Microsoft
- GDB: The GNU Project Debugger, gnu.org
- What is Cuckoo?, Cuckoo Sandbox
- Hybrid Analysis, Hybrid Analysis
- VirusTotal, VirusTotal
- Caution: Misuse of security tools can turn against you, Malwarebytes