In the previous article, “Portable Malware Lab for Beginners,” I spoke about nested virtual machines, i.e., deploying a virtual machine with QEMU and Cuckoo. This acts as a base system for our portable malware analysis lab.
However, malware analysis is not limited to executing a Windows binary; various other aspects are also involved. The main goal of malware is to gain privileged access to the system it intends to infect. To do so, it relies on various delivery methods, e.g.:
- Transmission via email.
- Infection via web pages or hacked web servers.
- Infection via removable media.
One may come across many email attachments containing a malicious file. The attachment can be a zip file that contains an exe, a PDF, or, in some cases, a Word file or spreadsheet. Malware authors also frequently mask file icons so that a file appears to belong to a specific application, e.g., a PDF icon for a binary. They may also use the right-to-left override (RLO) Unicode character to spoof the file extension.
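To make the right-to-left override trick concrete, here is a minimal Python 3 sketch that flags file names containing bidirectional control characters and reports the extension as it is actually stored. The function names and the sample file name are hypothetical and only serve as an illustration; this is not a complete detector.

```python
# Illustrative sketch: flag file names that contain Unicode bidirectional
# control characters such as RIGHT-TO-LEFT OVERRIDE (U+202E).

# Bidi control characters that can change the order in which a name is displayed.
BIDI_CONTROLS = {
    "\u202a",  # LEFT-TO-RIGHT EMBEDDING
    "\u202b",  # RIGHT-TO-LEFT EMBEDDING
    "\u202d",  # LEFT-TO-RIGHT OVERRIDE
    "\u202e",  # RIGHT-TO-LEFT OVERRIDE
}


def has_bidi_override(filename):
    """Return True if the name contains a bidi control character."""
    return any(ch in BIDI_CONTROLS for ch in filename)


def real_extension(filename):
    """Return the extension as stored in the name, ignoring display order."""
    _, dot, ext = filename.rpartition(".")
    return ext.lower() if dot else ""


if __name__ == "__main__":
    # Hypothetical name: a bidi-aware viewer may render it as
    # "invoice_exe.pdf", but the stored name still ends in ".exe".
    suspicious = "invoice_\u202efdp.exe"
    print(has_bidi_override(suspicious), real_extension(suspicious))
```

A check like this could be run over extracted attachments before anything is opened; a name that contains such a character and ends in an executable extension deserves a closer look.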
However, we will first go through the deployment of ssdeep and YARA. These tools will be helpful as you complete the integration of your portable malware lab. As said in the earlier article, the portable malware lab is not just an amalgamation of different tools; it is intended to help you build a system that contains all the tools necessary for analyzing malware. The output of several tools can also be integrated with other tools to provide a better overview of the behavior of malware.
YARA is a tool aimed at helping malware researchers to identify and classify malware samples. With YARA, you can create descriptions of malware families based on textual or binary patterns contained in samples of those families. Each description consists of a set of strings and a Boolean expression that determines its logic.
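The idea of "a set of strings and a Boolean expression" can be sketched in a few lines of Python 3. This is only a toy model of the concept and not the YARA engine or its rule syntax; the rule content (an MZ marker plus two API names) and all function names are assumptions chosen for illustration.

```python
# Toy model of a YARA-style description: named byte strings plus a Boolean
# condition over which of them were found. This is NOT the YARA engine.

def evaluate_rule(strings, condition, data):
    """Return True if `condition` holds for the patterns found in `data`.

    strings   -- dict mapping an identifier to a bytes pattern
    condition -- callable taking a dict {identifier: bool}
    data      -- bytes of the sample to scan
    """
    found = {name: pattern in data for name, pattern in strings.items()}
    return condition(found)


# Hypothetical rule: an "MZ" marker together with at least one of two
# API names.
EXAMPLE_STRINGS = {
    "mz": b"MZ",
    "api1": b"VirtualAlloc",
    "api2": b"WriteProcessMemory",
}


def example_condition(found):
    """Boolean expression of the hypothetical rule."""
    return found["mz"] and (found["api1"] or found["api2"])


if __name__ == "__main__":
    sample = b"MZ\x90\x00...VirtualAlloc..."
    print(evaluate_rule(EXAMPLE_STRINGS, example_condition, sample))
```

Real YARA rules add much more (hex patterns with wildcards, regular expressions, offsets, counts), but the split between pattern definitions and a condition is the same.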
SSDEEP is a program for computing context-triggered piecewise hashes (CTPH). Also called fuzzy hashes, CTPH can match inputs that have homologies. Such inputs have sequences of identical bytes in the same order, although bytes in between these sequences may be different in both content and length.
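To illustrate what "context-triggered piecewise" means, the following Python 3 sketch cuts the input into pieces wherever a small window of recent bytes hits a trigger value, and emits one character per piece. It is a deliberately simplified toy, not the ssdeep algorithm: the window size, block size, use of SHA-256, and the comparison via difflib are all assumptions made for readability.

```python
import hashlib
from difflib import SequenceMatcher

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
WINDOW = 7        # bytes of context that decide where a piece ends
BLOCK_SIZE = 64   # trigger modulus (assumed value for this toy)


def _piece_char(piece):
    """Map one piece of the input to a single signature character."""
    digest = hashlib.sha256(piece).digest()
    return ALPHABET[digest[0] % 64]


def toy_ctph(data):
    """Return a toy fuzzy signature: one character per piece of `data`."""
    signature = []
    piece_start = 0
    for i in range(len(data)):
        # The "context" is the sum of the last WINDOW bytes. Because it
        # depends only on nearby bytes, piece boundaries are decided by
        # local content and not by absolute offsets.
        window = data[max(0, i - WINDOW + 1): i + 1]
        if sum(window) % BLOCK_SIZE == BLOCK_SIZE - 1:
            signature.append(_piece_char(data[piece_start: i + 1]))
            piece_start = i + 1
    # Always emit one character for the trailing piece.
    signature.append(_piece_char(data[piece_start:]))
    return "".join(signature)


def similarity(sig_a, sig_b):
    """Score two signatures from 0 to 100 using difflib (an assumption)."""
    matcher = SequenceMatcher(None, sig_a, sig_b, autojunk=False)
    return int(round(100 * matcher.ratio()))


if __name__ == "__main__":
    original = bytes(range(256)) * 4
    appended = original + b"some extra bytes at the end"
    print(similarity(toy_ctph(original), toy_ctph(appended)))
```

Because every piece before a change hashes to the same character, two inputs that share long runs of identical bytes end up with signatures that share long runs of identical characters, which is what makes them comparable.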
Note: execute as root user.
Note: After executing this command, go to the directory that you used in the previous article for downloading and extracting the files.
# apt-get install ssdeep
# apt-get install python-pyrex python-all python-all-dev
Note: These are required for the pyssdeep installation.
# apt-get install subversion libapr1 libaprutil1 libdb4.8 libsvn1
# apt-get install libfuzzy-dev libfuzzy2
# svn checkout http://pyssdeep.googlecode.com/svn/trunk/ pyssdeep
# cd pyssdeep
# python setup.py build
# python setup.py install
$ sudo apt-get install libpcre3 libpcre3-dev
$ sudo apt-get install python-dev
$ wget http://yara-project.googlecode.com/files/yara-1.7.tar.gz
$ wget http://yara-project.googlecode.com/files/yara-python-1.7.tar.gz
Untar and configure YARA.
$ tar xvfz yara-1.7.tar.gz
$ cd yara-1.7
$ ./configure
If there are no errors, make the executables.
$ make
$ make check
$ sudo make install
$ cd ..
$ tar xvfz yara-python-1.7.tar.gz
$ cd yara-python-1.7
$ python setup.py build
$ sudo python setup.py install
You should now be able to call YARA from a shell prompt.
$ yara
usage: yara [OPTION]... [RULEFILE]... FILE
options:
-t <tag> print rules tagged as <tag> and ignore the rest. Can be used more than once.
-i <identifier> print rules named <identifier> and ignore the rest. Can be used more than once.
-n print only not satisfied rules (negate).
-g print tags.
-m print metadata.
-s print matching strings.
-d <identifier>=<value> define external variable.
-r recursively search directories.
-f fast matching mode.
-v show version information.
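If you plan to drive YARA from your own scripts, a thin wrapper around the command line is one option. The Python 3 sketch below builds an argument list using only the -r, -s, and -m flags from the usage text above; the function names and the example paths (rules.yar, samples/) are hypothetical.

```python
import shutil
import subprocess


def build_yara_command(rule_file, target, recursive=False,
                       show_strings=False, show_metadata=False):
    """Build an argv list using only flags from the yara usage text."""
    command = ["yara"]
    if recursive:
        command.append("-r")   # recursively search directories
    if show_strings:
        command.append("-s")   # print matching strings
    if show_metadata:
        command.append("-m")   # print metadata
    command.extend([rule_file, target])
    return command


def run_yara(command):
    """Run the command if yara is on PATH; return its stdout, else None."""
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # "rules.yar" and "samples/" are hypothetical paths.
    cmd = build_yara_command("rules.yar", "samples/", recursive=True,
                             show_strings=True)
    print(" ".join(cmd))
    print(run_yara(cmd))
```

Passing the arguments as a list avoids shell quoting problems when sample names contain spaces or other unusual characters, which is common with malware samples.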
Thug is a Python low-interaction honey-client based on a hybrid static/dynamic analysis approach.
Thug provides a DOM implementation which is (almost) compliant with W3C DOM Core, HTML, Events, Views, and Style specifications (Level 1, 2 and partially 3).
How to Install Rhino
# apt-get install ca-certificates-java default-jre-headless icedtea-6-jre-cacao icedtea-6-jre-jamvm java-common libjline-java libnss3-1d librhino-java openjdk-6-jre-headless openjdk-6-jre-lib rhino tzdata-java
# rhino -help
Valid options are:
-?, -help Displays help messages.
-w Enable warnings.
Set a specific language version.
-opt [-1|0-9] Set optimization level.
-f script-filename Execute script file, or "-" for interactive.
-e script-source Evaluate inline script.
Add path or URL to the CommonJS modules search path.
-main [module] Set CommonJS main module id or file name.
-sandbox Enable CommonJS sandbox mode.
-debug Generate debug code.
-strict Enable strict mode warnings.
-fatal-warnings Treat warnings as errors.
-encoding charset Use specified character encoding as default when reading scripts.
How to Install V8 and Thug along with the Various Tools
# mkdir tools
# cd tools
# svn checkout http://v8.googlecode.com/svn/trunk v8
# git clone git://github.com/buffer/thug.git
# cp thug/patches/V8-patch* .
# patch -p0 < V8-patch1.diff
patching file v8/src/log.h
Hunk #1 succeeded at 82 (offset 1 line).
# apt-get install graphviz libcdt4 libcgraph5 libgraph4 libgvc5 libgvpr1 libpathplan4
# apt-get install build-essential libboost-python-dev
# apt-get install libboost-all-dev
# svn checkout http://pyv8.googlecode.com/svn/trunk/ pyv8
# export V8_HOME=/path/tools/v8
# cd pyv8
# python setup.py build #Ignore warnings
# python setup.py install
# cd ..
# easy_install beautifulsoup4
# easy_install html5lib
# git clone git://git.carnivore.it/libemu.git
# apt-get install autoconf2.13 libltdl-dev libtool
# cd libemu
# autoreconf -v -i
# ./configure --prefix=/opt/libemu
# make install
# cd ..
# git clone git://github.com/buffer/pylibemu.git
# cd pylibemu
# python setup.py build
# python setup.py install
# easy_install pefile
# easy_install chardet
# easy_install httplib2
# easy_install cssutils
# easy_install zope.interface
# easy_install pyparsing==1.5.7
Note: This version pin is for Python 2.x. If you are using Python 3.x, run the following command instead:
# easy_install pyparsing
# easy_install pydot
# easy_install magic
Now test to see if it’s working. If you get the “ImportError: libemu.so.2: cannot open shared object file: No such file or directory” error, follow the solution below:
# touch /etc/ld.so.conf.d/libemu.conf
# echo "/opt/libemu/lib/" > /etc/ld.so.conf.d/libemu.conf
# ldconfig
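To double-check that the library directory has been registered with the dynamic linker configuration, a small Python 3 helper can scan the files in /etc/ld.so.conf.d. This is a sketch under the assumption that the directory is listed on its own line, as in the commands above; the function name is hypothetical.

```python
import glob
import os


def library_dir_configured(lib_dir, conf_dir="/etc/ld.so.conf.d"):
    """Return True if `lib_dir` is listed in any *.conf file in `conf_dir`."""
    wanted = os.path.normpath(lib_dir)
    for conf_path in sorted(glob.glob(os.path.join(conf_dir, "*.conf"))):
        with open(conf_path, "r") as handle:
            for line in handle:
                entry = line.strip()
                # Skip blank lines and comments.
                if not entry or entry.startswith("#"):
                    continue
                # normpath removes the trailing slash before comparing.
                if os.path.normpath(entry) == wanted:
                    return True
    return False
```

Calling library_dir_configured("/opt/libemu/lib/") should return True once the configuration file is in place. Note that the linker cache also has to be refreshed (with ldconfig) before the library is actually found at import time.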
Note: To execute Thug:
# cd thug/src
# python thug.py
Thug: Pure Python honeyclient implementation
python thug.py [ options ] url
-h, --help Display this help information
-V, --version Display Thug version
-u, --useragent= Select a user agent (see below for values, default: winxpie60)
-e, --events= Enable comma-separated specified DOM events handling
This screen grab shows the output from Thug along with the various sources that were downloaded:
Moving ahead, we now work with the tools related to deobfuscation/reversing of JAR files. The only decompiler that has worked for me so far is JD-GUI.
# wget http://jd.benow.ca/jd-gui/downloads/jd-gui-0.3.5.linux.i686.tar.gz
# mkdir jd-gui
# tar xvfz jd-gui-0.3.5.linux.i686.tar.gz -C /Path/to/your/jd-gui/
# cd jd-gui
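Before opening a sample in JD-GUI, it can help to see which classes it contains. Since a JAR file is a ZIP archive, Python's standard zipfile module is enough for a quick listing. The Python 3 sketch below is only an illustration; the function name is hypothetical and the script expects the path to a JAR as its first argument.

```python
import sys
import zipfile


def list_jar_classes(jar_source):
    """Return the sorted .class entries of a JAR (which is a ZIP archive).

    `jar_source` may be a file path or a file-like object.
    """
    with zipfile.ZipFile(jar_source) as jar:
        return sorted(name for name in jar.namelist()
                      if name.endswith(".class"))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python list_classes.py sample.jar
    for class_name in list_jar_classes(sys.argv[1]):
        print(class_name)
```

Listing the entries does not execute anything from the archive, so it is a reasonably safe first step before decompiling.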
The features of peepdf are outlined below, and more information can be found at http://code.google.com/p/peepdf/.
- Decodings: hexadecimal, octal, name objects
- More used filters
- References in objects and where an object is referenced
- Strings search (including streams)
- Physical structure (offsets)
- Logical tree structure
- Modifications between versions (changelog)
- Compressed objects (object streams)
- Shell code analysis (Libemu python wrapper, pylibemu)
- Variables (set command)
- Extraction of old versions of the document
- Checking hashes on VirusTotal
During the installation of Thug, we already deployed V8 and pylibemu, so we need not go through the entire process again. However, for peepdf to provide all of the functionality listed above, the “lxml” package must also be installed.
# pip install lxml
For more information about installation of “lxml,” refer to its installation guide at: http://lxml.de/installation.html
Now we proceed with the installation of “peepdf”.
# wget http://peepdf.googlecode.com/files/peepdf_0.2-BlackHatVegas.tar.gz
# tar xvfz peepdf_0.2-BlackHatVegas.tar.gz
# cd peepdf_0.2-BlackHatVegas
# python ./peepdf.py
Usage: ./peepdf.py [options] PDF_file
Version: peepdf 0.2 r158
-h, --help Shows this help message and exit.
-i, --interactive Sets console mode.
-s SCRIPTFILE, --load-script=SCRIPTFILE
Loads the commands stored in the specified file and
-f, --force-mode Sets force parsing mode to ignore errors.
-l, --loose-mode Sets loose parsing mode to catch malformed objects.
-u, --update Updates peepdf with the latest files from the
-g, --grinch-mode Avoids colorized output in the interactive console.
-v, --version Shows program’s version number.
-x, --xml Shows the document information in XML format.
Deploy Radare, Pyew, and Bokken
While researching, it is quite possible that researchers will come across a variety of samples, and they need not be of the same file type. Static analysis is as important as dynamic analysis, and this is where Bokken, Radare, and Pyew help us. Bokken is basically a GUI front end for the Pyew and Radare projects.
Pyew is a malware analysis tool developed in Python that provides a variety of features, including viewing HEX, disassembly, PE and ELF file formats, and code analysis. It also allows you to write scripts.
Radare, on the other hand, is used for disassembling, debugging, and a variety of other tasks.
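Since samples arrive in many file types and their extensions cannot be trusted, a quick look at the leading "magic" bytes helps decide which tool to open a sample with. The Python 3 sketch below covers only a handful of well-known signatures; the table is intentionally incomplete and the function names are hypothetical.

```python
# Minimal file-type sniffing by leading "magic" bytes.
# The table is a small, illustrative subset and not a full database.
MAGIC_SIGNATURES = [
    (b"MZ", "PE/DOS executable"),
    (b"\x7fELF", "ELF executable"),
    (b"%PDF", "PDF document"),
    (b"PK\x03\x04", "ZIP archive (also JAR and OOXML)"),
    (b"\xca\xfe\xba\xbe", "Java class file (or Mach-O universal binary)"),
    (b"FWS", "SWF (uncompressed)"),
    (b"CWS", "SWF (zlib-compressed)"),
]


def identify(header):
    """Return a label for the first bytes of a file, or 'unknown'."""
    for magic, label in MAGIC_SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def identify_file(path):
    """Read the first bytes of `path` and identify them."""
    with open(path, "rb") as handle:
        return identify(handle.read(8))


if __name__ == "__main__":
    print(identify(b"MZ\x90\x00"))
    print(identify(b"%PDF-1.4"))
```

A result of "PE/DOS executable" or "ELF executable" points toward Pyew or Radare, while a PDF or SWF result points toward the format-specific tools covered elsewhere in this article.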
# apt-get install bokken libdistorm64-1 libgtksourceview2.0-0 libgtksourceview2.0-common libradare2-0.9 pyew python-gtksourceview2 python-radare2
# apt-get install libtidy-0.99-0 tidy python-utidylib
# apt-get install radare radare-common
# apt-get install radare-gtk
Windows-Based Malware Analysis Applications
Since we are using a Linux system and numerous Windows programs are actively used for analyzing malware, let’s deploy WINE, a compatibility layer that runs Windows applications on Linux. By deploying WINE, we will be in a position to use a few of the Windows tools that researchers rely on. However, there are certain limitations to their use, depending on the packages you have selected to use with WINE.
# apt-get install wine
# wine --version
Now that we have deployed WINE, the first Windows application that we download and deploy is Malzilla. According to the author, it’s a malware hunting tool. However, summarizing the usefulness of Malzilla in a single sentence isn’t possible. Since most present-day malware and exploits are browser-based, Malzilla offers an excellent platform to analyze and reverse-engineer this type of malware.
Download Malzilla and extract the contents of the archive. No installation is required.
# cd Path_To_Malzilla
# wine ./malzilla.exe
# unzip Revelo.zip -d /Path/to/reveloJS/
# cd reveloJS
# unzip Documentation.zip
SWF Investigator is a tool developed by Adobe and is extensively used for analyzing SWF files. It also allows you to conduct static as well as dynamic analysis.
# wget http://labsdownload.adobe.com/pub/labs/swfinvestigator/swfinvestigator_p5_win_update_052213.exe
# wine ./swfinvestigator_p5_win_update_052213.exe
WINE will proceed with the further execution of the executable and the rest of the installation is just like any other Windows application installation.
These two articles were written with the intention of helping you create your own portable malware analysis lab. Since the lab depends heavily on virtual machines, it is recommended that you maintain proper backups of all the virtual hard disks.
Also, numerous tools available for *nix/Windows have not been included, but they can always be used within this environment, either by utilizing the power of WINE or by using the nested-VM method described earlier to deploy an MS Windows OS and the Windows-specific tools.
Note: IDA Pro and OllyDbg function best within the MS Windows environment.