1. Introduction

Yara is a tool that helps us identify and classify malware samples by means of rules. We can use Yara to scan files or running processes and determine which malware family they belong to.

To install Yara, we download the source archive, unpack it, and build it with the following commands:

# wget http://yara-project.googlecode.com/files/yara-1.6.tar.gz
# tar xvzf yara-1.6.tar.gz
# cd yara-1.6
# ./configure
# make
# sudo make install

Afterwards, we can run the yara command, which without arguments displays its usage as shown below:

$ yara
usage:  yara [OPTION]... [RULEFILE]... FILE | PID
options:
  -t <tag>                  print rules tagged as <tag> and ignore the rest. Can be used more than once.
  -i <identifier>           print rules named <identifier> and ignore the rest. Can be used more than once.
  -n                        print only not satisfied rules (negate).
  -g                        print tags.
  -m                        print metadata.
  -s                        print matching strings.
  -l <number>               abort scanning after a <number> of rules matched.
  -d <identifier>=<value>   define external variable.
  -r                        recursively search directories.
  -f                        fast matching mode.
  -v                        show version information.

Report bugs to: <victor.alvarez@virustotal.com>

We can see that in order to run Yara, we need to supply one or more rule files (RULEFILE) that we want to apply, followed by either the path of the file (FILE) or the process ID (PID) of the process we want to scan.
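
The invocation pattern above can be sketched as a small Python helper that assembles the argument list. This is only an illustration: the function name build_yara_command and its parameters are hypothetical, and only the -r, -s and -m flags shown in the usage text are used.

```python
def build_yara_command(rule_files, target, recursive=False,
                       show_strings=False, show_metadata=False,
                       yara_bin="yara"):
    """Assemble a yara command line: yara [OPTION]... [RULEFILE]... FILE | PID.

    Hypothetical helper; it only builds the argument list and runs nothing.
    """
    cmd = [yara_bin]
    if recursive:
        cmd.append("-r")      # recursively search directories
    if show_strings:
        cmd.append("-s")      # print matching strings
    if show_metadata:
        cmd.append("-m")      # print metadata
    cmd.extend(rule_files)    # one or more rule files
    cmd.append(str(target))   # file path, directory, or PID
    return cmd


# Print the command that would scan /home/user/ recursively with test.yara.
print(" ".join(build_yara_command(["test.yara"], "/home/user/", recursive=True)))
```

The returned list can be passed to subprocess.run() if we want to drive Yara from a script rather than from the shell.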

2. The ClamAV Rules

We now need to get our hands on a rules file in order to use Yara. In the next section, I will describe the syntax used in the rules file, allowing you to create your own rules. However, it’s far easier to reuse the ClamAV signatures. The only problem is that we can’t use them directly with Yara, because Yara has its own rule syntax. This is where the script clamav_to_yara.py, written by Matthew Richard, comes into play: it converts ClamAV signatures into Yara rules. To get it, we clone the SVN repository of the Malware Analyst’s Cookbook, which includes the clamav_to_yara.py Python script. Running the script without arguments prints its usage:

# python clamav_to_yara.py

###########################################################################
        Malware Analyst's Cookbook - ClamAV to YARA Converter 0.0.1

###########################################################################

Usage: clamav_to_yara.py [options]

Options:
  -h, --help            show this help message and exit
  -f FILENAME, --file=FILENAME
                        scanned FILENAME
  -o OUTFILE, --output-file=OUTFILE
                        output filename
  -v, --verbose         verbose
  -s SEARCH, --search=SEARCH
                        search filter
Usage: clamav_to_yara.py [options]

clamav_to_yara.py: error: You must supply a filename!

Next, we need to download the main ClamAV signature database and unpack it:

# wget http://database.clamav.net/main.cvd
# sigtool --unpack main.cvd

To convert the ClamAV signatures into Yara rules, we run the clamav_to_yara.py script as shown below:

# python clamav_to_yara.py -f main.ndb -o test.yara -s Agent
[+] Read 64556 lines from main.ndb
[+] Wrote 4229 rules to test.yara

We can see that the tool wrote 4,229 rules to the output file test.yara. Now we can scan a directory with Yara and the new rules with the command below:

# ./yara -r test.yara /home/user/
Trojan_Agent_78 /home/user/setup.exe

We can see that the file /home/user/setup.exe matched the rule Trojan_Agent_78, so it is flagged as a Trojan, although the match alone doesn’t tell us what the malware actually does. We can also see that the rule name contains the keyword Agent, as we specified when converting the signatures from the main.ndb file. If we omitted the “-s Agent” parameter, the script would convert every signature in main.ndb and Yara would then check all of them, which would take considerably longer. Thus, we’re checking only those signatures that contain the string ‘Agent’.
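
To make the conversion step more concrete, here is a minimal Python sketch of how a single .ndb line could be turned into a Yara rule. It is not the code of clamav_to_yara.py. It assumes the ClamAV extended signature layout MalwareName:TargetType:Offset:HexSignature, it only handles plain hex signatures (anything with ClamAV wildcards is skipped), and it ignores the target type and offset fields. The function name ndb_line_to_yara and the sample line are hypothetical.

```python
import re


def ndb_line_to_yara(line, search=None):
    """Convert one ClamAV .ndb line into a Yara rule, or return None if skipped.

    Simplified sketch: only plain hex signatures are converted.
    """
    fields = line.strip().split(":")
    if len(fields) < 4:
        return None                      # not a valid .ndb entry
    name, _target, _offset, hexsig = fields[:4]
    if search and search not in name:
        return None                      # mimic a "-s" style name filter
    if not re.fullmatch(r"(?:[0-9a-fA-F]{2})+", hexsig):
        return None                      # wildcards are out of scope here
    # Yara identifiers may only contain letters, digits and underscores.
    rule_name = re.sub(r"[^A-Za-z0-9_]", "_", name)
    if rule_name[0].isdigit():
        rule_name = "_" + rule_name
    # Yara hex strings are written as space-separated byte pairs.
    hex_bytes = " ".join(hexsig[i:i + 2] for i in range(0, len(hexsig), 2))
    return "\n".join([
        "rule %s" % rule_name,
        "{",
        "    strings:",
        "        $a = { %s }" % hex_bytes,
        "    condition:",
        "        $a",
        "}",
    ])


# Hypothetical sample line, not taken from the real main.ndb.
print(ndb_line_to_yara("Trojan.Agent-78:1:*:4d5a9000", search="Agent"))
```

The name sanitizing step is why a ClamAV name such as Trojan.Agent-78 would show up in Yara output as Trojan_Agent_78.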

3. The PEiD Rules

We can also easily convert PEiD signatures to Yara rules and use Yara to check which packer or encoder was applied to a possibly malicious executable. Knowing the packer is a great help, because we can later use that information to unpack the executable into its original form.

The PEiD signatures can be downloaded from the userdb web site. To convert those signatures to Yara rules, we can simply use the peid_to_yara.py Python script, which can be downloaded from malwarecookbook. We then do the conversion by executing the following command:

# python peid_to_yara.py -f userdb.txt -o peid.yara

After the command completes, the Yara rules will be in the peid.yara output file. Alternatively, we can download ready-made PEiD rules from the yara-project website. We can then use the same yara command as shown above to check for files packed with a supported packer or encoder. To test this, we can take the setup.exe file, pack it with the UPX packer, and run Yara with the peid.yara rules to see whether the packer is recognized. In this case, UPX should be detected.
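
As a rough illustration of what such a conversion involves, the Python sketch below parses entries in the userdb.txt layout (a [name] line followed by signature = ... and ep_only = ... lines) and emits one Yara rule per entry. It is not the code of peid_to_yara.py: the function name peid_to_yara, the rule naming scheme and the sample entry are hypothetical, duplicate names are not handled, and mapping ep_only = true to "at entrypoint" is an assumption about how the Yara 1.x rule syntax would express it.

```python
import re


def peid_to_yara(text):
    """Turn PEiD userdb-style entries into a list of Yara rule strings."""
    rules = []
    name = sig = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            name, sig = line[1:-1].strip(), None    # start of a new entry
        elif line.lower().startswith("signature") and "=" in line:
            sig = line.split("=", 1)[1].strip()     # hex bytes, ?? = wildcard
        elif line.lower().startswith("ep_only") and "=" in line and name and sig:
            ep_only = line.split("=", 1)[1].strip().lower() == "true"
            # Yara identifiers: letters, digits, underscores only.
            rule_name = re.sub(r"[^A-Za-z0-9_]", "_", name)
            if rule_name[0].isdigit():
                rule_name = "_" + rule_name
            # Assumption: entry-point-only signatures map to "at entrypoint".
            condition = "$a at entrypoint" if ep_only else "$a"
            rules.append("\n".join([
                "rule %s" % rule_name,
                "{",
                "    strings:",
                "        $a = { %s }" % sig,
                "    condition:",
                "        %s" % condition,
                "}",
            ]))
            name = sig = None
    return rules


# Hypothetical entry in userdb.txt layout, not a real PEiD signature.
sample = "[UPX Sample]\nsignature = 60 BE ?? ?? 8D BE\nep_only = true\n"
print("\n\n".join(peid_to_yara(sample)))
```

PEiD signatures already use ?? for unknown bytes, which is also how Yara hex strings express wildcards, so the byte pattern can be copied over unchanged.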

Packing the file:

# upx -1 -o setup_upx.exe setup.exe
                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2010
UPX 3.07        Markus Oberhumer, Laszlo Molnar & John Reiser   Sep 08th 2010

        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
    463152 ->    268592   57.99%    win32/pe     setup_upx.exe

Packed 1 file.

After that, we can run Yara with the PEiD rules to check if it can detect the packed executable. To do that, we need to issue the command below:

# ./yara peid.yara setup_upx.exe
UPX setup_upx.exe

We can see that setup_upx.exe was detected as packed with UPX, which is correct. We could also have used the -r option to scan a whole directory, but this wasn’t necessary here, since we only wanted to show that Yara can now detect that setup_upx.exe is packed with UPX.

After all this, we can classify malware samples using Yara alone, without scanning them separately with ClamAV and PEiD, because Yara now has the rules converted from both loaded. If we run a honeypot, we can classify the captured malware automatically with Yara. This is very useful when we quickly need a malware sample of a specific category.

Let’s say we’re trying to learn how to analyze malicious PDF documents. We can get samples from various places on the Internet, including jsunpack, but wouldn’t it be great if we already had the samples stored in a directory somewhere on the hard drive? With Yara classifying the collected samples by category, this is easy to achieve.
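
One way to get there is to group Yara’s output by rule name, so that every sample ends up listed under its category. The Python sketch below assumes Yara’s default output of one "RULE path" pair per line, as in the transcripts above; the function name group_matches is hypothetical, and the sample output simply reuses the matches shown earlier.

```python
from collections import defaultdict


def group_matches(yara_output):
    """Group 'RULE path' lines from yara's default output by rule name."""
    groups = defaultdict(list)
    for line in yara_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first space only, so paths may contain spaces.
        rule, sep, path = line.partition(" ")
        if not sep:
            continue                      # not a "RULE path" line
        groups[rule].append(path)
    return dict(groups)


output = "Trojan_Agent_78 /home/user/setup.exe\nUPX setup_upx.exe\n"
for rule, paths in group_matches(output).items():
    print(rule, paths)
```

From such a mapping, copying each file into a directory named after its rule (for example with shutil.copy) gives us a sample collection sorted by category.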

4. Conclusion

In this article we showed how to use Yara with rules converted from ClamAV and PEiD signatures to scan files for malicious content. This approach is based on signature matching only, which means it isn’t hard to fool Yara (with the ClamAV and PEiD rules loaded) into treating a malicious file as clean. Signature matching can only detect known malicious software; if we write our own program or encode it with our own encoder, it will probably not be detected, since Yara has no matching signature loaded.
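
The weakness is easy to demonstrate with a toy example. The Python sketch below treats a signature as a plain byte sequence and shows that a trivial single-byte XOR encoding of the sample is enough to make the match fail. The signature bytes, the sample contents and the function names are made up for illustration; real rules and real packers are more elaborate, but the principle is the same.

```python
def signature_matches(data, signature):
    """Return True if the byte signature occurs anywhere in the data."""
    return signature in data


def xor_encode(data, key):
    """Encode every byte with a single-byte XOR key (a trivial 'encoder')."""
    return bytes(b ^ key for b in data)


# Made-up signature and sample, for illustration only.
signature = bytes.fromhex("4d5a9000")
sample = b"header" + signature + b"body"
encoded = xor_encode(sample, 0x41)

print(signature_matches(sample, signature))    # True: known bytes are present
print(signature_matches(encoded, signature))   # False: same content, re-encoded
```

Applying xor_encode a second time with the same key restores the original bytes, which is what a custom decoder stub would do at run time.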

Nevertheless, using Yara to detect malicious files is still beneficial: most malware on the Internet consists of standard malicious files without additional obfuscation, so the majority of malicious files can be detected.

In order to prevent malware from infecting our system, we need to install at least one antivirus product. On Linux, we could use ClamAV or F-Prot, which is a free alternative for Linux users. But even with an antivirus installed, we can never be 100% secure, since a malicious user can easily write a new virus that is not yet detected. The best way to protect a server is to use antivirus software to block known attacks and to stay constantly alert for new malware that may not be detected.