1. Introduction

Nowadays there are various threats in the wild that try to get malware installed on victims' operating systems. Most of them combine some kind of social engineering with a means of actually executing the malicious code, such as JavaScript, malicious PDF documents, malicious Microsoft Office documents, etc.

Of course, if we want the malicious code to execute, we must exploit some kind of vulnerability that exists in web browsers (if we’re propagating malware with JavaScript), Microsoft Word (if we’re propagating malware with .doc documents), Adobe PDF Reader (if we’re propagating malware with .pdf files), etc.

All in all, vulnerabilities are exploited through malicious input data, and all programs accept some kind of input data: web browsers accept web sites, Microsoft Word accepts .doc documents, Adobe PDF Reader accepts .pdf files, etc. Therefore, if we can construct malicious input data that exploits a vulnerability present in one of these programs, we can execute arbitrary code.

Here we’ll take a look at the malicious JavaScript code that tries to execute arbitrary instructions on the target operating system.

2. JavaScript

JavaScript is an important target for analysis, because we spend a considerable amount of our time in web browsers. Since web browsers understand, accept and execute JavaScript, we can feed a URI to the victim and wait for him/her to click on it. Upon the click, we can send arbitrary malicious JavaScript to the victim, which will be executed in the web browser. We're not limited to JavaScript; we can use any language that web browsers understand, but we're using JavaScript because we can do pretty much anything with it.

If we're using JavaScript, we're not limited to web browsers. We can embed malicious JavaScript in any kind of input data passed to an application that understands it. Thus, we can embed JavaScript into PDF documents, SWF files, etc.

Attackers will often obfuscate the JavaScript embedded in a document to make it harder to analyze. In such cases we can use a deobfuscator to beautify the JavaScript code in order to make it more readable and thus easier to understand.

SpiderMonkey is Mozilla's stand-alone JavaScript engine, implemented as a C/C++ library. We can use it to analyze any JavaScript code, which is far safer than executing the code directly in a web browser.

2.1. De-Obfuscating JavaScript Manually

Usually, attackers obfuscate their JavaScript code so that it isn't readable anymore. An example of such code can be seen below:

<script>
function deobfuscate(input) {
    // deobfuscation code
}
eval(deobfuscate('23230433239 … '));
</script>

The code above is only a skeleton, but we can see what it does: it passes a string of digits to the deobfuscate() function, which turns the digits into real JavaScript code that eval() then executes. It's evident that we need to get hold of the deobfuscated JavaScript that is executed every time the page is loaded, but how?

The answer is to redefine the eval function so that it becomes the print function. This prints the deobfuscated code rather than executing it. To do that, we need to copy the above code into a separate file (just the JavaScript code, without the opening and closing <script> tags) and add the line below at the top of that file:

eval = print

This redefines eval as print. After that we could open the file with a web browser, but a better way is to use the js command that comes with SpiderMonkey, like below:

# js example.js

SpiderMonkey will then execute the deobfuscate() function and print the result on the screen instead of executing it. Now we can start analyzing the deobfuscated JavaScript code and take a look at what the attacker was trying to achieve.
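The eval hook described above can be sketched as a small self-contained script. This is only an illustration under stated assumptions: the deobfuscate() body here is a hypothetical hex decoder (the real sample's decoder is not shown), and the hook is installed through globalThis.eval so that the same file also runs under Node.js, where print() doesn't exist.

```javascript
// Hypothetical decoder for illustration: every two hex digits encode one character.
function deobfuscate(input) {
    var out = '';
    for (var i = 0; i < input.length; i += 2) {
        out += String.fromCharCode(parseInt(input.slice(i, i + 2), 16));
    }
    return out;
}

// The SpiderMonkey shell provides print(); fall back to console.log elsewhere.
var emit = (typeof print === 'function') ? print : console.log;

var captured = [];                 // every string that would have been executed
var realEval = globalThis.eval;    // keep the real eval so we can restore it
globalThis.eval = function (code) {
    captured.push(String(code));
    emit(code);                    // show the decoded code instead of running it
};

// The "malicious" line from the page: it now prints alert(1) instead of running it.
eval(deobfuscate('616c657274283129'));

globalThis.eval = realEval;        // restore the real eval when done
```

Keeping the captured strings in an array makes it easy to write them to a file later, which is handy when the decoded code is too long to read on the screen.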

2.2. De-Obfuscating JavaScript with Jsunpack

Jsunpack can be used to de-obfuscate obfuscated JavaScript code automatically. It is a web application into which we can paste the obfuscated JavaScript code directly. The web application then analyzes the code and presents the result back to us. The Jsunpack web application can be seen in the picture below:

There are multiple input elements that the web application accepts. We can paste the JavaScript code directly into the input box, we can provide a URL to a webpage that uses obfuscated JavaScript, and we can even upload PDF, PCAP, SWF, HTML or JavaScript files that will be analyzed automatically. The private checkbox can be enabled if we don't want the code to be released to the public and made generally available. The privacy link right beside it gives a full explanation of that option, which can be seen in the picture below:

At the bottom of the page there are also three links. The first one is named “Blog” and points to the Jsunpack blog. The second one is named “Source Code” and points to the Google Code website of the Jsunpack-n tool. The third link is named “Recent Submissions” and points to the obfuscated malicious JavaScript code that was recently submitted; this is also presented in the picture below:

On the left side are recent submissions that don’t contain any malicious JavaScript code and on the right side are malicious JavaScript code submissions. If we click on one of the examples, there will be a detailed description of the obfuscated JavaScript code with a download link, which we can use to download a zip archive that contains the malicious JavaScript code.

Let's pick the PDF document referenced as 61.4.82.210_37.pdf in the malicious uploads and download its zip archive, which contains the file presented below:

The first thing we want to do is to categorize the file based on the header information. We can do that with the file command, which says that the file is a PDF document:

# file c41f10c79ccea7432987a9d7050604a3eb47
c41f10c79ccea7432987a9d7050604a3eb47: PDF document, version 1.2
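The header check that file performs for PDF documents can be approximated in a few lines. This is a minimal sketch that assumes Node.js (for Buffer); the pdfVersion() helper is a hypothetical name, and the real file command relies on a much larger database of signatures.

```javascript
// Returns the version from a PDF header such as "%PDF-1.2", or null if absent.
function pdfVersion(bytes) {
    // Only the first few bytes matter; latin1 keeps every byte as one character.
    var header = Buffer.from(bytes).subarray(0, 16).toString('latin1');
    var match = /^%PDF-(\d\.\d)/.exec(header);
    return match ? match[1] : null;
}

// A made-up PDF header followed by the usual binary comment line.
var sample = Buffer.from('%PDF-1.2\n%\xe2\xe3\xcf\xd3\n', 'latin1');
var version = pdfVersion(sample);

// The start of a Windows executable is not a PDF header.
var notPdf = pdfVersion(Buffer.from('MZ\x90\x00', 'latin1'));

console.log(version, notPdf);
```

In practice we would pass in the result of fs.readFileSync() for the sample we want to classify.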

After that it’s time to download the jsunpack-n program, which emulates browser functionality when visiting a URL. It can detect malicious code that can be used to exploit a web browser and browser plugins. After we’ve downloaded the tool and installed all dependencies (as noted in the INSTALL file), we can run jsunpack-n, which has the options presented below:

# ./jsunpackn.py -h
Usage:
        ./jsunpackn.py [fileName]
        ./jsunpackn.py -i [interfaceName]
        jsunpack-network version 0.3.2c (beta)
        [warning] pynids is disabled, while you cannot process pcap files or a network interface, you can still process JavaScript/HTML files
Options:
  -h, --help            show this help message and exit
  -t TIMEOUT, --timeout=TIMEOUT
                        limit on number of seconds to evaluate JavaScript
  -r REDOEVALTIME, --redoEvalLimit=REDOEVALTIME
                        maximium evaluation time to allow processing of
                        alternative version strings
  -m MAXRUNTIME, --maxRunTime=MAXRUNTIME
                        maximum running time (seconds; cumulative total). If
                        exceeded, raise an alert (default: no limit)
  -f, --fast-evaluation
                        disables (multiversion HTML,shellcode XOR) to improve
                        performance
  -u URLFETCH, --urlFetch=URLFETCH
                        actively fetch specified URL (for fully active fetch
                        use with -a)
  -d OUTDIR, --destination-directory=OUTDIR
                        output directory for all suspicious/malicious content
  -c CONFIGFILE, --config=CONFIGFILE
                        configuration filepath (default options.config)
  -s, --save-all        save ALL original streams/files in output dir
  -e, --save-exes       save ALL executable files in output dir
  -a, --active          actively fetch URLs (only for use with pcap/file/url
                        as input)
  -p PROXY, --proxy=PROXY
                        use a random proxy from this list (comma separated)
  -P CURRENTPROXY, --currentproxy=CURRENTPROXY
                        use this proxy and ignore proxy list from --proxy
  -q, --quiet           limited output to stdout
  -v, --verbose         verbose mode displays status for all files and
                        decoding stages, without this option reports only
                        detection
  -V, --very-verbose    shows all decoding errors (noisy)
  -g GRAPHFILE, --graph-urlfile=GRAPHFILE
                        filename for URL relationship graph, 60 URLs maximium
                        due to library limitations
  -i INTERFACE, --interface=INTERFACE
                        live capture mode, use at your own risk (example eth0)
  -D, --debug           (experimental) debugging option, do not delete
                        temporary files
  -J, --javascript-decode-disable
                        (experimental) dont decode anything, if you want to
                        just use the original contents

Now we can run jsunpack-n on our malicious PDF file as follows:

# ./jsunpackn.py c41f10c79ccea7432987a9d7050604a3eb47
[suspicious:5] [PDF] c41f10c79ccea7432987a9d7050604a3eb47
        suspicious: PDFobfuscation detected Collab[
        file: decoding_0c3be4288226f0bd341d8692d02a42652e9109e1: 78750 bytes
        file: original_f21cc41f10c79ccea7432987a9d7050604a3eb47: 13565 bytes

We can see that the original PDF file was written to temp/files/original_f21cc41f10c79ccea7432987a9d7050604a3eb47, while the decoded JavaScript was written to temp/files/decoding_0c3be4288226f0bd341d8692d02a42652e9109e1. The suspicious function uses the string Collab and is presented below:

function S7aL(u713,u714){Collab['\u0067\u0065\u0074\u0049\u0063\u006f\u006e'](u714+u713);}

If we translate the Unicode encoding into ASCII we get the following JavaScript code:

function S7aL(a,b){ Collab['getIcon'](b+a); }
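Translating the Unicode escapes by hand gets tedious for longer scripts, so the step can be automated. The sketch below is an illustration only: decodeUnicodeEscapes() is a hypothetical helper, and it handles just the four-digit \uXXXX form.

```javascript
// Replace every \uXXXX escape sequence with the character it stands for.
function decodeUnicodeEscapes(source) {
    return source.replace(/\\u([0-9a-fA-F]{4})/g, function (match, hex) {
        return String.fromCharCode(parseInt(hex, 16));
    });
}

// The obfuscated call from the sample; backslashes are doubled because this
// is a string literal holding the raw source text.
var obfuscated = "Collab['\\u0067\\u0065\\u0074\\u0049\\u0063\\u006f\\u006e'](u714+u713);";
var readable = decodeUnicodeEscapes(obfuscated);

console.log(readable);   // Collab['getIcon'](u714+u713);
```

The argument names u713 and u714 are left alone because they aren't preceded by a backslash.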

But why is this suspicious? It's only calling the getIcon() method. A quick search gives us the answer: there's a remote code execution vulnerability in Adobe Reader that is triggered by calling Collab.getIcon() (CVE-2009-0927). This can also be seen in the picture below:

It’s indeed the right choice to flag this PDF document as malicious.

There are also various other options we can use when running jsunpack-n. One interesting option is --timeout, which limits the number of seconds spent evaluating JavaScript; this is useful if the JavaScript uses a heap spraying technique. The default timeout is 30 seconds, after which, if processing is still not finished, the evaluation is terminated and the results gathered so far are presented. If we run the above analysis with the very verbose flag (-V) set, we get the output below:

# ./jsunpackn.py -V c41f10c79ccea7432987a9d7050604a3eb47
[malicious:10] [PDF] c41f10c79ccea7432987a9d7050604a3eb47
        info: [decodingLevel=0] JavaScript in PDF 78663 bytes, with 87 bytes headers
        suspicious: PDFobfuscation detected Collab[
        info: [decodingLevel=1] found JavaScript
        error: undefined variable DDGfx
        info: Decoding option app.viewerVersion=9.1,    0 bytes
        info: Decoding option app.viewerVersion=8.0 and app.viewerVersion=7.0,  56 bytes
        info: Decoding option app.viewerVersion=,       42 bytes
        malicious: Utilprintf CVE-2008-2992 detected
        malicious: Alert detected //alert CVE-2008-2992 util.printf length (7,undefined)
        info: [2] no JavaScript
        info: file: saved ../c41f/c41f10c79ccea7432987a9d7050604a3eb47 to (./temp/files/original_f21cc41f10c79ccea7432987a9d7050604a3eb47)
        file: decoding_0c3be4288226f0bd341d8692d02a42652e9109e1: 78750 bytes
        file: decoding_9ff1f85b784f0684a5ddae6d96d0c9da5302fab1: 56 bytes
        file: original_f21cc41f10c79ccea7432987a9d7050604a3eb47: 13565 bytes

Let's compare the results with the online version of jsunpack. The online analysis of the same PDF document can be seen in the picture below:

We can see that the detected vulnerabilities differ between the jsunpack-n command line tool and the online version of jsunpack. Why is that? It's simply because the online version uses the -f argument, which improves performance by evaluating the PDF document with a limited range of PDF Reader version numbers. If we add that option to the jsunpack-n command line, we get the same result, as we can see below:

# ./jsunpackn.py -V -f c41f10c79ccea7432987a9d7050604a3eb47
[malicious:10] [PDF] c41f10c79ccea7432987a9d7050604a3eb47
        info: [decodingLevel=0] JavaScript in PDF 78663 bytes, with 87 bytes headers
        suspicious: PDFobfuscation detected Collab[
        info: [decodingLevel=1] found JavaScript
        error: undefined variable DDGfx
        info: Decoding option app.viewerVersion=9.1,    0 bytes
        info: Decoding option app.viewerVersion=,       42 bytes
        malicious: collectEmailInfo CVE-2007-5659 detected
        info: [2] no JavaScript
        info: file: saved c41f10c79ccea7432987a9d7050604a3eb47 to (./temp/files/original_f21cc41f10c79ccea7432987a9d7050604a3eb47)
        file: decoding_0c3be4288226f0bd341d8692d02a42652e9109e1: 78750 bytes
        file: decoding_4074b66fea076c2f3fba4f4c05eb3f7329f4bde4: 42 bytes
        file: original_f21cc41f10c79ccea7432987a9d7050604a3eb47: 13565 bytes
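Why the list of emulated versions changes the findings can be shown with a toy example. This is only a sketch of the general idea, not jsunpack-n's implementation: the app stub, the evalsForVersion() helper and the exploit names are all hypothetical, and the sample is run with eval replaced by a recording hook.

```javascript
// Run `source` with a stub `app` object and a hooked eval; return what it tried to eval.
function evalsForVersion(source, viewerVersion) {
    var seen = [];
    var app = { viewerVersion: viewerVersion };   // stand-in for the Reader app object
    var hookedEval = function (code) { seen.push(String(code)); };
    // Inside the generated function, `app` and `eval` refer to our stand-ins.
    new Function('app', 'eval', source)(app, hookedEval);
    return seen;
}

// Toy sample that picks a different payload depending on the viewer version.
var sample =
    'if (app.viewerVersion >= 8 && app.viewerVersion < 9) { eval("exploitForV8()"); }' +
    'else if (app.viewerVersion < 8) { eval("exploitForV7()"); }';

var results = [9.1, 8.0, 7.0].map(function (v) {
    return { version: v, evals: evalsForVersion(sample, v) };
});

console.log(JSON.stringify(results));
```

A run limited to version 9.1 sees nothing here, while the runs for 8.0 and 7.0 each reveal a different payload, which is why dropping versions for speed can change what is reported.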

Now the same vulnerability is detected by both versions of the jsunpack tool. It's also worth looking at the contents of the decoded files. The contents of the file decoded by the online version of the jsunpack tool are presented in the picture below:

The first decoded file 0c3be4288226f0bd341d8692d02a42652e9109e1 is shown below:

We didn't present the whole file, just the first part of it, which is enough to say that the files are the same. If we look at the picture above, we can see that it starts with "var BseFa", which is exactly the same as the first decoded file in the previous picture.

jsunpack-n also decoded another file, which can be seen in the picture below:

This time the only content of the file is a comment about the Collab.collectEmailInfo vulnerability that was found in the malicious PDF document. The decoding presumably didn't continue the way we'd want (with other files being found, as with the online version of the tool) because we have a different version of the pre.js script, one that isn't as complete as the one used by the online version of the tool.

The decoded files above represent the iterations of the deobfuscation process. The first file, which starts with c41f, is the actual downloaded PDF document. If there is only one decoded file, it means that Jsunpack didn't detect any encoded data and didn't decode anything; it just displays the contents it found. But if there are multiple extracted files, we can be sure that the data within the document was encoded somehow. Attackers usually encode their data to hide the content when sending exploits to target machines. If the attacker is trying to hide something, he will create two or more layers of encoding, which Jsunpack can detect.

The Jsunpack tool can detect up to five decoding levels, which results in up to five files. The more levels there are, the more likely it is that the attacker is trying to hide something and that the document is indeed holding something malicious.
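The idea of one file per decoding level can be illustrated with a toy unpacker that peels off one eval layer at a time. This is a sketch under stated assumptions rather than Jsunpack's actual algorithm: collectStages() is a hypothetical helper, each layer is assumed to make at most one eval call, and the code really executes each layer (with only eval replaced), so it should be fed harmless test input only.

```javascript
// Collect each layer of nested eval() calls, up to maxStages layers.
function collectStages(source, maxStages) {
    var stages = [];
    var current = source;
    while (current !== null && stages.length < maxStages) {
        stages.push(current);
        var next = null;
        var hookedEval = function (code) { next = String(code); };
        // Run this layer with `eval` shadowed by the hook so the next
        // layer is recorded instead of being executed.
        new Function('eval', current)(hookedEval);
        current = next;
    }
    return stages;
}

// Build a harmless two-layer sample around a trivial payload.
var payload = 'var x = 1;';
var layer1 = 'eval(' + JSON.stringify(payload) + ');';
var layer0 = 'eval(unescape("' + escape(layer1) + '"));';

var stages = collectStages(layer0, 5);
console.log(stages.length + ' stages, last: ' + stages[stages.length - 1]);
```

Here the result contains three entries: the outer layer, the inner layer, and the final payload, which mirrors how each decoding level ends up in its own file.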

3. Conclusion

We've seen that Jsunpack can be a great help with decoding obfuscated JavaScript in PDF files, and it should be a mandatory tool when analyzing possibly malicious PDF documents.