Part Two in a multi-part series on holistic, multi-disciplinary analysis and reversing.

You can read part one of this series here.

The last post, “Mutex Analysis: The Canary in the Coal Mine,” started off showing how you can use mutexes to discover malware that is difficult to locate using more traditional methods and tools. We used a live compromised system for the example and the post came to a relatively abrupt end when it seemed that we stumbled onto a new/unknown type of malware – or at least one that does not seem to have any public exposure or analysis. This post will be “part 2” of our analysis.

## Some Logic Behind Memory Crash Dump Analysis

In the last post we forced a crash dump of the process (services.exe) that seemed to contain injected malware, based on the mutex analysis steps proposed. I mentioned examining the memory heaps as a first step in scenarios like this. Before picking up exactly where the first post left off, let’s discuss this point a bit.

The first of the three main types of memory segments is the “code” (aka: “text”) segment. Simplistically, this will be the program itself and its associated DLLs, taken from the file system and properly realigned for memory addressing and alignment. In normal cases, these segments are relatively static.

At the other end of the spectrum are “stack” segments. These segments are extremely volatile since they temporarily hold variables being manipulated inside specific functions as a program executes.

Between those two types of segments (from a volatility perspective) are heap segments. The heap is an internal memory pool from which a program dynamically allocates memory as needed. Heap blocks are allocated and freed in an arbitrary order, and the pattern of allocation and the size of blocks are not known until run time. A program typically uses the heap for many different purposes, since it functions as shared memory that is modified throughout execution.

Based on the above [over-simplified] background information, it stands to reason that if something is maliciously injected or otherwise hidden in another process’s memory space, it (or traces of it) can probably be found in a heap block.

Again, the two tools we’ll use in this article for memory and dump analysis will be PEBrowse and WinDbg.

While using PEBrowse, we’ll typically have many sub-windows open in the right-side pane, as the following screenshot illustrates. However, all PEBrowse screenshots in this article (after this one) will only show the window parts most relevant to our discussion.

On the other hand, WinDbg is a hybrid CLI/GUI application, as shown in the following screenshot. However, rather than show screenshots, I will likely be pasting text output into this article to make formatting easier.

## Simple Initial “Scanning” of Our Crash Dump Using PEBrowse

Loading the dump of the process we suspect of containing injected malware (based on the mutex analysis we did in the previous post) into PEBrowse shows it contains 11 heap segments.


The reason I like PEBrowse is that it makes it very easy to quickly do a visual scan of the heaps for almost anything suspicious that catches your eye. Suspicious things tend to include URLs and IP addresses, names of other executables, and things like that. Much worse things can be found in the heap, as I have a feeling we’ll see in this example. Note: Like the previous post, I’m doing this analysis (for the first time) concurrently while writing this. I really don’t know what to expect at this point!
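The same kind of eyeball scan can be roughly approximated in code. The following is a minimal sketch, assuming a heap segment has already been saved to disk as a flat binary blob; the `find_suspicious_strings` helper and its regular expressions are illustrative assumptions, not features of PEBrowse or any other tool used in this article.

```python
import re

# Illustrative patterns only; tune them for the process being examined.
PATTERNS = {
    "url": re.compile(rb"https?://[\x21-\x7e]{4,}"),
    "ipv4": re.compile(rb"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])"),
    "executable": re.compile(rb"[\w\-]{1,64}\.(?:exe|dll|sys)", re.IGNORECASE),
}


def find_suspicious_strings(data: bytes) -> dict:
    """Return {category: sorted unique ASCII matches} for a raw memory buffer."""
    results = {}
    for name, pattern in PATTERNS.items():
        hits = {m.group(0).decode("ascii", "replace") for m in pattern.finditer(data)}
        results[name] = sorted(hits)
    return results


if __name__ == "__main__":
    # Tiny stand-in buffer; in practice, read the saved heap segment from disk.
    sample = b"\x00\x01junk\x00http://finddirt.org/cat/v3\x00\xffiexplore.exe,\x00"
    for category, values in find_suspicious_strings(sample).items():
        print(category, values)
```

This only catches single-byte ASCII strings. Windows processes also hold plenty of UTF-16 text, so a fuller version would need a second pass for wide strings, and a human still has to decide what is out of place for the process in question.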

As I mentioned in the previous post, we hit something very suspicious in the first heap segment I dumped. It started off relatively normal (gibberish), as you can see below:

However, casually scanning the contents of this heap brings us to the following:

The above screenshot is a perfect example of the types of things to consider suspicious when scanning these segments of memory – especially taking into consideration the legitimate purpose and job of the process you’re examining (which in this case, probably should not contain strings like that).

Scanning a little further, we finally encountered the following:

Dumping this section of memory and formatting it gives us the following (I’ll save the discussion of carving for later, when we find injected malware):

SIGNATURE_CHECK]
[kit_hash_begin]
100000
[kit_hash_end]
[cmd_dll_hash_begin]
100000
[cmd_dll_hash_end]
[servers_begin]

http://finddirt.org/cat/v3

http://clickbrake.org/cat/v3

[servers_end]
[modules_begin]
serf|100011
bbr|100007
ldr_facedll|100007
[modules_end]
[injects_begin]
bbr|iexplore.exe,explorer.exe,firefox.exe,safari.exe,opera.exe,chrome.exe,
serf|iexplore.exe,explorer.exe,ieuser.exe,
cmdsscore|services.exe,
[injects_end]
[block_by_crc_begin]
2283582
1377285

1394702
1430267
[block_by_crc_end]
[SCRIPT_SIGNATURE_CHECK_END]

At this point, we’d normally take a few field names like “injects_begin” or “block_by_crc_begin” and use Google to discover what type of malware this is. However, in this case, we found no results in Google for these hits (or many others I later did), indicating that we might have stumbled across a new family of malware.
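To pull field names and values out of a block like this for searching or comparison, a small parser helps. This is a rough sketch that assumes only what is visible in the dump above: sections delimited by `[name_begin]` and `[name_end]` markers, and `module|process,process,` lines inside the injects section. The function names are made up for illustration, and the meaning of each field is a guess at this stage.

```python
import re


def parse_sections(text: str) -> dict:
    """Collect the non-empty lines found between [name_begin] and [name_end]."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        begin = re.fullmatch(r"\[(\w+)_begin\]", line)
        end = re.fullmatch(r"\[(\w+)_end\]", line)
        if begin:
            current = begin.group(1)
            sections[current] = []
        elif end:
            current = None
        elif current is not None:
            sections[current].append(line)
    return sections


def parse_injects(lines: list) -> dict:
    """Split 'module|proc1,proc2,' lines into {module: [proc1, proc2]}."""
    targets = {}
    for line in lines:
        module, _, procs = line.partition("|")
        targets[module] = [p for p in procs.split(",") if p]
    return targets


if __name__ == "__main__":
    excerpt = """
[servers_begin]
http://finddirt.org/cat/v3
http://clickbrake.org/cat/v3
[servers_end]
[injects_begin]
serf|iexplore.exe,explorer.exe,ieuser.exe,
cmdsscore|services.exe,
[injects_end]
"""
    parsed = parse_sections(excerpt)
    print(parsed["servers"])
    print(parse_injects(parsed["injects"]))
```

The host names in the servers section are the kind of value worth matching against captured network traffic, and the injects section lists which processes each module appears to target.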

Our task at this point is to discover if this really is “new” (by “new” we mean not previously exposed or publicly reversed, although even if “new,” this has likely been in the wild for quite some time). However, before we get to that point we need to find the malware first!

Scanning other heap segments shows similar configuration type information, but there’s another interesting tactic to use when visually skimming memory dumps this way. That is, starting at the “top-most” segments shown in PEBrowse that aren’t resolved automatically. Those segments are shown in red below:

In the first segment (0x00010000) we have a very interesting find. At the start of this segment is another configuration file that seems malware-related:

Then, shortly after that configuration file ends, still within the same segment, is the following:

Ruh roh Shaggy, this can’t be good!

## Switching to WinDbg

Ok, so at this point we know the following:

1. Based on network traffic, we know the host was definitely compromised and exhibiting a range of C&C type of activity.
2. Scanning the system with a couple of manual and automated tools showed nothing obviously wrong with the system.
3. The process (services.exe) was running on the live system pointing to a very bizarre mutex that couldn’t be legitimized, usually an indication of process injection of some type.
4. We forced a dump of that process using Process Explorer and found configuration files that look malware related and actually match some of the “tells” seen in network activity (based on host names seen).
5. We found memory segments containing both malware configuration files and what appear to be executable files – a strong indication of injection.

Now we’ll use WinDbg to find likely bad PE files inside this memory dump. In this case, “bad” means “injected.” The trick is to find the injected ones.

Of course there will be quite a few legitimate PE files in this dump associated with the main process. They can be enumerated in WinDbg using the “lm” command, as shown here:

0:000> lm
start    end        module name
00230000 00271000   services   (deferred)
6ef90000 6ef9d000   wshbth     (deferred)
6efa0000 6efb2000   pnrpnsp    (deferred)
6fbd0000 6fbe0000   NapiNSP    (deferred)
6fbe0000 6fbe8000   winrnr     (deferred)
73450000 73457000   winnsi     (deferred)
73470000 7348c000   IPHLPAPI   (deferred)
73dd0000 73ddd000   wtsapi32   (deferred)
73e70000 73e80000   nlaapi     (deferred)
74050000 74071000   ntmarta    (deferred)
74730000 748ce000   comctl32   (deferred)
74c20000 74c25000   WSHTCPIP   (deferred)
74d70000 74da2000   winmm      (deferred)
74e60000 74e72000   mpr        (deferred)
74e80000 74eac000   ubpm       (deferred)
75060000 750a4000   dnsapi     (deferred)
75100000 7513c000   mswsock    (deferred)
75140000 75148000   credssp    (deferred)
75180000 75186000   wship6     (deferred)
752c0000 752db000   authz      (deferred)
75390000 753a9000   srvcli     (deferred)
75450000 7549e000   scesrv     (deferred)
754a0000 754a8000   secur32    (deferred)
755b0000 755bf000   scext      (deferred)
755d0000 755ea000   sspicli    (deferred)
755f0000 7563b000   apphelp    (deferred)
75640000 7564c000   CRYPTBASE   (deferred)
756b0000 756d9000   winsta     (deferred)
756e0000 756ee000   RpcRtRemote   (deferred)
756f0000 756fb000   profapi    (deferred)
75760000 7576c000   msasn1     (deferred)
75770000 757ba000   KERNELBASE   (deferred)
758b0000 759cc000   crypt32    (deferred)
759f0000 759fa000   lpk        (deferred)
75a00000 75a9d000   usp10      (deferred)
75aa0000 75aee000   gdi32      (deferred)
75b90000 75c84000   wininet    (deferred)
75c90000 75dc5000   urlmon     (deferred)
75dd0000 75e71000   rpcrt4     (deferred)
75e80000 75e83000   normaliz   (deferred)
75e90000 75fec000   ole32      (deferred)
76020000 76219000   iertutil   (deferred)
76220000 76265000   Wldap32    (deferred)
76270000 76276000   nsi        (deferred)
76280000 7630f000   oleaut32   (deferred)
76370000 7643c000   msctf      (deferred)
764d0000 764ef000   imm32      (deferred)
764f0000 76547000   shlwapi    (deferred)
76550000 76619000   user32     (deferred)
76620000 766cc000   msvcrt     (deferred)
766d0000 767a4000   kernel32   (deferred)
76950000 77599000   shell32    (deferred)
775a0000 776dc000   ntdll      (pdb symbols)
776f0000 77709000   sechost    (deferred)
77790000 777c5000   ws2_32     (deferred)
75160000 75176000   cryptsp.dll


Everything above is a PE file WinDbg knows about. We want to find the ones it doesn’t know about.

First, we simply search the dump for what appears to be the start of PE files. We do so using the “s” (search) command and passing it the first 4 bytes of a PE file as the search pattern. Keep in mind this pattern is the first 4 bytes of *most* PE files, but not all – and especially not all malware PE files, but it’s generally good. The full command is shown below:

0:000> s -d 0x0 L?0xffffffff 0x00905a4d
00018000  00905a4d 00000003 00000004 0000ffff  MZ..............
00020000  00905a4d 00000003 00000004 0000ffff  MZ..............
00230000  00905a4d 00000003 00000004 0000ffff  MZ..............

775a0000  00905a4d 00000003 00000004 0000ffff  MZ..............
776f0000  00905a4d 00000003 00000004 0000ffff  MZ..............
77790000  00905a4d 00000003 00000004 0000ffff  MZ..............


Let’s look at the first result as an example to dissect:

00018000  00905a4d 00000003 00000004 0000ffff  MZ..............


00018000 – The starting offset where the pattern was found.

00905a4d – The first four bytes, displayed as a little-endian DWORD (our search pattern, also given in that form). In memory the bytes are 4D 5A 90 00, which is why the ASCII column shows “MZ”.

00000003 00000004 0000ffff – A sampling of the 12 bytes following the initial pattern searched for.
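The byte-order point is easy to check outside the debugger. The sketch below packs the DWORD used in the search into its in-memory byte sequence and then scans a raw buffer for it. The `find_mz_offsets` helper and its `alignment` parameter are assumptions of this sketch (the alignment is meant to mimic a DWORD-granular search), and the offsets it returns are positions in the buffer, not virtual addresses. A crash dump file is not a flat image of memory, so this is better suited to a region that has already been written out on its own.

```python
import struct

# The DWORD WinDbg displays as 00905a4d is stored in memory as 4D 5A 90 00.
MZ_DWORD = 0x00905A4D
MZ_BYTES = struct.pack("<I", MZ_DWORD)  # b"MZ\x90\x00"


def find_mz_offsets(data: bytes, alignment: int = 4) -> list:
    """Return buffer offsets of the 4-byte pattern that sit on `alignment` boundaries."""
    offsets = []
    pos = data.find(MZ_BYTES)
    while pos != -1:
        if pos % alignment == 0:
            offsets.append(pos)
        pos = data.find(MZ_BYTES, pos + 1)
    return offsets


if __name__ == "__main__":
    print(MZ_BYTES)
    # One aligned hit at offset 8, one unaligned hit at offset 17.
    buffer = b"\x00" * 8 + MZ_BYTES + b"\x00" * 5 + MZ_BYTES
    print(find_mz_offsets(buffer))
    print(find_mz_offsets(buffer, alignment=1))
```

As noted above, this 4-byte pattern matches most PE files but not all of them, so a miss here does not prove a region is free of executables.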

So as an example, if we want to enumerate some more information about one of those PE files, we can use the command “!lmi” with the starting address of the PE image. The third PE found in the list above starts at 0x00230000, so the command and results for that one would look like this:

0:000> !lmi 00230000
Module: services
Image Name: services.exe
Machine Type: 332 (I386)
Time Stamp: 4a5bbf1b Mon Jul 13 19:11:23 2009
Size: 41000
CheckSum: 41426
Characteristics: 102
Debug Data Dirs: Type  Size     VA  Pointer
CODEVIEW    25, 36424,   35c24 RSDS - GUID: {FB91B983-177B-40F1-88A2-F1F2A5125036}
Age: 2, Pdb: services.pdb
CLSID     4, 36420,   35c20 [Data not mapped]
Symbol Type: DEFERRED - No error - symbol load deferred


But how about for that first one found in the list?

0:000> !lmi 00018000
18000 is not a valid address


Well I think we found our first carving example!
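Checking each hit by hand with “!lmi” works, but the comparison can also be done in bulk by testing every search hit against the ranges that “lm” reported. The sketch below uses a few of the addresses shown earlier; the `unknown_hits` helper is a hypothetical name, and it assumes the “end” column is exclusive (start plus size), which matches the services.exe entry above.

```python
def unknown_hits(hits, modules):
    """Return hit addresses that do not fall inside any known module range.

    `modules` is an iterable of (start, end, name) tuples, with `end` exclusive.
    """
    unknown = []
    for addr in hits:
        inside = any(start <= addr < end for start, end, _name in modules)
        if not inside:
            unknown.append(addr)
    return unknown


if __name__ == "__main__":
    # A subset of the "lm" output shown earlier.
    modules = [
        (0x00230000, 0x00271000, "services"),
        (0x775A0000, 0x776DC000, "ntdll"),
        (0x776F0000, 0x77709000, "sechost"),
        (0x77790000, 0x777C5000, "ws2_32"),
    ]
    # The search hits shown earlier.
    hits = [0x00018000, 0x00020000, 0x00230000, 0x775A0000, 0x776F0000, 0x77790000]
    print([hex(a) for a in unknown_hits(hits, modules)])  # ['0x18000', '0x20000']
```

Anything left over is a candidate for carving rather than proof of injection; it still needs to be examined.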

Ok… Here’s where things get a little thorny. :-)

So we know we need to carve a file from this dump that starts at offset 0x00018000, but how do we find the end?

PE files contain a field called SizeOfImage in the Optional Header that gives the size of the PE in memory. THIS APPLIES TO AN IN-MEMORY FILE ONLY! (The process of calculating its size on disk, in network traffic, or anywhere else is such a convoluted process that Kevin Douglas needs to explain it to me every other month or so.) ;-)

I’m sure there’s a better way to do what I’m getting ready to show, but… This is the process I tend to go through.

First we’re going to take a total shot in the dark by dumping the first 10,000 bytes of the file. Basically, we’re going to carve enough of the file to dissect it deep enough to get that actual SizeOfImage value, then go back and carve the actual correct size.
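If a GUI parser isn’t handy, the SizeOfImage field can be read straight out of that first partial carve. The following is a minimal sketch based on the standard PE header layout: the DWORD at offset 0x3C of the DOS header points to the “PE\0\0” signature, a 20-byte COFF file header follows, and SizeOfImage sits 56 bytes into the Optional Header for both PE32 and PE32+. The `size_of_image` function name is made up for this example.

```python
import struct


def size_of_image(data: bytes) -> int:
    """Read SizeOfImage from the Optional Header of a PE image held in `data`."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew: file offset of the PE signature, stored at 0x3C in the DOS header.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # Skip the 4-byte signature and the 20-byte COFF file header.
    optional_header = e_lfanew + 4 + 20
    (magic,) = struct.unpack_from("<H", data, optional_header)
    if magic not in (0x10B, 0x20B):  # PE32 or PE32+
        raise ValueError("unexpected Optional Header magic: %#x" % magic)
    (size,) = struct.unpack_from("<I", data, optional_header + 56)
    return size


if __name__ == "__main__":
    # Build a tiny synthetic header so the example runs without a real sample.
    buf = bytearray(0x200)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x80)
    buf[0x80:0x84] = b"PE\x00\x00"
    struct.pack_into("<H", buf, 0x80 + 24, 0x10B)
    struct.pack_into("<I", buf, 0x80 + 24 + 56, 0x1400)
    print(hex(size_of_image(bytes(buf))))  # 0x1400
```

Against a real carve, you would read the dumped file into memory and pass its bytes to `size_of_image`. The partial carve only needs to be long enough to cover the headers.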


We know the starting offset is 00018000. The ending offset for this first shot at carving will be:

10,000 decimal = 2710 hex; 2710 + 18000 = ending offset 1A710

So, the command to dump this file will be:

.writemem "C:\path\to\out.file" 18000 1A710


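And when issuing the command we see the following (WinDbg treats the ending address as inclusive, so 0x2711 bytes are written rather than 0x2710):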

0:000> .writemem "C:\path\dump1.dll" 18000 1A710
Writing 2711 bytes.....


Next we’ll open the file in CFF Explorer and let it parse out the headers and tell us the actual size of the PE:

CFF Explorer is telling us the actual PE size is 5 KB. That is *awfully* small for a PE – malware or otherwise (malware tends to be smaller than legitimate PE files, generally speaking). My initial thought is this is not an actual PE file, but just a false positive left in the slack space of this memory segment, but… Not only can CFF Explorer correctly parse the imports for this PE, but the import list looks like those we see in packed malware:

Because of that, we’ll play along and dump a “correct” copy of this PE. But I’m still suspicious of this file so let’s check out the second PE found that WinDbg doesn’t know about (at starting offset 0x20000):

Well, as we see above, I picked some random number as an ending offset, and just happened to dump the file perfectly, down to the exact byte value. Time to go buy a lottery ticket, since that’ll never happen again.

The bottom red box in the image above also shows that PE metadata could be correctly parsed from this file, indicating it is indeed a valid file. Not only that, but based on the metadata alone, this also looks very much like malware.

Now, to go back and correctly dump the first file:

5,120 decimal = 1400 hex; 1400 + 18000 - 1 = ending offset 193FF
0:000> .writemem "C:\Users\f5e79de65\Desktop\dump1.dll" 18000 193FF
Writing 1400 bytes...
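The offset arithmetic is simple but easy to get off by one, because the range given to the command includes both its start and end addresses. A small sketch with hypothetical helper names, using the numbers from the two carves in this article:

```python
def inclusive_end(start: int, size: int) -> int:
    """Last address of a `size`-byte region beginning at `start`."""
    return start + size - 1


def bytes_in_range(start: int, end: int) -> int:
    """Number of bytes covered by an inclusive start..end range."""
    return end - start + 1


if __name__ == "__main__":
    # Correct carve: 5,120 bytes (0x1400) starting at 0x18000.
    print(hex(inclusive_end(0x18000, 5120)))        # 0x193ff
    # First rough carve: the end offset was start + size, so one extra byte was written.
    print(hex(bytes_in_range(0x18000, 0x1A710)))    # 0x2711
```

WinDbg address ranges can also be written as a start address plus a length using the `L` form already seen in the search command, for example `18000 L1400`, which avoids the subtraction altogether.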

So at this point we have two files and a couple questions left. First, are these valid files that we carved – can we do analysis on them?

Loading them into a tool like IDA Pro shows they can be correctly and fully disassembled and parsed, so the answer to that question is YES – we carved valid PE files and we did it correctly:

Next, we need to find out if these files are known in the wild already.

If you have the luxury of being able to use VirusTotal, there’s no quicker way to answer that question. Some people tend to get up-in-arms over the use of VirusTotal, but we need to be realistic here. Consider the following:

1. This is a scenario where non-attribution is not an issue. Many government agencies and other organizations involved in certain types of cases should not and cannot use ANY public service for analysis (VirusTotal or otherwise), but this is not the case here.
2. As an alternative, opponents of VirusTotal suggest you should just search services like VirusTotal for the hash of the file to see if it has been scanned before. Candidly speaking, I think this is a silly alternative suggestion. Theoretically, because the same sample can be packed many different ways, the hash is likely to be different between campaigns for even the same sample. More importantly – the reality is that when you’re dealing with PE files that have been extracted from memory, the hash will almost always be different (read: the hash will be meaningless). When a PE is loaded into memory, sections are realigned and addresses are resolved at runtime for that particular system. Taking the same sample (same by MD5 hash), running it on three different systems, and extracting that PE from memory on those systems could give you three completely different hash sets.
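To see why a hash lookup is so fragile for memory-carved files, consider a toy illustration. It builds two byte strings that differ only in one embedded absolute address, the way the same code might after being loaded at two different base addresses. Everything here is a made-up stand-in: `fake_image` is not a real PE, and SHA-256 is used simply because it is always available in Python’s `hashlib`; MD5 behaves the same way.

```python
import hashlib


def fake_image(image_base: int) -> bytes:
    """Hypothetical stand-in for loaded code containing one absolute address."""
    # 0x68 is 'push imm32'; the operand is an address that depends on the load base.
    code = b"\x68" + (image_base + 0x1000).to_bytes(4, "little") + b"\xc3"
    return b"MZ\x90\x00" + code


def digest(data: bytes) -> str:
    """Hex SHA-256 digest of `data`."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    on_system_a = fake_image(0x10000000)
    on_system_b = fake_image(0x00018000)
    # Only four bytes differ, yet the digests have nothing in common.
    print(digest(on_system_a))
    print(digest(on_system_b))
```

A single resolved address is enough to change the digest completely, which is why a hash lookup on a memory-carved PE says very little either way.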

The topic of using services like VirusTotal is a religious debate, and depending on the situation my opinion on the topic changes, but in this case it will unquestionably save us a considerable amount of time and focus our efforts intelligently.

Well, we see two things above. First, it has a 35% detection rate, which is not bad. Secondly, most of the alerts (of the 15 that generated alerts) are generic alerts, so we’re not *really* sure what this threat *actually* is. The screenshot is truncated, but there were a couple of hits for TDSS, which wouldn’t surprise me.

So does this mean we really didn’t find a new family of malware?

Well, we still have a lot of evidence indicating we’re dealing with some type of malware that is not THAT well-known, based on the fact we cannot find any references to most of the configuration-related strings we examined earlier.

This is a perfect example of how VirusTotal can save you considerable time. At this point, I’d pencil in these extracted samples as something akin to TDSS and use my time more valuably by continuing to examine the process dump.

Doing so yields more interesting results. The next non-resolved PE file is found in a memory segment that also contains a reference to the suspicious mutex we found in “part one” of this series, as well as another configuration file in the “unknown” format. Extracting this executable gives us the following:

This file can be considered either an “unknown malware sample” or a legitimate file that’s not malware (which is why none of the AV engines alerted to it). However, when statically analyzing the PE file (using a tool we built for internal R&D), we see it contains several characteristics common to malicious executables:

In this case, considering the amount of evidence we have at this point, I think it’s more likely we just extracted a publicly “new” type of malware. However, we have to be fair and say while the list of evidence is fairly large supporting the notion this is relatively new, it’s all circumstantial evidence at this point!

Proving that the sample we just extracted is actually malicious, and showing that it’s something new, will be covered in a following part of this series, as this process will be quite involved and will draw heavily on both network traffic analysis and binary reversing and profiling.

In other words, there is much more to come in this series!

Posted with permission from the NetWitness Blog.