This is the third article in a series on the topic of self-modifying code.

Part 1 is here: Writing Self-Modifying Code Part 1: C Hello World with RWX and In-Line Assembly

Part 2 is here: Writing Self-Modifying Code Part 2: Using Extended Assembly-Practice

You can download the code mentioned in this article, and the previous articles, for free here: http://www.github.com/aking1012/infosecinstitute-tutorial.

There are plenty of ways that self-modifying code gets implemented. Common methods include OEP (original entry point) adjustment, EXE loaders that handle PE loading requirements and JMP to OEP, compiling as a resource, and in-line decoding/encoding. We can examine each of these in more detail.

OEP adjustment is also known as code-caving. This is where a section is extended and the entry point is redirected to the new area. A small run-once decoding section is then applied to memory. Execution is then directed back to the expected start of execution. This is relatively easy to implement and requires use of some PE editing tools or a decent amount of technical prowess. The redirection is accomplished with a hard-coded jump. It can be relative or absolute depending on whether the binary is allowed to be rebased.
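The relative form of that hard-coded jump comes down to a small piece of arithmetic. The following is a minimal sketch, assuming a 32-bit x86 near jump (opcode 0xE9 followed by a signed 32-bit displacement measured from the end of the instruction). The helper name encode_jmp_rel32 is hypothetical and is not taken from the repository.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode a 5-byte x86 near jump (0xE9 + rel32)
 * located at address `from` that transfers control to address `to`.
 * The displacement is relative to the byte after the instruction,
 * hence the "+ 5". Addresses are treated as 32-bit values. */
void encode_jmp_rel32(uint8_t out[5], uint32_t from, uint32_t to)
{
    assert(out != NULL);
    uint32_t rel = to - (from + 5u);   /* unsigned wrap handles backward jumps */
    out[0] = 0xE9;                     /* JMP rel32 opcode */
    out[1] = (uint8_t)(rel & 0xFF);    /* displacement, little-endian */
    out[2] = (uint8_t)((rel >> 8) & 0xFF);
    out[3] = (uint8_t)((rel >> 16) & 0xFF);
    out[4] = (uint8_t)((rel >> 24) & 0xFF);
}
```

Because the displacement depends only on the distance between the two addresses, a relative jump keeps working if the image is rebased, whereas an absolute address would need a relocation entry or a fixed image base.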

EXE loaders add a level of complexity. They obfuscate imports and typically add sections. One OSS packer that uses this methodology is UPX. It compresses the binary (with its UCL/NRV or LZMA compressors rather than zlib) and handles the PE loading that the operating system normally would. This makes some identification more difficult. The first part is getting the binary into memory. The second part is then parsing the import table and handling loading of libraries. The third part is finding the expected OEP from the PE header and steering execution there.
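The third step, finding the entry point in the PE header, can be sketched in portable C. This is only an illustration under stated assumptions: it reads the documented header offsets from a raw byte buffer instead of using the windows.h structures, and the function names read_le32 and pe_entry_point_rva are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a byte buffer. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Store AddressOfEntryPoint (an RVA) in *oep_rva and return 0 on success,
 * or return -1 if the buffer does not look like a PE image. */
int pe_entry_point_rva(const uint8_t *img, size_t len, uint32_t *oep_rva)
{
    assert(img != NULL && oep_rva != NULL);
    if (len < 0x40 || img[0] != 'M' || img[1] != 'Z')
        return -1;                               /* no DOS header */
    uint32_t e_lfanew = read_le32(img + 0x3C);   /* file offset of "PE\0\0" */
    /* Need: signature (4) + COFF header (20) + 16 bytes of the optional
     * header before AddressOfEntryPoint + the 4-byte field itself. */
    if (e_lfanew > len || len - e_lfanew < 4 + 20 + 16 + 4)
        return -1;
    const uint8_t *nt = img + e_lfanew;
    if (nt[0] != 'P' || nt[1] != 'E' || nt[2] != 0 || nt[3] != 0)
        return -1;                               /* bad NT signature */
    *oep_rva = read_le32(nt + 4 + 20 + 16);
    return 0;
}
```

The value returned is a relative virtual address, so a loader would add it to the base address at which it mapped the image before transferring execution.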

A third possibility, if you use DLLs as payloads, is to encrypt the whole DLL. You then compile it as a resource, get the address and length of the resource, map it into memory, decrypt it, and let reflective injection take over. This seems like a pretty simple option as well.

The last option is to place encoding and decoding sections internal to functions. This is a lot more complicated to set up. It also requires access to source or detours-like functionality and some carefully designed code. One of the bonuses, however, is that code constantly changes in memory…if you’re doing C wrapping and not just simple shell-code obfuscation that is. Something I’m working on is using a detours-like functionality to re-write functions during execution. It’s a lot harder to control, which is probably why it isn’t used as much. Let’s get started in examining some samples.

OEP adjustment was covered in a flash-talk titled “I Piss on Your AV” with Mati Aharoni of Offensive Security. It was a relatively straightforward example. UPX tutorials are all over the place. I was turned on to the third possibility by a friend of mine from Europe. I haven’t seen a lot of tutorials floating about on it, so let’s get started with that.

For simplicity’s sake we’re going to use a hello world DLL. Doing it this way, we can just drop in a new DLL whenever we like. Once the code is downloaded, let’s get our basic rc file set up. We use .rc files with mingw32 for several things: including icons, embedding DLLs, adding version information, handling UAC requests or lack thereof, and carrying other miscellaneous data. If we use an EXE loader asset we could also use it to make a custom binder for additional antivirus evasion and end user convincing. The scaffold is located in the github repository. I try to intuitively name sub-folders.

Bearer DLL Compilation As Built-ins With Windres

First let’s just touch a fake DLL. Taking a look at the example rc file, our first line in make all compiles the resource as a .o file. This is because mingw32 doesn’t like res files. Windres can, however, output them if you’re part of a project team that uses MSVC and you want to be stubborn and keep using mingw32. Then we compile our partc.exe linking in the partcrc.o file. If you want to look at some sample code for that it’s in the ‘dummyresource’ folder.

Now we need our loader scaffold. It will be astoundingly similar to the original with a couple of extra calls. First we need to get the address and length of the resource. We do this with FindResource, LoadResource, and SizeofResource. We don’t want to drop a bunch of files, so we’ll just map it directly into memory or operate on this memory section. Since we’re already in memory and not pointing to the disk, we’ll just operate on memory. We’re going to borrow a lot of code in here and tie it all together. So, our first concern is getting a pointer to the relevant memory section. We could do it a lot like this:

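#include <windows.h>

/* Locate an embedded resource and obtain its address and length. */
int MapRes(HMODULE hinst, unsigned int resid){
    /* The rc file shown later declares the asset as RCDATA,
       so it is looked up here with the RT_RCDATA type. */
    HRSRC hres = FindResource(hinst, MAKEINTRESOURCE(resid), RT_RCDATA);
    if (hres == NULL) return -1;
    HGLOBAL hfileres = LoadResource(hinst, hres);
    if (hfileres == NULL) return -1;
    DWORD dwSize = SizeofResource(hinst, hres);
    void *pres = LockResource(hfileres);   /* pointer to the resource bytes */
    (void)dwSize; (void)pres;              /* consumed by the loader later on */
    return 0;
}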

Check to make sure that everything is building and running at this point and there aren’t any typos.

Now we need to make use of some reflective injection code. There are several reflective injectors on github. Let’s borrow one and get started. I’m not a fan of re-inventing the wheel unless you’re taking it a step further. I’ll be using some copyleft code from Joachim Bauch since he goes a long way to making it a single file instead of breaking it up all over the place and provides clear examples. It’s WAY too big to drop into the tutorial so if you’re interested take a look here: https://github.com/fancycode/MemoryModule/blob/master/MemoryModule.c

Now that we have that dropped in and tweaked it a bit, we need to check that a DLL actually loads. I used the chunk of code from the loader example, ‘mallocs’ and a ‘memcpy’, so if we wanted to do some decryption here it would be pretty simple to shim it in. Let’s compile it as a resource with windres and our rc file and give it a test run.
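The malloc-and-memcpy step is where a decode hook would slot in. The sketch below is an assumption about how that could look and is not the repository code: copy_and_decode, decode_fn, and the placeholder xor_5a transform are all hypothetical names. The copy matters because the pointer obtained for a resource refers to read-only memory, so any decoding has to happen on a writable buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical hook type: transforms a buffer in place. */
typedef void (*decode_fn)(uint8_t *buf, size_t len);

/* Copy `len` bytes from a read-only source (such as a locked resource)
 * into a fresh heap buffer, then optionally decode the copy in place.
 * Returns NULL on allocation failure; the caller frees the result. */
uint8_t *copy_and_decode(const void *src, size_t len, decode_fn decode)
{
    assert(src != NULL);
    uint8_t *buf = malloc(len ? len : 1);   /* avoid a zero-size request */
    if (buf == NULL)
        return NULL;
    memcpy(buf, src, len);
    if (decode != NULL)
        decode(buf, len);                   /* decryption would be shimmed in here */
    return buf;
}

/* Placeholder transform for illustration only: XOR every byte with 0x5A. */
void xor_5a(uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] ^= 0x5A;
}
```

Passing NULL as the hook gives the plain copy used for the first test run, and swapping in a real decryption routine later does not change the caller.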

Taking a look at the following lines we should note a couple things. IDR means whatever kind of resource we want. 0 isn’t used. 1-7 are reserved. For me it’s easier to start tracking at 10 rather than 8. GROKIT is arbitrary. Modifying RCDATA seemed to break some things.

#define IDR_GROKIT 10
IDR_GROKIT RCDATA "partc.dll"

Okay, test-runs. This is a pretty simple approach if you can coax whatever you’re using into a DLL format. An interesting code project might be placing UPX-like EXE wrapping functionality into a DLL bearer for a new binder. It’s just a thought. Anyway check the ‘realasset’ folder on github.

Internal to a Function Encoding and Decoding

Last, encoding internal to the function is more complicated. For example, the number of arguments changes our saved EIP offsets. It does offer some interesting benefits for making in-memory analysis horribly painful and memory snapshots a lot less effective. The code on github has an example file we’re going to use. It’s not a malicious payload. It is simply some hello world code that returns values to show that we’re going to leave things exactly as we found them. A polymorphic stub generation engine is definitely useful, but it’s a little out of scope, not to mention it would be in violation of a couple of NDAs.

Remember how we did VirtualProtect in the last tutorial? If not, go back and examine that code again. It all stacks. Now we need to get our section decoded. Let’s do a simple walking XOR and length at first. We’re going to use in-line assembly and get the relevant EIP values and move them across to a function where we do our walking XOR. Making it a function instead of in-line inside of every function makes it a little easier to reverse since you’re isolating the decryption/decoding for them. Most of the work is done in ASM even though I separated it into a function for clarity. It should be mostly cut and paste-able. The code would look something like this:

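int XORIT(int start, int len){
    /* Adjust the saved EIP values to bracket the region to encode. */
    unsigned int daeip = start + 0x22;
    unsigned int end = start + len - 0xB;
    while (daeip <= end){
        /* XOR the 32-bit word at daeip in place (AT&T syntax). */
        asm volatile(
            "XORL $0x11111111, (%[ASMEIP]);"
            : /* no outputs */
            : [ASMEIP] "r" (daeip)
            : "eax", "memory"   /* memory is modified behind the compiler's back */
            );
        daeip += 4;
    }
    return 0;
}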

Okay that seems to work. Since we’re not searching for markers, we just use the same calls we used before. If we were using AES this would be a call to encrypt instead of decrypt. Another approach is using a similar save EIP method and going for an AES approach. I separated the XOR loop so readers could try this if they wanted to do so. We would need EIP start, EIP end, and a key. An additional complexity would be looping through the modified data and putting it back in place since cryptography libraries that I looked at took a different buffer in than they put out. The need for arithmetic on the saved ‘EIP’ values has to do with there being arguments on the stack. We could overcome this by using egg-hunter like functionality if we wanted. The issue with egg-hunter like functionality is that it would make the AES exercise a little more difficult.
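The reason the same call works for both encoding and decoding is that XOR with a fixed key is its own inverse. A minimal portable sketch of that property follows. It works on an ordinary data buffer and not on code pages, and the function name xor_words and the handling of trailing bytes are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* XOR each complete 32-bit word of buf with key.
 * Trailing bytes (len % 4) are left untouched. */
void xor_words(uint8_t *buf, size_t len, uint32_t key)
{
    assert(buf != NULL || len == 0);
    for (size_t off = 0; off + 4 <= len; off += 4) {
        uint32_t word;
        memcpy(&word, buf + off, sizeof word);   /* avoids unaligned access */
        word ^= key;
        memcpy(buf + off, &word, sizeof word);
    }
}
```

Calling xor_words twice with the same key restores the original bytes, which is why no separate decode routine is needed. A block cipher such as AES does not have this property, so it needs distinct encrypt and decrypt calls.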

Notes On AV Evasion For Metasploit Payloads

It seems like there is a lot of discussion on using msfpayload and msfencode for evasion. This is a clarification of what we hope to accomplish with the article series and a couple of talks. If all we’re doing is obfuscating Metasploit payloads, msfvenom is a good option until AV starts flagging on built-in encoders…as in today, yesterday, last week. This article series and talks at conferences were targeted at getting ahead of the game and making evasion more robust for both Metasploit payloads and other software that might get detected. There are some issues with targeting specific products and specific technologies that impede specific payloads. These are simply overcome once you adopt a methodology similar to what I express here. The bonus with msfvenom is that it only does one framework load, so it is significantly faster than using msfpayload and msfencode.

Here’s discussion of msfvenom over on the rapid7 site:

https://community.rapid7.com/community/metasploit/blog/2011/05/24/introducing-msfvenom

Essentially, our main objective in the articles of this 3-piece series was a more technical and hands-on discussion. I’m just putting out there why there is additional complexity in this particular approach that often causes hesitation. I highly recommend checking the framework directories that aren’t imported after updates and switching to the rapid7 github repositories instead of the old svn update functionality. If you want a lot of empirical data on detection versus Metasploit and built-ins check out g0tmi1k’s article here: http://g0tmi1k.blogspot.com/2011/10/analysis-encoding-files.html. I think it sheds some light on why I’m approaching this in such a complex way. You might be amazed at how hard it is to find source and compilation steps for known detected malware. It makes evasion talks incredibly hard to give in a realistic fashion. Hence, my reliance on Metasploit payloads which should surely be detected… Just notes for those who might be critical of this approach.

Final Notes

A couple of final notes. If you’re worried about UAC, that’s specified in the same rc file as our resources were. You can use the parent’s privileges, ask for administrator, or let the OS guess. SET got around this by giving executables highly random names at one point. Evidently the Windows heuristic for that is “probably shouldn’t run as administrator”, so the executable launches as non-administrator.

I combined a few different methods in this last part of the tutorial. We used simple self-modifying code at the function level, embedding additional resources inside a single executable (encrypting or obfuscating those resources would be relatively simple given the layout of the code) and reflective DLL injection. This should serve as a broad range introduction to antivirus evasion, including rolling your own payloads using several different hands on methods.

If we wanted to automate encoding of the function during our build process, we would need an additional helper: an extra function if we don’t export our encoded function, or a separate executable if we do. That helper would contain all function names, load the DLL, get the function base address, and call the after payload section. We’ll do it with binary patching in a debugger since the egg-hunter like code isn’t in this tutorial.

We borrowed the reflective injection code for this tutorial from https://github.com/fancycode/MemoryModule/. Reflective injection makes it hard to link against DLLs unless we do the fix-ups to the PEB. So we either use monolithic code sections, statically link, or we have to drop files. One strength of encoding at the function level is a constantly changing memory signature. This means that even if in-memory scanning were implemented and reflective injection were detected, it would be incredibly difficult to craft a definition that would consistently catch this type of malicious code.

References for borrowed code

Reflective injection – https://github.com/fancycode/MemoryModule/