Demystifying dot NET reverse engineering - PART 2: Introducing Byte Patching
For part 1 of this series, please click here.
In the first part we covered the very basics of dot NET programs: how they are compiled (which we will revisit here in a little more depth) and how we can look inside them using Reflector. We saw how easy it is to bypass protections based on hardcoded serials or passwords. That was very basic, and real-world protections almost always require going deeper.
Reverse engineering in general, and not only for dot NET programs, is about more than recovering serials or passwords. It is the art of playing with bytes: changing them to alter functionality, to disable or enable features, and in some cases to add entirely new ones. This is rarely a simple task. Mastering assembly is a must, and so is finding the right place in the program and figuring out which bytes to change.
In this paper and the upcoming one, we will practice some byte changing (commonly called "patching") on different homemade targets. We will reuse the first file, "CrackMe#1-InfoSecInstitute-dotNET-Reversing", and add a second target, "ReverseMe#1-InfoSecInstitute-dotNET-Reversing".
Compiling dot NET
As seen in the first part, every dot NET program is written in a high level dot NET programming language (VB.NET, C#...). When compiled, this high level code is translated to a lower level one, Microsoft Intermediate Language (MSIL), which can be considered the lowest common denominator for dot NET. We can build a full application using nothing but MSIL. Although this is of little interest to a typical "dot NET developer", it is useful to us because it gives insight into how the Common Language Runtime (CLR) actually works and runs our high level code.
Just like Java and the Java Virtual Machine, any dot NET program is first compiled (if we may call it that) to IL (or MSIL) and executed in a runtime environment, the Common Language Runtime (CLR). At execution time the IL is compiled again into native instructions such as x86 or x86-64, depending on the processor in use. This is done by the Just In Time (JIT) compiler of the CLR.
To recapitulate, the CLR uses a JIT compiler to compile the IL (or MSIL) code, which is stored in a Portable Executable (our compiled dot NET high level code), into platform specific code, and then the native code is executed. This means that dot NET code is never interpreted; the purpose of IL and JIT is to keep dot NET code portable.
The figure below demonstrates this process:
The aim of this paper is to introduce you to some new IL instructions. Beyond the obvious curiosity factor, understanding IL and how to manipulate it opens the door to playing around with any dot NET program and, in our case, to finding the weaknesses in our program's security system.
Before going ahead, it is worth noting how the CLR executes IL code. For operations and data manipulation, the CLR does not handle memory directly; it uses a stack instead, an abstract data structure that works on a "last in, first out" basis. We can do two important things with the stack: push data and pop (pull) data. Pushing an item onto the stack moves any items already present further down. Popping an item removes the topmost one, and the remaining items move up toward the top. We can only handle the topmost element of the stack.
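To make the "last in, first out" behavior concrete, here is a minimal Python sketch of a stack. The class name `EvalStack` is invented for this illustration; it is a toy model and not how the CLR is actually implemented.

```python
class EvalStack:
    """Toy last-in-first-out stack (illustration only, not the CLR's implementation)."""

    def __init__(self):
        self._items = []

    def push(self, value):
        # The new item becomes the topmost element.
        self._items.append(value)

    def pop(self):
        # Remove and return the topmost element.
        return self._items.pop()

    def peek(self):
        # Look at the topmost element without removing it.
        return self._items[-1]

    def __len__(self):
        return len(self._items)


stack = EvalStack()
stack.push("first")
stack.push("second")
top = stack.pop()         # the item pushed last comes out first
remaining = stack.peek()  # the earlier item is now on top
```

After these calls `top` holds "second" and `remaining` holds "first", which is exactly the ordering IL instructions rely on when they consume their operands.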
Well, let's return to our "CrackMe#1-InfoSecInstitute-dotNET-Reversing". After typing in a wrong password, the Crack Me shows us this message box:
In the first part of "Demystifying dot NET reverse engineering" we recovered the hard coded password. Now we will see how to force this Crack Me into accepting any wrong password by changing only a few bytes.
Let's go back to our Crack Me #1 opened in Reflector. We have the original source code that checks the typed password:
And by switching to IL code view we get this:
L_000b: ldstr "p@55w0rd!"
L_0017: bne.un.s L_002d
L_0019: ldstr "Congratulations !"
L_001e: ldc.i4.s 0x40
L_0020: ldstr "Correct!"
L_0025: call valuetype [Microsoft.VisualBasic]Microsoft.VisualBasic.MsgBoxResult [Microsoft.VisualBasic]Microsoft.VisualBasic.Interaction::MsgBox(object, valuetype [Microsoft.VisualBasic]Microsoft.VisualBasic.MsgBoxStyle, object)
L_002b: br.s L_003f
L_002d: ldstr "Invalid password"
L_0032: ldc.i4.s 0x10
L_0034: ldstr "Error!"
L_0039: call valuetype [Microsoft.VisualBasic]Microsoft.VisualBasic.MsgBoxResult [Microsoft.VisualBasic]Microsoft.VisualBasic.Interaction::MsgBox(object, valuetype [Microsoft.VisualBasic]Microsoft.VisualBasic.MsgBoxStyle, object)
This is the direct representation of the internal Intermediate Language, and this is the level at which we will have to work to make changes. As said above, dot NET is essentially a stack machine, and we will need a reference to understand what this IL code means. Listings of all IL assembly instructions and their uses are available; I'll try to present and explain the ones most relevant to reverse engineering.
IL instructions start right after the ".maxstack #" line. The first line is L_0000: ldarg.0, which loads argument 0 onto the stack; its byte code is "02". IL also has a nop instruction, which may be compared to the NOP instruction in traditional assembly code, but its byte code is "00", not "90" as in x86 assembly. If we open any program in a hexadecimal editor, we will find the series of byte codes representing every IL instruction of our program, from the first to the last. These are the bytes we can change to make the program do things it is not "supposed" to do, for example to invert tests or jumps or to alter any other part of the code.
Using a hexadecimal editor is in some ways the traditional "dirty" way to replace bytes; we will see how to do it this way and later how other, "cleaner" ways work. Since the IL view does not give us file offsets, we locate the bytes to change by searching the file for the byte sequence that encodes the instructions around them.
Every IL instruction has its own byte representation. Here is a non-exhaustive list of the most important IL instructions, their functions and their byte representations. You are not supposed to memorize them; use the list as a kind of reference:
Now that we have a reasonably good IL instruction reference, we can get back to Reflector and our Crack Me and start thinking about how to get rid of its protection:
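For scripting purposes, the same kind of reference can be kept as a small lookup table. The Python sketch below covers only the handful of single-byte instructions that appear in this article, with opcode values and operand sizes as defined in ECMA-335 (Partition III). The names `IL_OPCODES`, `opcode_hex` and `instruction_length` are made up for this example.

```python
# mnemonic -> (opcode byte, operand size in bytes), per ECMA-335 Partition III.
# Only a few single-byte opcodes are listed; this is not a complete table.
IL_OPCODES = {
    "nop":       (0x00, 0),
    "ldarg.0":   (0x02, 0),
    "ldc.i4.0":  (0x16, 0),
    "ldc.i4.s":  (0x1F, 1),  # followed by an 8-bit integer
    "pop":       (0x26, 0),
    "call":      (0x28, 4),  # followed by a 4-byte metadata token
    "ret":       (0x2A, 0),
    "br.s":      (0x2B, 1),  # followed by an 8-bit signed branch offset
    "brfalse.s": (0x2C, 1),
    "brtrue.s":  (0x2D, 1),
    "beq.s":     (0x2E, 1),
    "bne.un.s":  (0x33, 1),
    "ldstr":     (0x72, 4),  # followed by a 4-byte string token
}


def opcode_hex(mnemonic):
    """Return the opcode as two uppercase hex digits, as seen in a hex editor."""
    return format(IL_OPCODES[mnemonic][0], "02X")


def instruction_length(mnemonic):
    """Total encoded size of the instruction: one opcode byte plus its operand."""
    return 1 + IL_OPCODES[mnemonic][1]
```

The operand sizes explain the labels in the IL listing: ldstr occupies five bytes, so the ldstr at L_0019 is followed by an instruction at L_001e, and the two-byte ldc.i4.s at L_001e is followed by one at L_0020.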
We can see from the picture above that the portion of code:
Is just translated to this:
L_0017: bne.un.s L_002d
L_0019: ldstr "Congratulations !"
L_001e: ldc.i4.s 0x40
L_0020: ldstr "Correct!"
L_0025: call valuetype [Microsoft.VisualBasic]Microsoft.VisualBasic.MsgBoxResult [Microsoft.VisualBasic]Microsoft.VisualBasic.Interaction::MsgBox(object, valuetype [Microsoft.VisualBasic]Microsoft.VisualBasic.MsgBoxStyle, object)
Using the IL instructions reference we get:
- call: Calls the method indicated by the passed method descriptor, which in this case is the string comparison method.
- ldc.i4.0: Pushes the integer value of 0 onto the evaluation stack as an int32.
- bne.un.s: Transfers control to a target instruction (short form) when two unsigned integer values or unordered float values are not equal.
- ldstr: Pushes a new object reference to a string literal stored in the metadata.
At this point only bne.un.s seems interesting and worth more explanation. In Intermediate Language, If statements are translated to branch instructions. bne stands for "branch if not equal" (BranchNotEqual): if the two values on top of the stack are not equal, execution jumps to the branch target, here line L_002d, as you can see:
L_0017: bne.un.s L_002d
L_0019: ldstr "Congratulations !"
This starts making sense, and lets us think about how we can avoid typing in a valid password. Instead of showing "Congratulations!" if the password is correct, we can reverse the test and force the program into showing us this message if the password is not correct. The instruction that does this is beq.s, which "transfers control to a target instruction (short form) if two values are equal" (refer to the reference list above).
Technically we have two problems. First, we need to find the byte representation of the IL instruction we want to change. Second, we do not have the actual file offset of the instruction, so we cannot go there directly in a hexadecimal editor.
Referring to the list above, we see that bne.un.s = 0x33 and beq.s = 0x2E. To find the location in the file where we have to make the change, we translate a few neighboring instructions into bytes to build a search string long enough to find what we are looking for.
Still referring to the list above, we get:
ldc.i4.0 = 0x16; bne.un.s L_002d = 0x33 followed by a one-byte value 0x?? encoding the jump to L_002d; and ldstr = 0x72.
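The effect of swapping the two instructions can be sketched in a few lines of Python. This is only a model of the branch decision, using the offsets from the IL listing. It assumes that the comparison call leaves 0 on the stack when the password matches and a non-zero value otherwise; that return convention is an assumption of this sketch, and `next_instruction` is a hypothetical helper name.

```python
# Offsets taken from the IL listing above.
CONGRATS = 0x19  # fall-through path: ldstr "Congratulations !"
INVALID = 0x2D   # branch target:     ldstr "Invalid password"


def next_instruction(opcode, value1, value2, fallthrough=CONGRATS, target=INVALID):
    """Return the offset that executes after a short conditional branch."""
    if opcode == "bne.un.s":
        taken = value1 != value2  # branch if not equal
    elif opcode == "beq.s":
        taken = value1 == value2  # branch if equal
    else:
        raise ValueError(f"unsupported opcode: {opcode}")
    return target if taken else fallthrough


# ASSUMPTION: 0 means "strings match", anything else means "mismatch".
MATCH, MISMATCH = 0, 1

original_good = next_instruction("bne.un.s", MATCH, 0)     # original, right password
original_bad = next_instruction("bne.un.s", MISMATCH, 0)   # original, wrong password
patched_bad = next_instruction("beq.s", MISMATCH, 0)       # patched, wrong password
patched_good = next_instruction("beq.s", MATCH, 0)         # patched, right password
```

Under that assumption, a wrong password now falls through to the "Congratulations !" path. A side effect of inverting the test rather than removing it is that the real password would be sent to the "Invalid password" path.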
So our search string will look like 1633??72. The "??" is a wildcard standing for any single byte; whether and how wildcards are supported in a search depends on which hexadecimal editor you use. I'll use WinHex, but you are free to use whatever hexadecimal editor you want:
Here we have to edit 16331472 to 162E1472. Always make a backup before making any change, just in case.
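If your editor has no wildcard search, the same lookup can be approximated with a short script. The following Python sketch turns a hex pattern with "??" wildcards into a byte-level regular expression; `find_pattern` is a name invented for this example, and the sample bytes are fabricated for the demonstration rather than taken from the real Crack Me.

```python
import re


def find_pattern(data, pattern):
    """Return the offsets in `data` where `pattern` matches.

    `pattern` is a hex string such as "1633??72"; "??" matches any one byte.
    """
    pattern = pattern.replace(" ", "")
    pairs = [pattern[i:i + 2] for i in range(0, len(pattern), 2)]
    regex = b"".join(
        b"." if pair == "??" else re.escape(bytes.fromhex(pair))
        for pair in pairs
    )
    # DOTALL so that "." also matches the byte 0x0A (newline).
    return [m.start() for m in re.finditer(regex, data, re.DOTALL)]


# Made-up sample bytes containing the sequence 16 33 14 72 at offset 6.
sample = bytes.fromhex("00 28 11 00 00 0A 16 33 14 72 01 00 00 70")
offsets = find_pattern(sample, "1633??72")
```

On a real file you would pass the contents read in binary mode and check that the pattern matches exactly once before changing anything; several matches mean the search string should be made longer.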
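The same edit can be scripted. The sketch below first shows where the 0x14 comes from: the operand of a short branch is the distance from the instruction that follows the branch to its target, so 0x2d - 0x19 = 0x14. It then replaces the four bytes and writes a backup copy first. All function names are hypothetical, and the file name in the final comment is only a placeholder.

```python
import shutil
from pathlib import Path


def short_branch_operand(branch_offset, target_offset):
    """Operand byte of a 2-byte short branch: distance from the next instruction."""
    delta = target_offset - (branch_offset + 2)
    if not -128 <= delta <= 127:
        raise ValueError("target out of range for a short branch")
    return delta & 0xFF  # stored as a signed 8-bit value


ORIGINAL = bytes.fromhex("16331472")  # ldc.i4.0, bne.un.s +0x14, ldstr
PATCHED = bytes.fromhex("162E1472")   # ldc.i4.0, beq.s    +0x14, ldstr


def patch_bytes(data):
    """Return (patched data, offset of the changed byte); refuse ambiguous matches."""
    if data.count(ORIGINAL) != 1:
        raise ValueError("expected exactly one occurrence of the original bytes")
    start = data.index(ORIGINAL)
    return data.replace(ORIGINAL, PATCHED), start + 1


def patch_file(path):
    """Copy `path` to `<name>.bak`, then patch the file in place."""
    path = Path(path)
    patched, offset = patch_bytes(path.read_bytes())
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_bytes(patched)
    return offset


# Example call (placeholder file name, not executed here):
# patch_file("CrackMe1.exe")
```

Only one byte actually changes, 0x33 to 0x2E; the operand 0x14 stays the same because both instructions are two bytes long and jump to the same target.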
Testing our change
Let's now run our modified / patched version of our Crack Me and see if what we did is right:
Seems all right, but let's just see what our byte changing actually looks like inside Reflector:
And by switching to IL view mode we can see:
This is still a basic dot NET byte patching process, but it is necessary to start from the basics before dealing with more complicated things. In the next part we will see some more advanced byte patching techniques and how to apply them.
- Link to download Crack Me#1 http://www.mediafire.com/?yjoh2f6bv4d6n4i
- Google for Reflector
- Google for WinHex