Demystifying Dot NET reverse engineering, part 1: Big introduction
This, and all upcoming parts, are made with a strict and pure educational purpose just to gain insights into dot NET programs. What you’re going to do with this and all upcoming parts is your own responsibility. I will not be held responsible for your eventual action and use of this.
All techniques used in this and all upcoming parts have been used only to demonstrate theories and methods described. The scope of this paper and all upcoming parts as well as any other paper written by me (Soufiane Tahiri) is of sharing knowledge and improving reverse engineering techniques.
And please note that disassembling and / or reversing software is prohibited by almost all international laws, if you like software then please BUY IT.
This will be a kind of “saga” of papers that will talk essentially talk about dot NET oriented reverse engineering. We are already on the stable version 4.5 (4.5.50709) / 15 August 2012 of Microsoft .NET Frameworks for Visual Studio 2012 and distributed with Windows 8, Windows Server 2012, but we are still not seeing enough papers about reversing applications developed using dot NET technology.
I’ll try to fill this lack of papers, and this first article is supposed to be a part of an upcoming series that would explain some basics and clarify dot NET architecture to the extent of making a few concepts clearer for reverse engineers.
Before starting, I strongly recommend you take a few hours to teach yourself at least one of the dot NET languages, and I recommend either Visual Basic .NET or C#. It may seem to some that reversing dot NET programs is way easier then reversing “traditional” programs, which is wrong in my point of view.
The concept of dot NET can be easily compared to the concept of JAVA and Java Virtual Machine, at least when talking about compilation. Unlike most traditional programming languages like C/C++, applications developed using dot NET frameworks are compiled to a Common Intermediate Language (CIL or Microsoft Common Intermediate Language MSIL) – which can be compared to bytecode when talking about Java programs – instead of being compiled directly the native machine executable code, the Dot Net Common Language Runtime (CLR) will translate the CIL to the machine code at runtime.
This will definitely increase execution speed, but has some advantages since every dot NET program will keep all classes’ names, functions’ names variables and routines’ names in the compiled program, and this, from a programmer’s point of view, is such a great thing since we can make different parts of a program using different programming languages available and supported by frameworks.
Figure 1. Visual overview of the Common Language Infrastructure (CLI) / Wikipedia
What does this mean to a reverse engineer?
Basically every compiled dot NET application is not more than its Common Intermediate Language representation, which still has all the pre-coded identifiers just the way they were typed by the programmer.
Technically, knowing this Common Intermediate Language will simply lead to identifying high level language instructions and structure, which means that from a compiled dot NET program we can reconstitute back the original dot NET source code, with even the possibility of choosing to which dot NET programming language you want this translation to be made and this is a pretty annoying thing!
When talking about dot NET applications, we talk about “reflection” rather than “decompilation”, this is a technique which lets us discover class information or assembly at runtime, this way we can get all properties, methods, functions, etc. with all parameters and arguments, we can also get all interfaces, structures, etc.
Nowadays there are plenty of tools that can “reflect” the source code of a dot NET compiled executable; a good and really widely used one is “Reflector” with which you can browse classes, decompile and analyze dot NET programs and components. It allows browsing and searching CIL instructions, resources and XML documentation stored in a dot NET assembly. But this is not the only tool we will need when reversing dot NET applications. I’ll try to introduce each one every time we are in need of it.
What will you learn from this first paper?
This first essay will show you how to deal with Reflector to reverse a very simple practice oriented Crack-ME I did the very basic way, and it does not pretend to explain real software protection (that will come via the next articles).
Our Crack-ME is a simple one form dot NET application that asks us for a password. I made it to show you some very basics about dot NET reverse engineering, usually we start by looking at the target and see its behavior, this way we can determinate what should we look for!
This displays a nasty message box when filling in a bad password:
Let us now see why this piece of compiled code shows us “Invalid password”. Open up Reflector, at this point we can configure Reflector via the language’s drop down box in the main toolbar, and select whatever language you may be familiar with. I’ll choose Visual Basic, but the decision is up to you of course.
Figure 2. Choosing the main language
Load this Crack-ME up into it (File > Open menu) and look for anything that would be interest us.
Technically, the Crack-ME is analyzed and placed in a tree structure. We will develop nodes that interest us:
Figure 3. Our Crack-ME is loaded up
You can expand the target by clicking the “+” sign:
Keep on developing the tree and see what is inside of this Crack-ME:
Now we can see that our Crack-ME is composed of References, Code and Resources.
- Code: this part contains the interesting things, and everything we will need at this point is inside of InfoSecInstitute_dotNET_Reversing (which is a Namespace).
- References: is similar to “imports”, “includes” used in other PE files.
- Resources: for now this is irrelevant to us, but it is similar to ones in other Windows programs.
By expanding the code node we will see the following tree:
Reflector detects the only form our Crack-ME has called Form1, with all variables, procedures, functions and Graphical User Interface elements, and as explained above it recognized original names, which makes things easier for us and lets us guess what everything is supposed to do. For example, the function btn_Chk_Click(Object, EventArgs) that seems to be triggered when the button “btn_Chk” is clicked, btn_About_Click(Object, EventArgs) which is presumably called when button “btn_About” is clicked…
Since this is an application to practice, it has not a lot of forms and functions, which makes things easier for us; we can now say that what interests us is what the function btn_Chk_Click () has to say. We want to know what our Crack-ME actually does once we click on btn_Chk, and this can be translated to the language we choose (refer to Figure 2).
To see actual source code, double click on the function name. Reflector shows us the decompiled source code in the language chosen, in this example we will get:
Nothing magical, we can see clearly as sun what this function does, everyone with very basic knowledge in any programming language can see that this function / procedure tests if the password typed is “p@55w0rd!”, and by checking this we get:
This is pretty easy! We can explore more abilities like patching this Crack-ME to accept all typed passwords, which will be certainly the next part of this course.
I tried to avoid giving you complex explanations regarding dot NET application and decided not to flood you in this first part with information that may discourage you. I tried to show you the clearest way to deal with a very basic protection, hopefully this taught you few more things that matter while reversing dot NET programs. In the next parts, I’ll try to introduce more in-depth knowledge about how dot NET programs really work