Introduction

The first article of this series covered the most significant aspects of MSIL code instructions: how a program is written in ILASM, and how to define its basic components (classes, fields, functions, and methods). In this article, we continue with the IL data types and opcode instructions, and learn the remaining sophisticated features of the runtime and ILASM (interfaces, boxing, and branching). We will analyze each opcode instruction in detail, see how to integrate IL code into existing high-level C# code, and see how to convert already-built C# code directly into IL code, which frees the programmer from writing complex IL instructions by hand.

CIL Data Types

Like other high-level languages, CIL provides a set of data types, so that each piece of data is stored in a suitably typed location. The following table shows how each C# keyword maps to its CIL data type, its .NET base class type, and the corresponding CIL constant.

C# Data Type   CIL Data Type     .NET Base Class   CIL Constant
int            int32             System.Int32      I4
byte           unsigned int8     System.Byte       U1
sbyte          int8              System.SByte      I1
short          int16             System.Int16      I2
uint           unsigned int32    System.UInt32     U4
long           int64             System.Int64      I8
char           char              System.Char       CHAR
float          float32           System.Single     R4
double         float64           System.Double     R8
bool           bool              System.Boolean    BOOLEAN
string         string            System.String     -
object         object            System.Object     -
void           void              System.Void       VOID
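
The keyword-to-base-class column of the table can be checked from C# itself. The following is only a minimal sketch (the class name TypeMapDemo is made up for illustration) that prints the .NET type behind a few C# keywords, along with the sizes implied by the CIL type names:

```csharp
using System;

class TypeMapDemo
{
    static void Main()
    {
        // Each C# keyword is an alias for a .NET base class type.
        Console.WriteLine(typeof(int).FullName);     // System.Int32
        Console.WriteLine(typeof(byte).FullName);    // System.Byte
        Console.WriteLine(typeof(float).FullName);   // System.Single
        Console.WriteLine(typeof(bool).FullName);    // System.Boolean
        Console.WriteLine(typeof(string).FullName);  // System.String

        // sizeof confirms the widths implied by the CIL names
        // (int32 is 4 bytes, int64 is 8 bytes).
        Console.WriteLine(sizeof(int));   // 4
        Console.WriteLine(sizeof(long));  // 8
    }
}
```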

MSIL Code Labels

You may have noticed in the sample code of the earlier article that each line of implementation is prefixed with a special token of the form IL_XXXX (e.g., IL_0000, IL_0002). These tokens are called code labels. They are completely optional and can be named in any manner. When we dump an assembly with ILDASM.exe, it generates the code labels automatically, but you may change them to make the code more descriptive. We can extract the token information from an assembly by using the following command:

ILDASM /Token test.exe

This command produces the corresponding token information, together with the IL_XXXX labels, as follows:

.method /*06000001*/ private hidebysig static 
        void  Main(string[] args) cil managed
{
  .entrypoint
  .maxstack  2
  .locals init ([0] string str)

  IL_0000:  nop
  IL_0001:  ldstr      "Ajay"
  IL_0006:  stloc.0
  IL_0007:  ldstr      "Hello" 
  IL_000c:  ldloc.0
  IL_000d:  call       string 
  IL_0012:  call       void 
  IL_0017:  nop
  IL_0018:  ret
} 

We can rename the labels to something more descriptive. Because labels are optional, any name will do, as long as it is a valid identifier (no spaces) and is unique within the method:

Nothing_1:       nop
Load_String:     ldstr      "Ajay"
Memory_Loca1:    stloc.0
Load_Constant:   ldstr      "Hello" 
Memory_Loca2:    ldloc.0
Print_console:   call       string 
Call_Method:     call       void 
Nothing_2:       nop
Leave_Function:  ret

MSIL Opcodes

This section explains the various MSIL instructions, which are generally termed opcodes (operation codes). Some of these instructions already appeared in the sample code of the previous article, but they have not been reviewed in detail so far. An opcode is a CIL token used to build the implementation logic. For example, if you need to load a string onto the stack, you have to use the ldstr opcode rather than a friendlier name such as "Load String". The complete set of CIL opcodes can be grouped into the following broad categories:

  • Retrieve Instructions
  • Control Instructions
  • Operations Instructions

Retrieve (Store) Instruction

Opcode   Description
starg    Pops the value on top of the stack and stores it in a method argument
stloc    Pops the value on top of the stack and stores it in a local variable
stobj    Copies a value of a specified type from the stack to a supplied memory address
pop      Removes the value currently on top of the stack

Control Instruction

Opcode                   Description
ret                      Exits a method, returning the value on top of the stack (if any) to the caller
call                     Calls a method on a given type
add, sub, mul, div, rem  Add, subtract, multiply, divide, or compute the remainder of two values
box, unbox               Convert a value type to a reference type and vice versa
and, or, not, xor        Perform bitwise operations
ceq, clt, cgt            Compare two values (equal, less than, greater than)
blt, bgt, beq, ble       Branch to a target label based on a comparison (less than, greater than, equal, less than or equal)

Operation (load) Instruction

Opcode   Description
ldstr    (Load string) pushes a string literal onto the stack
ldarg    (Load argument) pushes a method argument onto the stack
ldc      (Load constant) pushes a constant value onto the stack
ldobj    (Load object) copies the value stored at a memory address onto the stack

Other Instruction

Opcode      Description
brtrue.s    (Branch if true) jumps to the target label if the value on the stack is non-zero
brfalse.s   (Branch if false) jumps to the target label if the value on the stack is zero
br.s        Jumps unconditionally to another instruction
castclass   Casts an instance to a different type
throw       Throws an exception
rethrow     Rethrows the current exception from within a catch handler
break       Inserts a debugger breakpoint
callvirt    Calls a virtual method

Details Analysis of Opcode Instruction

We have concentrated on individual opcode instructions up till now. In order to understand the meaning of each opcode instruction in detail, we present some more complex sample code that covers several tasks, such as executing a loop and creating new class types. Our main aim is to encounter as many different instructions as possible.

The following C# code adds its two integer parameters:

public int Operation(int a,int b)
 {
    return (a + b);
 }

The code above compiles to the following CIL code, which can be read opcode by opcode as follows:

.method public hidebysig instance int32 Operation(int32 a,int32 b) cil managed
  {
    .maxstack  2
    .locals init ([0] int32 V_0)    // Compiler-generated local that holds the return value
    IL_0000:  nop                   // No operation
    IL_0001:  ldarg.1               // Push argument "a" onto the stack (argument 0 is "this")
    IL_0002:  ldarg.2               // Push argument "b" onto the stack
    IL_0003:  add                   // Pop both values and push a + b
    IL_0004:  stloc.0               // Pop the sum and store it in local 0
    IL_0005:  br.s       IL_0007    // Jump to IL_0007

    IL_0007:  ldloc.0               // Push local 0 back onto the stack
    IL_0008:  ret                   // Return the value on top of the stack
  }
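
The same opcode sequence can be tried out without ILASM by emitting it at run time. The sketch below is only an illustration, not part of the original sample: it uses System.Reflection.Emit.DynamicMethod, and the names EmitAddDemo and "Add" are made up. Because the generated method is static, the two operands are arguments 0 and 1 rather than 1 and 2.

```csharp
using System;
using System.Reflection.Emit;

class EmitAddDemo
{
    static void Main()
    {
        // A static dynamic method: int Add(int a, int b).
        // In a static method there is no "this", so a and b are arguments 0 and 1.
        var method = new DynamicMethod("Add", typeof(int), new[] { typeof(int), typeof(int) });
        ILGenerator il = method.GetILGenerator();

        il.Emit(OpCodes.Ldarg_0); // push a
        il.Emit(OpCodes.Ldarg_1); // push b
        il.Emit(OpCodes.Add);     // pop both, push a + b
        il.Emit(OpCodes.Ret);     // return the value on top of the stack

        var add = (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));
        Console.WriteLine(add(2, 3)); // 5
    }
}
```

This version skips the nop, stloc.0, br.s, and ldloc.0 instructions, which the C# compiler adds mainly to support debugging; the result is left on the stack and returned directly.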

Branching

In C#, iteration is performed with the "for", "foreach", and "while" loop constructs. The following C# code runs a for loop whose counter goes from 0 up to (but not including) 7, adds the counter to a running total on each pass, and leaves the loop early once the counter reaches 5, so the total is the sum of the numbers from 0 to 5:

public void braching()
  {
    int x = 0;
    for (int i = 0; i < 7; i++)
      {
        x = x + i;
        if (i == 5)
           break;
      }
    Console.WriteLine(x);
  }

In the compiled code, the "br.s" and "brtrue.s" opcodes, together with the "ceq" and "clt" comparisons, redirect the flow when a condition has been met. The CIL instructions and their labels can be read as follows:

.method public hidebysig instance void braching() cil managed
  {
    .maxstack  2
    .locals init ([0] int32 x, [1] int32 i, [2] bool CS$4$0000)  // Locals "x" and "i", plus a compiler-generated bool
    IL_0000:  nop                 // No operation
    IL_0001:  ldc.i4.0            // Push constant 0 (initial value of "x")
    IL_0002:  stloc.0             // Store it in local 0 ("x")
    IL_0003:  ldc.i4.0            // Push constant 0 (initial value of "i")
    IL_0004:  stloc.1             // Store it in local 1 ("i")
    IL_0005:  br.s       IL_001e  // Jump to the loop condition at IL_001e

    IL_0007:  nop                 // No operation (start of the loop body)
    IL_0008:  ldloc.0             // Push "x" (local 0)
    IL_0009:  ldloc.1             // Push "i" (local 1)
    IL_000a:  add                 // Pop both values and push x + i
    IL_000b:  stloc.0             // Store the sum in local 0 ("x")
    IL_000c:  ldloc.1             // Push "i"
    IL_000d:  ldc.i4.5            // Push constant 5
    IL_000e:  ceq                 // Push 1 if i == 5, otherwise 0
    IL_0010:  ldc.i4.0            // Push constant 0
    IL_0011:  ceq                 // Compare with 0, which negates the result (1 if i != 5)
    IL_0013:  stloc.2             // Store the result in local 2
    IL_0014:  ldloc.2             // Push local 2
    IL_0015:  brtrue.s   IL_0019  // If i != 5, skip the break and continue at IL_0019

    IL_0017:  br.s       IL_0026  // The "break": jump out of the loop to IL_0026

    IL_0019:  nop                 // No operation (end of the loop body)
    IL_001a:  ldloc.1             // Push "i"
    IL_001b:  ldc.i4.1            // Push constant 1
    IL_001c:  add                 // Pop both values and push i + 1
    IL_001d:  stloc.1             // Store the result in local 1 ("i")
    IL_001e:  ldloc.1             // Push "i"
    IL_001f:  ldc.i4.7            // Push constant 7
    IL_0020:  clt                 // Push 1 if i < 7, otherwise 0
    IL_0022:  stloc.2             // Store the result in local 2
    IL_0023:  ldloc.2             // Push local 2
    IL_0024:  brtrue.s   IL_0007  // If i < 7, jump back to the loop body at IL_0007

    IL_0026:  ldloc.0             // Push "x"
    IL_0027:  call       void [mscorlib]System.Console::WriteLine(int32)     // Call Console.WriteLine(x)
    IL_002c:  nop                 // No operation
    IL_002d:  ret                 // Exit from the method
  }
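
To experiment with labels and branch opcodes directly, the loop can also be emitted at run time. The following is a rough sketch rather than the compiler's output: it assumes System.Reflection.Emit.DynamicMethod, uses the made-up names EmitLoopDemo and "SumUntilFive", returns the total instead of printing it, and uses the combined compare-and-branch opcodes beq and blt in place of the ceq/clt plus brtrue.s pairs shown above.

```csharp
using System;
using System.Reflection.Emit;

class EmitLoopDemo
{
    static void Main()
    {
        var method = new DynamicMethod("SumUntilFive", typeof(int), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();

        il.DeclareLocal(typeof(int)); // local 0: x
        il.DeclareLocal(typeof(int)); // local 1: i

        // Labels play the role of the IL_XXXX tokens in ILASM source.
        Label body = il.DefineLabel();
        Label check = il.DefineLabel();
        Label done = il.DefineLabel();

        il.Emit(OpCodes.Ldc_I4_0);
        il.Emit(OpCodes.Stloc_0);      // x = 0
        il.Emit(OpCodes.Ldc_I4_0);
        il.Emit(OpCodes.Stloc_1);      // i = 0
        il.Emit(OpCodes.Br, check);    // jump to the loop condition

        il.MarkLabel(body);
        il.Emit(OpCodes.Ldloc_0);
        il.Emit(OpCodes.Ldloc_1);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Stloc_0);      // x = x + i
        il.Emit(OpCodes.Ldloc_1);
        il.Emit(OpCodes.Ldc_I4_5);
        il.Emit(OpCodes.Beq, done);    // if (i == 5) break;

        il.Emit(OpCodes.Ldloc_1);
        il.Emit(OpCodes.Ldc_I4_1);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Stloc_1);      // i++

        il.MarkLabel(check);
        il.Emit(OpCodes.Ldloc_1);
        il.Emit(OpCodes.Ldc_I4_7);
        il.Emit(OpCodes.Blt, body);    // if (i < 7) run the body again

        il.MarkLabel(done);
        il.Emit(OpCodes.Ldloc_0);
        il.Emit(OpCodes.Ret);          // return x

        var sum = (Func<int>)method.CreateDelegate(typeof(Func<int>));
        Console.WriteLine(sum()); // 15 (0 + 1 + 2 + 3 + 4 + 5)
    }
}
```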

Boxing

Boxing is the process of converting a value type to a reference type (System.Object). When we box a value, the CLR allocates a new object on the heap and copies the value (10 in the example below) into that instance. The opposite operation, unboxing, converts the value held in the object reference back into the corresponding value type, as follows:

static void BoxUnbox()
   {
     int x = 10;
     //Boxed
     object bObj = x;
     //Unboxed
     int y = (int)bObj;
     Console.WriteLine(y);
   }

If you examine the compiled code using ILDASM, you will find the box and unbox instructions in the CIL code, as follows:

.method private hidebysig static void  BoxUnbox() cil managed
{
  // Code size       26 (0x1a)
  .maxstack  1
  .locals init ([0] int32 x,[1] object bObj,[2] int32 y)   // Declare the locals "x", "bObj" and "y"
  IL_0000:  nop                                  // No operation
  IL_0001:  ldc.i4.s   10                        // Push the integer constant 10
  IL_0003:  stloc.0                              // Store it in local 0 ("x")
  IL_0004:  ldloc.0                              // Push "x"
  IL_0005:  box        [mscorlib]System.Int32    // Boxing (value to object)
  IL_000a:  stloc.1                              // Store the object reference in local 1 ("bObj")
  IL_000b:  ldloc.1                              // Push "bObj"
  IL_000c:  unbox.any  [mscorlib]System.Int32    // Unboxing (object to value)
  IL_0011:  stloc.2                              // Store the value in local 2 ("y")
  IL_0012:  ldloc.2                              // Push "y"
  IL_0013:  call       void [mscorlib]System.Console::WriteLine(int32)   // Print "y" via WriteLine()
  IL_0018:  nop                                  // No operation
  IL_0019:  ret                                  // Exit from the method
}
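
Two consequences of the copy made by the box instruction are easy to observe from C#. The short sketch below (the class name BoxingDemo is only illustrative) shows that the boxed object is independent of the original variable, and that unboxing must name the exact type that was boxed:

```csharp
using System;

class BoxingDemo
{
    static void Main()
    {
        int x = 10;
        object boxed = x;   // box: the value 10 is copied into a new heap object
        x = 20;             // changing x does not affect the boxed copy

        Console.WriteLine(boxed);       // 10
        Console.WriteLine((int)boxed);  // 10 (unboxed back to int)

        try
        {
            // Unboxing to a different value type fails, even though
            // an int could normally be converted to a long.
            long wrong = (long)boxed;
            Console.WriteLine(wrong);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException");
        }
    }
}
```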

Interface

An interface can be defined in MSIL directly, using the interface keyword. An interface cannot declare instance fields, and its member functions must be public, abstract, and virtual. A class uses the implements keyword to list the interfaces that it implements, as follows:

.assembly CILComplexTest
{
}
.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89 )
  .ver 4:0:0:0
}
// Interface Definition
.class interface public abstract auto ansi CILComplexTest.Repository
{
  .method public hidebysig newslot abstract virtual  instance void  Display() cil managed
  {
  } // end of method Repository::Display

} // end of class CILComplexTest.Repository

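// Class that implements the interface
.class public auto ansi beforefieldinit CILComplexTest.test  extends [mscorlib]System.Object
                                                                 implements CILComplexTest.Repository
{
  .method public hidebysig newslot virtual final  instance void  Display() cil managed
  {
    // Code size       13 (0xd)
    .maxstack  8
    IL_0000:  nop
    IL_0001:  ldstr      "Hello"
    IL_0006:  call       void [mscorlib]System.Console::WriteLine(string)
    IL_000b:  nop
    IL_000c:  ret
  } // end of method test::Display

  // Constructor, required by the newobj instruction in Main
  .method public hidebysig specialname rtspecialname  instance void  .ctor() cil managed
  {
    // Code size       7 (0x7)
    .maxstack  8
    IL_0000:  ldarg.0
    IL_0001:  call       instance void [mscorlib]System.Object::.ctor()
    IL_0006:  ret
  } // end of method test::.ctor

} // end of class CILComplexTest.test

// Main class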
.class private auto ansi beforefieldinit CILComplexTest.Program extends [mscorlib]System.Object
{
  .method private hidebysig static void  Main(string[] args) cil managed
  {
    .entrypoint
    // Code size       13 (0xd)
    .maxstack  8
    IL_0000:  nop
    IL_0001:  newobj     instance void CILComplexTest.test::.ctor()
    IL_0006:  call       instance void CILComplexTest.test::Display()
    IL_000b:  nop
    IL_000c:  ret
  } // end of method Program::Main
//constructor
  .method public hidebysig specialname rtspecialname  instance void  .ctor() cil managed
  {
    // Code size       7 (0x7)
    .maxstack  8
    IL_0000:  ldarg.0
    IL_0001:  call       instance void [mscorlib]System.Object::.ctor()
    IL_0006:  ret
  } // end of method Program::.ctor
}
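
For comparison, C# source along the following lines would be expected to compile to IL of roughly this shape. This is a reconstruction for illustration only, not the original source of the listing; the type names are taken from the IL above.

```csharp
using System;

namespace CILComplexTest
{
    // Becomes ".class interface public abstract".
    public interface Repository
    {
        void Display();   // becomes "newslot abstract virtual"
    }

    // Becomes a class that "implements CILComplexTest.Repository".
    public class test : Repository
    {
        public void Display()   // becomes "newslot virtual final"
        {
            Console.WriteLine("Hello");
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            // newobj creates the instance, then Display() is called on it.
            new test().Display();
        }
    }
}
```

Note that the C# compiler marks the implementing method as virtual and final even though the source never says so: virtual is needed so the method can fill the interface slot, and final prevents derived classes from overriding it.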

MSIL Code Generation

The .NET Framework offers a utility, ILDASM.exe, that disassembles a compiled C# assembly into MSIL code. This spares us the hassle of writing CIL code by hand, which is one of the most error-prone tasks, because the instructions have an unfamiliar syntax and each one has its own precise meaning.

Suppose we want to write a program in CIL that simply displays a "Hello Ajay" message on the screen. Simple as the program is, writing it directly in MSIL is still complicated, because MSIL opcode instructions do not read like English. However, there is a trick: first write the implementation in user-friendly C# and compile the project, which creates the corresponding executable in the Bin/Debug folder.

using System;

namespace CILComplexTest
{
    class xyz
    {
        private string msg;
        public xyz(string msg)
        {
            this.msg = msg;
        }

        public string display()
        {
            return "Hello " + msg;
        }

    }
    class Program
    {
        static void Main(string[] args)
        {
            xyz obj = new xyz("Ajay");
            Console.WriteLine(obj.display());
            Console.ReadKey();
        }
    }
}

Now open the Visual Studio Command Prompt, go to the project's Bin/Debug folder, and execute the following command to convert the compiled C# program into MSIL code:

ILDASM CILComplexTest.exe /out:test.il


Notice that a test.il file is created in the Bin/Debug folder. It contains the same implementation as its C# counterpart, expressed as IL instructions. You can open this file in any editor and compile it using the ILASM utility. Here is the automatically generated IL code:

//  Microsoft (R) .NET Framework IL Disassembler.  Version 4.0.30319.1
//  Copyright (c) Microsoft Corporation.  All rights reserved.
// Metadata version: v4.0.30319
.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89 )                         // .zV.4..
  .ver 4:0:0:0
}
.assembly CILComplexTest
{
  .custom instance void [mscorlib]System.Runtime.Versioning.TargetFrameworkAttribute::.ctor(string) = ( 01 00 29 2E 4E 45 54 46 72 61 6D 65 77 6F 72 6B   // ..).NETFramework
                                                                                                        2C 56 65 72 73 69 6F 6E 3D 76 34 2E 30 2C 50 72   // ,Version=v4.0,Pr
                                                                                                        6F 66 69 6C 65 3D 43 6C 69 65 6E 74 01 00 54 0E   // ofile=Client..T.
                                                                                                        14 46 72 61 6D 65 77 6F 72 6B 44 69 73 70 6C 61   // .FrameworkDispla
                                                                                                        79 4E 61 6D 65 1F 2E 4E 45 54 20 46 72 61 6D 65   // yName..NET Frame
                                                                                                        77 6F 72 6B 20 34 20 43 6C 69 65 6E 74 20 50 72   // work 4 Client Pr
                                                                                                        6F 66 69 6C 65 )                                  // ofile
  .custom instance void [mscorlib]System.Reflection.AssemblyTitleAttribute::.ctor(string) = ( 01 00 0E 43 49 4C 43 6F 6D 70 6C 65 78 54 65 73   // ...CILComplexTes
                                                                                              74 00 00 )                                        // t..
  .custom instance void [mscorlib]System.Reflection.AssemblyDescriptionAttribute::.ctor(string) = ( 01 00 00 00 00 )
  .custom instance void [mscorlib]System.Reflection.AssemblyConfigurationAttribute::.ctor(string) = ( 01 00 00 00 00 )
  .custom instance void [mscorlib]System.Reflection.AssemblyCompanyAttribute::.ctor(string) = ( 01 00 00 00 00 )
  .custom instance void [mscorlib]System.Reflection.AssemblyProductAttribute::.ctor(string) = ( 01 00 0E 43 49 4C 43 6F 6D 70 6C 65 78 54 65 73   // ...CILComplexTes
                                                                                                74 00 00 )                                        // t..
  .custom instance void [mscorlib]System.Reflection.AssemblyCopyrightAttribute::.ctor(string) = ( 01 00 12 43 6F 70 79 72 69 67 68 74 20 C2 A9 20   // ...Copyright ..
                                                                                                  20 32 30 31 33 00 00 )                            //  2013..
  .custom instance void [mscorlib]System.Reflection.AssemblyTrademarkAttribute::.ctor(string) = ( 01 00 00 00 00 )
  .custom instance void [mscorlib]System.Runtime.InteropServices.ComVisibleAttribute::.ctor(bool) = ( 01 00 00 00 00 )
  .custom instance void [mscorlib]System.Runtime.InteropServices.GuidAttribute::.ctor(string) = ( 01 00 24 31 39 63 36 61 36 65 34 2D 61 64 63 65   // ..$19c6a6e4-adce
                                                                                                  2D 34 33 38 65 2D 61 38 37 31 2D 32 36 62 65 32   // -438e-a871-26be2
                                                                                                  31 37 31 33 61 33 63 00 00 )                      // 1713a3c..
  .custom instance void [mscorlib]System.Reflection.AssemblyFileVersionAttribute::.ctor(string) = ( 01 00 07 31 2E 30 2E 30 2E 30 00 00 )             // ...1.0.0.0..

  // --- The following custom attribute is added automatically, do not uncomment -------
  //  .custom instance void [mscorlib]System.Diagnostics.DebuggableAttribute::.ctor(valuetype [mscorlib]System.Diagnostics.DebuggableAttribute/DebuggingModes) = ( 01 00 07 01 00 00 00 00 )

  .custom instance void [mscorlib]System.Runtime.CompilerServices.CompilationRelaxationsAttribute::.ctor(int32) = ( 01 00 08 00 00 00 00 00 )
  .custom instance void [mscorlib]System.Runtime.CompilerServices.RuntimeCompatibilityAttribute::.ctor() = ( 01 00 01 00 54 02 16 57 72 61 70 4E 6F 6E 45 78   // ....T..WrapNonEx
                                                                                                             63 65 70 74 69 6F 6E 54 68 72 6F 77 73 01 )       // ceptionThrows.
  .hash algorithm 0x00008004
  .ver 1:0:0:0
}
.module CILComplexTest.exe
// MVID: {631F60E4-6E43-4355-BC70-DAF16F1FE33A}
.imagebase 0x00400000
.file alignment 0x00000200
.stackreserve 0x00100000
.subsystem 0x0003       // WINDOWS_CUI
.corflags 0x00000003    //  ILONLY 32BITREQUIRED
// Image base: 0x003E0000

// =============== CLASS MEMBERS DECLARATION ===================

.class private auto ansi beforefieldinit CILComplexTest.xyz
       extends [mscorlib]System.Object
{
  .field private string msg
  .method public hidebysig specialname rtspecialname
          instance void  .ctor(string msg) cil managed
  {
    // Code size       17 (0x11)
    .maxstack  8
    IL_0000:  ldarg.0
    IL_0001:  call       instance void [mscorlib]System.Object::.ctor()
    IL_0006:  nop
    IL_0007:  nop
    IL_0008:  ldarg.0
    IL_0009:  ldarg.1
    IL_000a:  stfld      string CILComplexTest.xyz::msg
    IL_000f:  nop
    IL_0010:  ret
  } // end of method xyz::.ctor

  .method public hidebysig instance string
          display() cil managed
  {
    // Code size       22 (0x16)
    .maxstack  2
    .locals init (string V_0)
    IL_0000:  nop
    IL_0001:  ldstr      "Hello "
    IL_0006:  ldarg.0
    IL_0007:  ldfld      string CILComplexTest.xyz::msg
    IL_000c:  call       string [mscorlib]System.String::Concat(string,
                                                                string)
    IL_0011:  stloc.0
    IL_0012:  br.s       IL_0014

    IL_0014:  ldloc.0
    IL_0015:  ret
  } // end of method xyz::display

} // end of class CILComplexTest.xyz

.class private auto ansi beforefieldinit CILComplexTest.Program
       extends [mscorlib]System.Object
{
  .method private hidebysig static void  Main(string[] args) cil managed
  {
    .entrypoint
    // Code size       31 (0x1f)
    .maxstack  2
    .locals init (class CILComplexTest.xyz V_0)
    IL_0000:  nop
    IL_0001:  ldstr      "Ajay"
    IL_0006:  newobj     instance void CILComplexTest.xyz::.ctor(string)
    IL_000b:  stloc.0
    IL_000c:  ldloc.0
    IL_000d:  callvirt   instance string CILComplexTest.xyz::display()
    IL_0012:  call       void [mscorlib]System.Console::WriteLine(string)
    IL_0017:  nop
    IL_0018:  call       valuetype [mscorlib]System.ConsoleKeyInfo [mscorlib]System.Console::ReadKey()
    IL_001d:  pop
    IL_001e:  ret
  } // end of method Program::Main

  .method public hidebysig specialname rtspecialname
          instance void  .ctor() cil managed
  {
    // Code size       7 (0x7)
    .maxstack  8
    IL_0000:  ldarg.0
    IL_0001:  call       instance void [mscorlib]System.Object::.ctor()
    IL_0006:  ret
  } // end of method Program::.ctor

} // end of class CILComplexTest.Program

Summary

This article provided an overview of the CIL data types and opcode instructions, and analyzed the meaning of each opcode in detail. We also looked at more complex constructs, such as boxing, unboxing, branching, and interfaces, in the form of CIL opcodes. Finally, we took an introductory look at converting an existing C# program into MSIL opcode instructions using the ILDASM utility.