C Code in Assembly

September 16, 2019 by Srinivas

Reverse engineering analysts need a good grasp of the C language and of how it is converted into assembly listings. C was designed to work as a kind of shorthand for assembly language, which is efficient but time-consuming to write. C keeps much of that efficiency through code constructs that map closely onto machine instructions.

This article is an overview of C code in assembly, including variables, “if” statements, “for” and “while” loops, switch statements, arrays, structs, and the stack and heap.

Variables

Variables hold values. Depending on where it is declared, a variable is either local or global. A local variable is accessible only within the function that declares it, whereas a global variable can be accessed from anywhere in the program.

The following C program uses both a global and a local variable.

#include <stdio.h>

int a = 10; /* global variable */

int main(void)
{
    int b = 20; /* local variable */

    a = a + b;
    printf("The new value of a is %d\n", a);
    return 0;
}

[Figure: assembly equivalent of the preceding code, as generated by OllyDbg when the executable is loaded.]
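As a rough guide to reading such a listing: in unoptimized x86 code, a global variable is typically referenced through a fixed memory address, while a local variable is referenced relative to the current stack frame. The sketch below makes that difference observable from C. The names `counter`, `global_address`, `compare_frames` and `nested_locals_differ` are hypothetical and introduced only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

int counter = 10; /* global: one fixed location for the whole program */

/* Returns the address of the global; it is the same on every call. */
uintptr_t global_address(void)
{
    return (uintptr_t)&counter;
}

/* Compares the address of a local in two nested calls of this function. */
static int compare_frames(int depth, const int *outer_local)
{
    int local = depth; /* local: lives in this call's stack frame */

    if (depth == 0) {
        return compare_frames(1, &local); /* nest one call deeper */
    }
    return outer_local != &local; /* two live frames, two addresses */
}

/* Returns 1 when the two nested locals occupy different addresses. */
int nested_locals_differ(void)
{
    return compare_frames(0, NULL);
}
```

Calling `global_address()` twice yields the same value, whereas `nested_locals_differ()` reports that each active call has its own copy of `local`.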

“If” statements

“If” statements are commonly used in C programming to change the control flow based on a condition. The following example uses an if/else statement.

#include <stdio.h>

int main(void)
{
    int a = 30;
    int b = 20;

    if (a > b) {
        printf("a is greater than b\n");
    }
    else {
        printf("b is greater than a\n");
    }
    return 0;
}

[Figure: assembly equivalent of the preceding code, as generated by OllyDbg when the executable is loaded.]
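In a listing like this, an if/else usually appears as one comparison followed by a conditional jump, with the condition often inverted so that the jump skips the "then" body. The sketch below imitates that shape in C using goto. The function names are hypothetical, and the mnemonics in the comments are indicative of typical unoptimized x86 output rather than taken from a real disassembly.

```c
/* Structured form, equivalent to the example above. */
const char *compare_structured(int a, int b)
{
    if (a > b) {
        return "a is greater than b";
    } else {
        return "b is greater than a";
    }
}

/* Hand-lowered form: a comparison plus jumps, as a compiler might emit. */
const char *compare_lowered(int a, int b)
{
    const char *msg;

    if (a <= b) goto else_branch; /* roughly: cmp a, b / jle else_branch */
    msg = "a is greater than b";  /* "then" body is the fall-through path */
    goto done;                    /* roughly: jmp done (skip the else body) */
else_branch:
    msg = "b is greater than a";
done:
    return msg;
}
```

Both functions return the same message for any pair of inputs; only the second one mirrors the compare-and-jump layout to look for in a disassembler.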

Loops

Loops execute a block of statements repeatedly until a given condition evaluates to false. “For” loops have the following syntax:

for (initialization; condition; increment/decrement)
{
    // code to be executed until the condition fails
}

The following program is an example of a “for” loop in C.

#include <stdio.h>

int main(void)
{
    int i;

    for (i = 0; i < 7; i++) {
        printf("value of i is %d\n", i);
    }
    return 0;
}

[Figure: assembly equivalent of the preceding code, as generated by OllyDbg when the executable is loaded.]
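A “for” loop can be recognized in a listing by its four parts: initialization, comparison, body and increment, connected by one jump back and one conditional jump out. The sketch below spells out that layout in C with goto. `count_for_lowered` is a hypothetical name, and the ordering of the parts is an assumption based on typical unoptimized compiler output.

```c
/* Counts the iterations of for (i = 0; i < limit; i++) using only jumps. */
int count_for_lowered(int limit)
{
    int iterations = 0;
    int i;

    i = 0;                     /* initialization */
    goto check;                /* first pass skips the increment */
increment:
    i++;                       /* increment */
check:
    if (i >= limit) goto done; /* comparison: leave when i < limit fails */
    iterations++;              /* body */
    goto increment;            /* jump back toward the comparison */
done:
    return iterations;
}
```

With `limit` set to 7, as in the example program, the body runs seven times.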

“While” loops are another commonly used construct with a purpose similar to “for” loops: they execute a block of statements repeatedly for as long as a given condition holds. “While” loops have the following syntax:

while (condition) {
    // code to be executed until the condition fails
}

The following is an example of a while loop in C.

#include <stdio.h>

int main(void)
{
    int i = 0;

    while (i < 7) {
        printf("value of i is %d\n", i);
        i++;
    }
    return 0;
}

When going through the disassembly of malware samples, one should be able to identify these for and while loops, as they are common in malware.

[Figure: assembly equivalent of the preceding code, as generated by OllyDbg when the executable is loaded.]
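In a listing, a “while” loop looks much like a “for” loop without a separate increment section: a conditional jump leaves the loop and an unconditional jump returns to the comparison. The sketch below shows that shape in C with goto. `sum_while_lowered` is a hypothetical name chosen for illustration.

```c
/* Sums 0 .. limit-1 with the jump structure of while (i < limit). */
int sum_while_lowered(int limit)
{
    int i = 0;
    int sum = 0;

check:
    if (i >= limit) goto done; /* conditional jump out of the loop */
    sum += i;                  /* body */
    i++;                       /* the increment is part of the body */
    goto check;                /* unconditional jump back to the test */
done:
    return sum;
}
```

For a limit of 7 the function adds 0 through 6 and returns 21.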

Switch statements

Another commonly used construct in C is the switch statement. A switch statement groups several code blocks under case labels and executes one of them. It does this by evaluating an expression and comparing the result with the value of each case label.

The following is the syntax of switch statements in C.

switch (expression)
{
    case value1:
        // statements to be executed
        break;
    case value2:
        // statements to be executed
        break;
    case value3:
        // statements to be executed
        break;
    default:
        // statements to be executed
        break;
}

The following code is an example of how switch statements can be used in C programming.

#include <stdio.h>

int main(void)
{
    int i = 3;

    switch (i)
    {
    case 1:
        printf("Value entered is 1\n");
        break;
    case 2:
        printf("Value entered is 2\n");
        break;
    case 3:
        printf("Value entered is 3\n");
        break;
    default:
        printf("Value out of range\n");
        break;
    }
    return 0;
}

[Figure: assembly equivalent of the preceding code, as generated by OllyDbg when the executable is loaded.]
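Compilers commonly translate a switch either into a chain of comparisons, much like nested “if” statements, or into a jump table indexed by the switch value. The sketch below models the jump-table form in C with an array of function pointers. `case_handler`, `jump_table`, `dispatch` and the handler functions are hypothetical names, and a real compiler would jump to code addresses rather than call functions.

```c
#include <stddef.h>

typedef const char *(*case_handler)(void);

static const char *case_one(void)   { return "Value entered is 1"; }
static const char *case_two(void)   { return "Value entered is 2"; }
static const char *case_three(void) { return "Value entered is 3"; }

/* One entry per case label, in ascending order of case value. */
static const case_handler jump_table[] = { case_one, case_two, case_three };

const char *dispatch(int i)
{
    /* Rebase so that case 1 maps to index 0. Values below 1 wrap around
       to large unsigned numbers, so one bounds check covers both ends. */
    unsigned int index = (unsigned int)i - 1u;
    size_t entries = sizeof jump_table / sizeof jump_table[0];

    if (index >= entries) {
        return "Value out of range"; /* default case */
    }
    return jump_table[index]();      /* indexed, indirect transfer */
}
```

The subtraction followed by a single unsigned comparison is a useful pattern to recognize: in a disassembly it often marks the start of a table-driven switch.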

Arrays and structs

Arrays and structs are both used to store multiple items. The difference is that an array holds elements of the same type, while a struct can hold elements of different types.

Malware authors can use arrays and structs in their code, so it is important to be able to identify these constructs when examining the disassembly of an executable.

The following snippet is an example of how arrays can be used in C programming.

#include <stdio.h>

int arr[] = {5, 8, 7, 1};

int main(void)
{
    int i;

    for (i = 0; i < 4; i++) {
        printf("value from array is %d\n", arr[i]);
    }
    return 0;
}
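In disassembly, an array access usually shows up as a base address plus the index multiplied by the element size. The sketch below performs that address calculation by hand in C. `read_element` is a hypothetical helper, and the scaled-index form mentioned in the comment is an assumption about typical x86 output.

```c
#include <stddef.h>
#include <string.h>

int arr[] = {5, 8, 7, 1};

/* Reads arr[i] by computing base + i * sizeof(int) explicitly.
   The caller must keep i within the bounds of the array. */
int read_element(size_t i)
{
    const unsigned char *base = (const unsigned char *)arr;
    int value;

    /* On x86 this often appears as a single scaled-index memory operand. */
    memcpy(&value, base + i * sizeof(int), sizeof value);
    return value;
}
```

For every valid index, `read_element(i)` returns the same value as `arr[i]`.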

The following snippet is an example of how structs can be used in C programming.

#include <stdio.h>
#include <string.h>

struct car
{
    int id;
    char brand[10];
};

int main(void)
{
    struct car entry = {0};

    entry.id = 1;
    strcpy(entry.brand, "Audi");
    printf("The brand is %s\n", entry.brand);
    return 0;
}
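Struct members appear in disassembly as fixed offsets from the address of the struct, with no index calculation. The sketch below reads the members of `struct car` through those offsets using the standard `offsetof` macro. `id_by_offset` and `brand_by_offset` are hypothetical helper names, and the exact offset of `brand` depends on the compiler and platform.

```c
#include <stddef.h>
#include <string.h>

struct car
{
    int id;
    char brand[10];
};

/* Reads the id member through its byte offset from the struct's base. */
int id_by_offset(const struct car *entry)
{
    int id;

    memcpy(&id, (const char *)entry + offsetof(struct car, id), sizeof id);
    return id;
}

/* Returns a pointer to the brand member, computed as base + offset. */
const char *brand_by_offset(const struct car *entry)
{
    return (const char *)entry + offsetof(struct car, brand);
}
```

The first member always sits at offset 0. A run of accesses at several different constant offsets from one base pointer is a common hint that the code is working with a struct.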

Stack and heap

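It is important to be aware of how the stack and the heap are used. The stack holds automatically allocated data such as local variables, and the heap is used for dynamic memory allocation at run time. The stack is also central to function calls. Under stack-based calling conventions such as cdecl on 32-bit x86, the caller pushes the function arguments onto the stack, the call instruction pushes the return address, and the called function then reserves stack space for its local variables before its body executes.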
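A minimal sketch of the difference, using the standard `malloc` and `free` functions for the heap and an ordinary local array for the stack. The helper names `make_brand_on_heap` and `brand_length_via_stack`, and the 16-byte buffer size, are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Heap: the copy outlives this call, so the caller must free() it.
   Returns NULL if the allocation fails. */
char *make_brand_on_heap(const char *name)
{
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);

    if (copy != NULL) {
        memcpy(copy, name, len);
    }
    return copy;
}

/* Stack: scratch[] exists only while this call is active, so the
   function returns a value computed from it, never its address. */
size_t brand_length_via_stack(const char *name)
{
    char scratch[16];

    strncpy(scratch, name, sizeof scratch - 1);
    scratch[sizeof scratch - 1] = '\0'; /* strncpy may not terminate */
    return strlen(scratch);
}
```

In a disassembly, the heap version is recognizable by its call to an allocator such as malloc, while the stack version only adjusts the stack pointer in the function prologue to make room for the buffer.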

Conclusion

Reverse engineering malware requires analysts to understand how C code translates into assembly instructions. They need a foundation of knowledge that covers the purpose and behavior of C constructs such as variables, “if” statements, switch statements, arrays, structs, linked lists, stacks and heaps. In particular, they should be able to recognize “for” and “while” loops, which are common in malware.

This article has provided an overview of C programming concepts alongside example snippets. Regardless of skill level, understanding how C code appears in assembly will prove useful to students and professionals alike in their ongoing code analysis.

Srinivas

Srinivas is an information security professional with 4 years of industry experience in web, mobile and infrastructure penetration testing. He is currently a security researcher at Infosec Institute Inc. He holds the Offensive Security Certified Professional (OSCP) certification. He blogs at www.androidpentesting.com. Email: srini0x00@gmail.com