Debugging your first x86 program

February 15, 2021 by Srinivas

Debugging compiled programs is an important part of learning x86 assembly language. When working with assembly programs, the most practical way to step through every single instruction in the code is to run the program under a debugger.

GDB is one of the most popular debuggers available for debugging Linux-based executables. GDB is also extensively used in exploit development and reverse engineering. This article focuses on understanding how to use GDB to step through the instructions of a given x86 assembly program.

See the previous article in the series, How to build a program and execute an application entirely built in x86 assembly.

The target executable

Before we start using GDB, we need a target to debug. We will prepare a simple binary written in x86 assembly and then use GDB against it to understand how GDB can be used to debug binaries. The following is the program we will use.

global _start

_start:

mov eax, 8

mov eax, 0xa

mov ebx, eax

mov ecx, [esp]

We created a file named mov.nasm. It starts with a directive called global, which exports the _start symbol so that the linker can see it; ld uses _start as the entry point of the program by default. The first instruction, MOV EAX, 8, moves the value 8 into the register EAX.

The next instruction moves 0xa into the EAX register; 0xa is a hexadecimal value equal to decimal 10. Next, the MOV EBX, EAX instruction copies the value of one register into another register. Lastly, the MOV ECX, [ESP] instruction copies the value stored at the memory address held in the ESP register into ECX.

In other words, the square brackets in MOV ECX, [ESP] mean that ESP is treated as a pointer: the CPU takes the address held in ESP, reads the value stored at that address and copies it into ECX.
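
To make the effect of these four instructions concrete, here is a minimal Python sketch that models the registers and a single stack slot as dictionaries. It is only a toy model and not how the CPU or GDB works internally. The function name run_mov_program, the ESP address and the value stored at that address are illustrative assumptions; the real values are only known once the program is running.

```python
# Toy model of the four MOV instructions in mov.nasm.
# The ESP address and the value stored there are illustrative assumptions.

def run_mov_program(esp, memory):
    """Return the register values after the four MOV instructions."""
    regs = {"eax": 0, "ebx": 0, "ecx": 0, "esp": esp}

    regs["eax"] = 8                    # mov eax, 8     (immediate -> register)
    regs["eax"] = 0xA                  # mov eax, 0xa   (overwrites 8 with decimal 10)
    regs["ebx"] = regs["eax"]          # mov ebx, eax   (register -> register)
    regs["ecx"] = memory[regs["esp"]]  # mov ecx, [esp] (value at the address in ESP)
    return regs


stack = {0xFFFFD220: 1}                # one 32-bit value at the address held in ESP
final = run_mov_program(0xFFFFD220, stack)
print(final)
```

Note that the last line of the function reads memory at the address stored in ESP; ESP itself is left unchanged.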

Now let's use nasm to assemble this program. Let's type the following command.

nasm mov.nasm -o mov.o -f elf32

The format is going to be elf32 and the output file will be mov.o. Now let's link it using ld. This can be done using the following command.

ld mov.o -o mov -m elf_i386

mov is going to be the final binary. We will debug this.
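
If you rebuild the binary often, the two commands above can be wrapped in a small helper. The following Python sketch is one possible way to do that; the function names build_commands and build are hypothetical, and the nasm and ld arguments are exactly the ones used above. It assumes nasm and ld are installed and that the source file is in the current directory.

```python
import shutil
import subprocess


def build_commands(name):
    """Return the assemble and link commands for <name>.nasm as argument lists."""
    assemble = ["nasm", f"{name}.nasm", "-o", f"{name}.o", "-f", "elf32"]
    link = ["ld", f"{name}.o", "-o", name, "-m", "elf_i386"]
    return [assemble, link]


def build(name):
    """Run both steps in order and raise an exception if either one fails."""
    for command in build_commands(name):
        if shutil.which(command[0]) is None:
            raise FileNotFoundError(f"{command[0]} is not installed or not on PATH")
        subprocess.run(command, check=True)


# Only print the commands here; call build("mov") to actually run them.
for command in build_commands("mov"):
    print(" ".join(command))
```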

Debugging using GDB and GEF

To examine the registers and the values being moved into them, let's open this program in gdb with the following command.

gdb ./mov

It looks as follows.

$ gdb ./mov

GNU gdb (Ubuntu 9.1-0ubuntu1) 9.1

Copyright (C) 2020 Free Software Foundation, Inc.

License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software: you are free to change and redistribute it.

There is NO WARRANTY, to the extent permitted by law.

Type "show copying" and "show warranty" for details.

This GDB was configured as "x86_64-linux-gnu".

Type "show configuration" for configuration details.

For bug reporting instructions, please see:

<http://www.gnu.org/software/gdb/bugs/>.

Find the GDB manual and other documentation resources online at:

<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".

Type "apropos word" to search for commands related to "word"...

GEF for linux ready, type `gef' to start, `gef config' to configure

78 commands loaded for GDB 9.1 using Python engine 3.8

[*] 2 commands could not be loaded, run `gef missing` to know why.

Reading symbols from ./mov...

(No debugging symbols found in ./mov)

gef➤

We are seeing a GEF prompt instead of the plain GDB prompt because GEF has been installed for gdb in this case. This makes debugging a program much easier. GEF is an extension that automates a variety of commonly used GDB commands and displays the results in a structured, color-coded layout inside the terminal.

Setting up a breakpoint

Now let us set up a breakpoint at the entry point of this program. Type the following.

gef➤  break _start

Breakpoint 1 at 0x8049000

gef➤

_start is the entry point of this program, and we use the break command to set a breakpoint there. Next, type run to start the program. Execution pauses at the entry point because of the breakpoint we just set.

gef➤  run

Starting program: /home/dev/x86/mov

Breakpoint 1, 0x08049000 in _start ()

[ Legend: Modified register | Code | Heap | Stack | String ]

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── registers ────

$eax   : 0x0

$ebx   : 0x0

$ecx   : 0x0

$edx   : 0x0

$esp   : 0xffffd220  →  0x00000001

$ebp   : 0x0

$esi   : 0x0

$edi   : 0x0

$eip   : 0x08049000  →  <_start+0> mov eax, 0x8

$eflags: [zero carry parity adjust sign trap INTERRUPT direction overflow resume virtualx86 identification]

$cs: 0x0023 $ss: 0x002b $ds: 0x002b $es: 0x002b $fs: 0x0000 $gs: 0x0000

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── stack ────

0xffffd220│+0x0000: 0x00000001 ← $esp

0xffffd224│+0x0004: 0xffffd3d3  →  "/home/dev/x86/mov"

0xffffd228│+0x0008: 0x00000000

0xffffd22c│+0x000c: 0xffffd3e5  →  "SHELL=/bin/bash"

0xffffd230│+0x0010: 0xffffd3f5  →  "SESSION_MANAGER=local/x86-64:@/tmp/.ICE-unix/1721,[...]"

0xffffd234│+0x0014: 0xffffd447  →  "QT_ACCESSIBILITY=1"

0xffffd238│+0x0018: 0xffffd45a  →  "COLORTERM=truecolor"

0xffffd23c│+0x001c: 0xffffd46e  →  "XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg"

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── code:x86:32 ────

0x8048ffa                  add    BYTE PTR [eax], al

0x8048ffc                  add    BYTE PTR [eax], al

0x8048ffe                  add    BYTE PTR [eax], al

→  0x8049000 <_start+0>       mov    eax, 0x8

0x8049005 <_start+5>       mov    eax, 0xa

0x804900a <_start+10>      mov    ebx, eax

0x804900c <_start+12>      mov    ecx, DWORD PTR [esp]

0x804900f                  add    BYTE PTR [eax], al

0x8049011                  add    BYTE PTR [eax], al

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────

[#0] Id 1, Name: "mov", stopped 0x8049000 in _start (), reason: BREAKPOINT

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── trace ────

[#0] 0x8049000 → _start()

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

gef➤

The breakpoint has been hit and the program execution is paused. There are a few things to observe in the preceding output.

  1. Code - This is the area where we can see what instruction is going to be executed next.
  2. Stack - This area shows the stack. It is good to know where the stack is in the output, as the stack will be used when dealing with subroutines.
  3. Registers - This section shows all the registers and the values currently stored in them.

When debugging programs, we predominantly use the registers and the instructions in the code section. We may also use the stack occasionally.

The EIP register

Now, let's take a look at the EIP register.

$eip   : 0x08049000  →  <_start+0> mov eax, 0x8

The value in the EIP register is 0x08049000. Let us also take a look at the address of our first instruction that is about to be executed.

→  0x8049000 <_start+0>       mov    eax, 0x8

0x8049005 <_start+5>       mov    eax, 0xa

0x804900a <_start+10>      mov    ebx, eax

0x804900c <_start+12>      mov    ecx, DWORD PTR [esp]

0x804900f                  add    BYTE PTR [eax], al

0x8049011                  add    BYTE PTR [eax], al

If you look at this address, it is the same as the value we have seen in the EIP register. This means that EIP always holds the address of the next instruction to be executed. Now let's execute this instruction and see what happens to the value of EAX. These are the registers before the step.

$eax   : 0x0     

$ebx   : 0x0

$ecx   : 0x0

$edx   : 0x0

$esp   : 0xffffd220  →  0x00000001

$ebp   : 0x0

$esi   : 0x0

$edi   : 0x0

$eip   : 0x08049000  →  <_start+0> mov eax, 0x8

Currently, it's zero. Once the first instruction in our program is executed, EAX should contain 8. Let's type si and hit Enter. The si command performs a single step, which means it executes one instruction. Let us see what happened to the EAX register.

$eax   : 0x8       

$ebx   : 0x0

$ecx   : 0x0

$edx   : 0x0

$esp   : 0xffffd220  →  0x00000001

$ebp   : 0x0

$esi   : 0x0

$edi   : 0x0

$eip   : 0x08049005  →  <_start+5> mov eax, 0xa

As you can see, the value 0x8 has been moved into the register EAX, and EIP now holds the address of the next instruction.
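
The single step moved EIP from 0x08049000 to 0x08049005, which is 5 bytes, the size of the instruction that was just executed. The following Python sketch works out the size of each instruction from the addresses shown in the code pane and replays how EIP advances. It is only an illustration: the helper name instruction_lengths is hypothetical, and the simple addition holds for straight-line code like ours, not for jumps or calls.

```python
# Addresses copied from the GEF code pane; the last entry is the first
# address after our four instructions.
addresses = [0x8049000, 0x8049005, 0x804900A, 0x804900C, 0x804900F]
mnemonics = [
    "mov eax, 0x8",
    "mov eax, 0xa",
    "mov ebx, eax",
    "mov ecx, DWORD PTR [esp]",
]


def instruction_lengths(addrs):
    """Size of each instruction: address of the next one minus its own address."""
    return [nxt - cur for cur, nxt in zip(addrs, addrs[1:])]


lengths = instruction_lengths(addresses)
eip = addresses[0]
for text, size in zip(mnemonics, lengths):
    print(f"eip=0x{eip:08x}  {text:<26} ({size} bytes)")
    eip += size  # effect of one si on EIP in straight-line code
```

The instructions have different sizes because x86 uses a variable-length encoding, which is why the addresses in the code pane are not evenly spaced.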

Setting up breakpoints at a specific address

Now we can type si to execute the next instruction, but I'd like to show you another feature.

We already set one breakpoint earlier at the entry point of this program. There is another way to set breakpoints, i.e., using the address of an instruction. For instance, let's say we want to set a breakpoint at the address 0x804900c. When execution reaches this address, the program pauses. So, let's try that. Following is the code section; 0x804900c is the address of the mov ecx, DWORD PTR [esp] instruction.

→  0x8049000 <_start+0>       mov    eax, 0x8

0x8049005 <_start+5>       mov    eax, 0xa

0x804900a <_start+10>      mov    ebx, eax

0x804900c <_start+12>      mov    ecx, DWORD PTR [esp]

0x804900f                  add    BYTE PTR [eax], al

0x8049011                  add    BYTE PTR [eax], al

Following is the way to set up a breakpoint at a specific address.

gef➤  break *0x804900c

Breakpoint 2 at 0x804900c

gef➤

The asterisk before the address is required: it tells GDB to treat what follows as a memory address rather than a function name or a line number. As we can observe in the preceding excerpt, breakpoint 2 is now set. Now let us continue executing this program by typing c or continue and observe what happens.

0x8049005 <_start+5>       mov    eax, 0xa

0x804900a <_start+10>      mov    ebx, eax

→  0x804900c <_start+12>      mov    ecx, DWORD PTR [esp]

0x804900f                  add    BYTE PTR [eax], al

0x8049011                  add    BYTE PTR [eax], al

0x8049013                  add    BYTE PTR [eax], al

0x8049015                  add    BYTE PTR [eax], al

0x8049017                  add    BYTE PTR [eax], al

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────

[#0] Id 1, Name: "mov", stopped 0x804900c in _start (), reason: BREAKPOINT

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── trace ────

[#0] 0x804900c → _start()

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

gef➤

As we can see in the threads section (reason: BREAKPOINT), the program stopped executing because it hit a breakpoint.

Listing all breakpoints

We can see all the breakpoints using the GDB command info breakpoints.

gef➤  info breakpoints

Num     Type           Disp Enb Address    What

1       breakpoint     keep y   0x08049000 <_start>

breakpoint already hit 1 time

2       breakpoint     keep y   0x0804900c <_start+12>

breakpoint already hit 1 time

gef➤

As we can observe in the output, there are currently two breakpoints, and both are enabled (the Enb column shows y). GDB also reports how many times each breakpoint has been hit. In our case, each breakpoint has been hit once.
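
The relationship between break, run, si, continue and the hit counters can be modelled in a few lines. The following Python sketch is a toy model only: the class ToyDebugger and its methods are hypothetical and have nothing to do with GDB's real implementation. The instruction addresses are the ones from our program.

```python
# Toy model of breakpoints and hit counters; not GDB's real implementation.
PROGRAM = [0x8049000, 0x8049005, 0x804900A, 0x804900C]  # instruction addresses


class ToyDebugger:
    def __init__(self, program):
        self.program = program
        self.index = 0   # index of the next instruction to execute
        self.hits = {}   # breakpoint address -> number of times hit

    def set_breakpoint(self, address):
        self.hits.setdefault(address, 0)

    def run(self):
        """Start from the first instruction and stop at the first breakpoint."""
        self.index = 0
        return self._advance_to_breakpoint()

    def step(self):
        """Like si: execute exactly one instruction."""
        self.index += 1

    def cont(self):
        """Like continue: execute the current instruction, then run to a breakpoint."""
        self.step()
        return self._advance_to_breakpoint()

    def _advance_to_breakpoint(self):
        while self.index < len(self.program):
            address = self.program[self.index]
            if address in self.hits:
                self.hits[address] += 1
                return address
            self.index += 1  # no breakpoint here, execute and move on
        return None          # ran past the end of the toy program


dbg = ToyDebugger(PROGRAM)
dbg.set_breakpoint(0x8049000)   # break _start
first_stop = dbg.run()          # run
dbg.step()                      # si
dbg.set_breakpoint(0x804900C)   # break *0x804900c
second_stop = dbg.cont()        # continue
for address, count in dbg.hits.items():
    print(f"0x{address:08x} hit {count} time(s)")
```

Replaying the same sequence of commands used in this article leaves each breakpoint with a hit count of one, matching the info breakpoints output.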

Let's see what happened to the EAX register now.

$eax   : 0xa

$ebx   : 0xa

$ecx   : 0x0

$edx   : 0x0

$esp   : 0xffffd220  →  0x00000001

$ebp   : 0x0

$esi   : 0x0

$edi   : 0x0

$eip   : 0x0804900c  →  <_start+12> mov ecx, DWORD PTR [esp]

The value 8 has been replaced with the new value 0xa. Additionally, the instruction mov ebx, eax has also been executed, resulting in the value 0xa in the register EBX. Now let's also execute the next instruction.

→  0x804900c <_start+12>      mov    ecx, DWORD PTR [esp]

This instruction moves the value pointed to by the ESP register into ECX. Let us examine the stack to better understand this.

0xffffd220│+0x0000: 0x00000001 ← $esp

0xffffd224│+0x0004: 0xffffd3d3  →  "/home/dev/x86/mov"

0xffffd228│+0x0008: 0x00000000

0xffffd22c│+0x000c: 0xffffd3e5  →  "SHELL=/bin/bash"

0xffffd230│+0x0010: 0xffffd3f5  →  "SESSION_MANAGER=local/x86-64:@/tmp/.ICE-unix/1721,[...]"

0xffffd234│+0x0014: 0xffffd447  →  "QT_ACCESSIBILITY=1"

0xffffd238│+0x0018: 0xffffd45a  →  "COLORTERM=truecolor"

0xffffd23c│+0x001c: 0xffffd46e  →  "XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg"

What is ESP pointing to? If you look at the stack, ESP points to the value at the top of the stack, which is 0x00000001 here. This value is going to be copied into ECX when the current instruction is executed.

Let's do a single step and see what happens. As mentioned earlier, a single step is done by typing si. If the previous command was si, we can just hit Enter instead of typing si again, and GDB will repeat the previous command.

Let's observe the value of ECX.

$eax   : 0xa

$ebx   : 0xa

$ecx   : 0x1   

$edx   : 0x0

$esp   : 0xffffd220  →  0x00000001

$ebp   : 0x0

$esi   : 0x0

$edi   : 0x0

$eip   : 0x0804900f  →   add BYTE PTR [eax], al

As expected, the value 0x1, which ESP was pointing to, has been copied to ECX.
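
The memory read performed by mov ecx, [esp] can be reproduced outside the debugger. The following Python sketch packs the first four stack entries from the GEF stack pane as little-endian 32-bit values and reads the one at the address held in ESP. The helper name read_dword is hypothetical. On Linux, the value at the top of the stack when a process starts is normally the argument count (argc), followed by a pointer to the program name, which fits the 1 and the "/home/dev/x86/mov" string seen above.

```python
import struct

ESP = 0xFFFFD220
# First four 32-bit stack entries from the GEF stack pane, packed little-endian.
stack_bytes = struct.pack("<4I", 0x00000001, 0xFFFFD3D3, 0x00000000, 0xFFFFD3E5)


def read_dword(address):
    """Read the 32-bit little-endian value stored at a stack address."""
    offset = address - ESP
    return struct.unpack_from("<I", stack_bytes, offset)[0]


ecx = read_dword(ESP)            # what mov ecx, [esp] loads into ECX
argv0_ptr = read_dword(ESP + 4)  # pointer to the program name string
print(hex(ecx), hex(argv0_ptr))
```

This prints 0x1 0xffffd3d3, the same two values shown at offsets +0x0000 and +0x0004 in the stack pane.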

Conclusion

As we have seen in this article, GDB is useful for debugging programs written in x86 assembly.

We used a GDB extension called GEF to simplify debugging. We have discussed some of the most common use cases of GDB that can come in handy when debugging ELF executables. We have covered setting up breakpoints by symbol and by address, single-stepping, examining the registers and the stack, the role of the EIP register and listing the available breakpoints.

Next, you'll learn how to use the ObjDump tool with x86.

Srinivas

Srinivas is an Information Security professional with 4 years of industry experience in Web, Mobile and Infrastructure Penetration Testing. He is currently a security researcher at Infosec Institute Inc. He holds the Offensive Security Certified Professional (OSCP) certification. He blogs at www.androidpentesting.com. Email: srini0x00@gmail.com