Stack analysis with GDB

April 30, 2013 by Dawid Czagan

This article describes the stack and uses GDB to analyze its memory. Understanding the stack is a prerequisite for working on low-level security.

Environment: x86, Linux, GCC, GDB.

Registers

The following registers are mentioned in the article:

  • ESP (points to the top of the stack)
  • EBP (is used as a reference when accessing local variables and arguments of the function)
  • EIP (holds the address of the next instruction to be executed)

Stack

When a function is called, the following items are pushed on the stack (in the order of appearance):

  • arguments of the function (in reverse order)
  • return address
  • current EBP

Then space for the local variables of the function is reserved on the stack.

Program

A simple program was written in C in order to analyze the stack. Two numbers are sent to the function, which adds them and returns their sum.

[c]
#include <stdio.h>

int add_numbers(int n1, int n2)
{
    int sum=n1+n2;
    return sum;
}

int main()
{
    int n1=1;
    int n2=2;
    int sum;

    sum=add_numbers(n1,n2);
    printf("The sum of 1 and 2 is %d\n",sum);

    return 0;
}
[/c]

Let’s compile the code.

[plain]
dawid@lab:~$ gcc -g stack_analysis.c
[/plain]

Assembly code

Let’s start GDB and analyze the assembly code of main() and add_numbers():

[plain]
dawid@lab:~$ gdb -q ./a.out
Reading symbols from /home/dawid/a.out…done.
(gdb) disass main
Dump of assembler code for function main:
0x080483fb <+0>: push ebp
0x080483fc <+1>: mov ebp,esp
0x080483fe <+3>: and esp,0xfffffff0
0x08048401 <+6>: sub esp,0x20
0x08048404 <+9>: mov DWORD PTR [esp+0x1c],0x1
0x0804840c <+17>: mov DWORD PTR [esp+0x18],0x2
0x08048414 <+25>: mov eax,DWORD PTR [esp+0x18]
0x08048418 <+29>: mov DWORD PTR [esp+0x4],eax
0x0804841c <+33>: mov eax,DWORD PTR [esp+0x1c]
0x08048420 <+37>: mov DWORD PTR [esp],eax
0x08048423 <+40>: call 0x80483e4 <add_numbers>
0x08048428 <+45>: mov DWORD PTR [esp+0x14],eax
0x0804842c <+49>: mov eax,0x8048510
0x08048431 <+54>: mov edx,DWORD PTR [esp+0x14]
0x08048435 <+58>: mov DWORD PTR [esp+0x4],edx
0x08048439 <+62>: mov DWORD PTR [esp],eax
0x0804843c <+65>: call 0x804831c <printf@plt>
0x08048441 <+70>: mov eax,0x0
0x08048446 <+75>: leave
0x08048447 <+76>: ret
End of assembler dump.
(gdb) disass add_numbers
Dump of assembler code for function add_numbers:
0x080483e4 <+0>: push ebp
0x080483e5 <+1>: mov ebp,esp
0x080483e7 <+3>: sub esp,0x10
0x080483ea <+6>: mov eax,DWORD PTR [ebp+0xc]
0x080483ed <+9>: mov edx,DWORD PTR [ebp+0x8]
0x080483f0 <+12>: lea eax,[edx+eax*1]
0x080483f3 <+15>: mov DWORD PTR [ebp-0x4],eax
0x080483f6 <+18>: mov eax,DWORD PTR [ebp-0x4]
0x080483f9 <+21>: leave
0x080483fa <+22>: ret
End of assembler dump.
[/plain]

This assembly code will be referenced in the article.

Breakpoints

Let’s add some breakpoints.

[c]
(gdb) list main
1 #include <stdio.h>
2
3 int add_numbers(int n1, int n2)
4 {
5 int sum=n1+n2;
6 return sum;
7 }
8
9 int main()
10 {
11 int n1=1;
12 int n2=2;
13 int sum;
14
15 sum=add_numbers(n1,n2);
16 printf("The sum of 1 and 2 is %d\n",sum);
17
18 return 0;
19 }
(gdb) break 15
Breakpoint 1 at 0x8048414: file stack_analysis.c, line 15.
(gdb) break add_numbers
Breakpoint 2 at 0x80483ea: file stack_analysis.c, line 5.
(gdb) break 6
Breakpoint 3 at 0x80483f6: file stack_analysis.c, line 6.
(gdb) break 16
Breakpoint 4 at 0x804842c: file stack_analysis.c, line 16.
[/c]

Breakpoint 1: set before the arguments of add_numbers() are placed on the stack.

Breakpoint 2: set after the prolog of add_numbers(). The prolog is:

[plain]
0x080483e4 <+0>: push ebp
0x080483e5 <+1>: mov ebp,esp
0x080483e7 <+3>: sub esp,0x10
[/plain]

Breakpoint 3: set before leaving add_numbers().

Breakpoint 4: set after leaving add_numbers().

Between breakpoints 3 and 4 the epilog of add_numbers() is executed. The epilog is:

[plain]
0x080483f9 <+21>: leave
0x080483fa <+22>: ret
[/plain]

Breakpoint 1 – analysis

Let’s run the program and analyze ESP, EBP and EIP.

[plain]
(gdb) run
Starting program: /home/dawid/a.out

Breakpoint 1, main () at stack_analysis.c:15
15 sum=add_numbers(n1,n2);
(gdb) i r esp ebp eip
esp 0xbffff420 0xbffff420
ebp 0xbffff448 0xbffff448
eip 0x8048414 0x8048414 <main+25>
[/plain]

ESP is smaller than EBP because the stack grows toward lower addresses. As can be seen in the assembly code, EIP points to the instruction that begins placing the second argument of add_numbers() on the stack.

Note that the instruction following the call to add_numbers() is at address 0x08048428 (see the assembly code). This is the return address.

Breakpoint 2 – analysis

Let’s continue the program and check ESP, EBP and EIP after the prolog of add_numbers(). Moreover, let’s analyze the memory starting from the top of the stack in the direction of higher addresses.

GDB-figure-8

As can be seen in the underlined memory, the following items have been pushed on the stack (in the order of appearance):

  • 0x00000002 (second argument of add_numbers())
  • 0x00000001 (first argument of add_numbers())
  • 0x08048428 (return address: the address of the next instruction after leaving add_numbers())
  • 0xbffff448 (current EBP, the one from main())

After the current EBP was pushed on the stack, ESP was copied to EBP. add_numbers() uses EBP as a reference when accessing its arguments and its local variable (see the assembly code).

EIP points to the address of the next instruction after the prolog of add_numbers().

Breakpoint 3 – analysis

Let’s continue the program and analyze the memory before leaving add_numbers().

GDB-figure-9

In the meantime, the sum of the arguments of add_numbers() has been calculated and stored on the stack in the local variable sum at [ebp-0x4] (underlined).

Breakpoint 4 – analysis

Let’s continue the program and analyze ESP, EBP, and EIP after leaving add_numbers().

[plain]
(gdb) cont
Continuing.

Breakpoint 4, main () at stack_analysis.c:16
16 printf("The sum of 1 and 2 is %d\n",sum);
(gdb) i r esp ebp eip
esp 0xbffff420 0xbffff420
ebp 0xbffff448 0xbffff448
eip 0x804842c 0x804842c <main+49>
[/plain]

In the meantime, the epilog of add_numbers() has been executed and control has returned to main(). The saved EBP has been popped off the stack, so EBP again holds the value it had before the call to add_numbers() (see breakpoint 1). The return address (0x08048428) has also been popped off the stack, and the instruction at that address has been executed. EIP points to the next instruction. ESP holds the same value as before the call to add_numbers() (see breakpoint 1).

Summary

This article described the stack and used GDB to analyze its memory. It is intended as an introductory text for those who want to study how buffer overflows work.

Dawid Czagan (@dawidczagan) has found security vulnerabilities in Google, Yahoo, Mozilla, Microsoft, Twitter, BlackBerry and other companies. Due to the severity of many bugs, he received numerous awards for his findings. Dawid is founder and CEO at Silesia Security Lab, which delivers specialized security auditing services with a results-driven approach. He also works as Security Architect at Future Processing. Dawid shares his bug hunting experience in his workshop entitled "Hacking web applications - case studies of award-winning bugs in Google, Yahoo, Mozilla and more". To find out about the latest in Dawid's work, you are invited to visit his blog (https://silesiasecuritylab.com/blog) and follow him on Twitter (@dawidczagan).

5 responses to “Stack analysis with GDB”

  1. Dawid Czagan says:

    Errata (I. source code in sections 4 and 6: there should be #include <stdio.h> instead of #include; II. the underlined items in section 8 are a memory analysis, not code)

  2. sam savicj says:

    Why did you use the command x/20xw $esp? Why not x/30xw $esp or x/50xw $esp?

    • kuch nahin ho sakta :( says:

      It totally depends on what you want to see.
      x/20xw $esp just means: show me (examine) 20 words (a word here is usually 4 bytes) starting at the location pointed to by ESP.
      You could just as well use x/30xw $esp or x/50xw $esp if you wish.
      The point is to make sense of what you see after this command.
