How to exploit Buffer Overflow

August 31, 2020 by Srinivas

This article provides an overview of buffer overflow vulnerabilities and how they can be exploited. Buffer overflows can occur in programs written in several programming languages.

Although other languages are susceptible to buffer overflows, C and C++ are the most common targets for this class of attack because they do not check array bounds automatically. In this article, we’ll explore some of the causes of buffer overflows and how someone can abuse them to take control of a vulnerable program.

What is buffer overflow?

A buffer overflow is a class of vulnerability that typically arises from the use of functions that do not perform bounds checking. Put simply, it occurs when more data is written into a fixed-length buffer than the buffer can hold.

This is best explained with an example, so consider the following program.

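#include <stdio.h>
#include <string.h>

void vuln_func(char *input);

int main(int argc, char *argv[])
{
    /* Pass the first command-line argument, if any, to the vulnerable function. */
    if (argc > 1)
        vuln_func(argv[1]);
}

void vuln_func(char *input)
{
    char buffer[256];

    /* Intentionally unsafe: strcpy copies up to the NUL terminator with no length check. */
    strcpy(buffer, input);
}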

This is a simple C program that is vulnerable to a buffer overflow. It has a function named vuln_func, which receives a command-line argument through its parameter input. That argument is then copied into a local variable called buffer, which is a character array with a length of 256.

However, the copy is performed with the strcpy function, which does no bounds checking: it keeps copying until it reaches the terminating null byte of the source string. We can therefore write more than 256 characters into buffer, and a buffer overflow occurs. Because buffer is a local variable, it is stored on the stack. If the overflow reaches the saved return address of this function, we will be able to control the flow of the entire program. That's the reason why this is called a stack-based buffer overflow.
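For contrast, here is a minimal sketch of how the same copy could be written with an explicit length check. The helper name safe_copy and its return convention are assumptions made for illustration; they are not part of the original program.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not in the original program): copy src into dst
 * only if it fits, including the terminating NUL byte.
 * Returns 0 on success and -1 if src is too long or an argument is invalid.
 */
int safe_copy(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    size_t len = strlen(src);
    if (len >= dst_size)
        return -1;              /* would overflow: refuse instead of truncating */

    memcpy(dst, src, len + 1);  /* len + 1 also copies the NUL terminator */
    return 0;
}
```

Inside vuln_func, calling safe_copy(buffer, sizeof buffer, input) and returning early on -1 would reject an oversized argument rather than writing past the end of buffer.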

Types of buffer overflow

We have just discussed an example of stack-based buffer overflow. However, a buffer overflow is not limited to the stack. The following are some of the common buffer overflow types.

Stack-based buffer overflow

When the buffer being overflowed with user-supplied data is stored on the stack, the vulnerability is referred to as a stack-based buffer overflow. As mentioned earlier, a stack-based buffer overflow vulnerability can be exploited by overwriting the return address of a function on the stack.

Heap-based buffer overflow

When the buffer being overflowed with user-supplied data is stored in the heap data area, the vulnerability is referred to as a heap-based buffer overflow. Heap overflows are generally harder to exploit than stack overflows. Successful exploitation relies on various factors, because there is no return address to overwrite as there is in the stack-based technique. Instead, the overflowing data often overwrites other data on the heap to manipulate the program's state in an unexpected manner.
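The idea of overwriting neighboring heap data can be sketched as follows. This is a simulation under stated assumptions, not a real exploit: the struct account, its fields and the function name are hypothetical, and the copy is deliberately clamped to the size of the allocation so that the demonstration itself has well-defined behavior.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: a name field followed directly by a privilege flag. */
struct account {
    char name[16];
    int  is_admin;
};

/*
 * Simulate a heap overflow: copy `len` raw bytes over a heap-allocated
 * account, starting at the name field. A real bug would come from an
 * unchecked copy into `name`; here the copy is clamped to the size of the
 * whole struct so the demonstration stays inside the allocation.
 * Returns the resulting is_admin value, or -1 if allocation fails.
 */
int simulate_heap_overwrite(const char *input, size_t len)
{
    struct account *acct = calloc(1, sizeof *acct);  /* is_admin starts at 0 */
    if (acct == NULL)
        return -1;

    if (len > sizeof *acct)
        len = sizeof *acct;
    memcpy(acct, input, len);   /* bytes beyond name[15] land in is_admin */

    int flag = acct->is_admin;
    free(acct);
    return flag;
}
```

With a short input the flag stays 0, while an input longer than 16 bytes changes is_admin even though the program never assigned to it. No return address is involved; the attacker's leverage comes entirely from what happens to sit next to the buffer.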

Understanding debuggers

Understanding how to use debuggers is a crucial part of exploiting buffer overflows. When writing buffer overflow exploits, we often need to understand the stack layout, memory maps, instruction mnemonics, CPU registers and so on. A debugger can help with dissecting these details for us during the debugging process.

In the Windows environment, OllyDBG and Immunity Debugger are freely available debuggers. GNU Debugger (GDB) is the most commonly used debugger in the Linux environment.

Exploit mitigation techniques

To be able to exploit a buffer overflow vulnerability on a modern operating system, we often need to deal with various exploit mitigation techniques such as stack canaries, data execution prevention, address space layout randomization and more. To keep it simple, let’s proceed with disabling all these protections.

For the purposes of understanding buffer overflow basics, let’s look at a stack-based buffer overflow.

Crashing and analyzing core dumps

In this section, let's explore how one can crash the vulnerable program to be able to write an exploit later. The following makefile can be used to compile this program with all the exploit mitigation techniques disabled in the binary.

all:
	gcc -fno-stack-protector vulnerable.c -o vulnerable -z execstack -D_FORTIFY_SOURCE=0

clean:
	rm vulnerable

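We are simply using gcc, passing the source file vulnerable.c as input and producing the binary vulnerable as output. The -fno-stack-protector flag disables stack canaries, -z execstack marks the stack as executable and -D_FORTIFY_SOURCE=0 turns off glibc's fortified versions of functions such as strcpy.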

Let’s disable ASLR by writing the value 0 into the file /proc/sys/kernel/randomize_va_space. This looks like the following:

sudo bash -c "echo 0 > /proc/sys/kernel/randomize_va_space"

Now we are fully ready to exploit this vulnerable program.

Let's compile it and produce the executable binary. To do this, run the command make and it should create a new binary for us.

$ make

We should have a new binary in the current directory. Let’s run the file command against the binary and observe the details.

$ file vulnerable

vulnerable: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=9e7fbfc60186b8adfb5cab10496506bb13ae7b0a, for GNU/Linux 3.2.0, not stripped

$

As we can see, it's a 64-bit ELF binary. Let’s run the binary with an argument.

$ ./vulnerable test

Nothing happens. This is intentional: it doesn't do anything apart from taking input and then copying it into another variable using the strcpy function. 

Crashing the program

Now let's see how we can crash this application. We're going to create a simple Perl script that we can also use as a template for the rest of the exploit.

Let's create a file called exploit1.pl that stores 300 "A" characters in a variable and prints them. We will use these 300 characters in our attempt to crash the application.

exploit1.pl

#!/usr/bin/perl
$| = 1;              # disable output buffering
$junk = "A" x 300;   # 300 "A" characters, more than the 256-byte buffer holds
print $junk;

Let us also ensure that the file has executable permissions.  

chmod +x exploit1.pl

Now, let's write the output of this file into a file called payload1. 

$ ./exploit1.pl > payload1

Let’s simply run the vulnerable program and pass the contents of payload1 as input to the program. 

$ ./vulnerable $(cat payload1)

Segmentation fault (core dumped)

$

As you can see, there is a segmentation fault and the application crashes. Now let's type ls and check if there are any core dumps available in the current directory.

$ ./vulnerable $(cat payload1)
Segmentation fault (core dumped)
$ ls
exploit1.pl  Makefile  payload1  vulnerable  vulnerable.c

Notice that there is no crash dump in the current directory: no new files were created as a result of the segmentation fault. Let’s enable core dumps so we can understand what caused the segmentation fault.

$ ulimit -c unlimited

 

This should enable core dumps. Now, let’s crash the application again using the same command that we used earlier. Type ls once again and you should see a new file called core.

$ ./vulnerable $(cat payload1)
Segmentation fault (core dumped)
$ ls
core  exploit1.pl  Makefile  payload1  vulnerable*  vulnerable.c

This file is a core dump, which captures the state of the program at the time of the crash. We can use this core file to analyze the crash. Let’s see how we can analyze the core file using gdb.

$ gdb -q -core core 

GEF for linux ready, type `gef' to start, `gef config' to configure

75 commands loaded for GDB 9.1 using Python engine 3.8

[*] 5 commands could not be loaded, run `gef missing` to know why.

[New LWP 34966]

[!] './vulnerable AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA' not found/readable

[!] Failed to get file debug information, most of gef features will not work

Core was generated by `./vulnerable AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'.

Program terminated with signal SIGSEGV, Segmentation fault.

#0  0x00005555555551ad in ?? ()

gef➤ 

 

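This gdb output shows where the program stopped. RIP, the register that holds the address of the next instruction to be executed, contains 0x00005555555551ad.

This address lies in the program's own code, not in our input. As the disassembly later in this article shows, it is the ret instruction at the end of vuln_func. On x86-64, a ret that tries to return to a non-canonical address such as 0x4141414141414141 faults before RIP is updated, so the crash is reported at the ret instruction itself. That's the reason why the application crashed: our long input overwrote the saved return address on the stack. As I mentioned earlier, we can use this core dump to analyze the crash. We can also type info registers to see what value each register was holding at the time of the crash.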

gef➤  info registers

rax            0x7fffffffdd60      0x7fffffffdd60

rbx            0x5555555551b0      0x5555555551b0

rcx            0x80008             0x80008

rdx            0x414141            0x414141

rsi            0x7fffffffe3e0      0x7fffffffe3e0

rdi            0x7fffffffde89      0x7fffffffde89

rbp            0x4141414141414141  0x4141414141414141

rsp            0x7fffffffde68      0x7fffffffde68

r8             0x0                 0x0

r9             0x7ffff7fe0d50      0x7ffff7fe0d50

r10            0x0                 0x0

r11            0x0                 0x0

r12            0x555555555060      0x555555555060

r13            0x7fffffffdf70      0x7fffffffdf70

r14            0x0                 0x0

r15            0x0                 0x0

rip            0x5555555551ad      0x5555555551ad

eflags         0x10246             [ PF ZF IF RF ]

cs             0x33                0x33

ss             0x2b                0x2b

ds             0x0                 0x0

es             0x0                 0x0

fs             0x0                 0x0

gs             0x0                 0x0

gef➤ 

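As the output shows, RIP still holds 0x00005555555551ad, the address of the ret instruction, while the RBP register contains eight As (0x4141414141414141) from our junk input. The saved base pointer on the stack was overwritten by our input and then loaded into RBP by the leave instruction. This is how core dumps can be used.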
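One detail worth spelling out is why the As show up in RBP but never in RIP. On x86-64, the CPU refuses to transfer control to a non-canonical address, so a ret toward 0x4141414141414141 faults before RIP changes. The following sketch checks whether a value is canonical. It assumes 48-bit virtual addressing, which is an assumption about the machine rather than something stated in this article, and the function name is_canonical48 is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumes 48-bit virtual addressing (CPUs with 5-level paging use 57 bits).
 * An address is canonical when bits 63..47 are all 0 or all 1.
 */
bool is_canonical48(uint64_t addr)
{
    uint64_t top = addr >> 47;      /* the upper 17 bits */
    return top == 0 || top == 0x1FFFF;
}
```

Under that assumption, the code address 0x00005555555551ad and the stack address 0x00007fffffffde68 from the register dump both pass the check, while 0x4141414141414141 does not. This is also why 64-bit exploits usually need a return address with zero bytes in its upper part.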

Let's run the program itself in gdb by typing gdb ./vulnerable and disassemble main using disass main.

gef➤  disass main

Dump of assembler code for function main:

   0x0000000000001149 <+0>: endbr64 

   0x000000000000114d <+4>: push   rbp

   0x000000000000114e <+5>: mov    rbp,rsp

   0x0000000000001151 <+8>: sub    rsp,0x10

   0x0000000000001155 <+12>: mov    DWORD PTR [rbp-0x4],edi

   0x0000000000001158 <+15>: mov    QWORD PTR [rbp-0x10],rsi

   0x000000000000115c <+19>: cmp    DWORD PTR [rbp-0x4],0x1

   0x0000000000001160 <+23>: jle    0x1175 <main+44>

   0x0000000000001162 <+25>: mov    rax,QWORD PTR [rbp-0x10]

   0x0000000000001166 <+29>: add    rax,0x8

   0x000000000000116a <+33>: mov    rax,QWORD PTR [rax]

   0x000000000000116d <+36>: mov    rdi,rax

   0x0000000000001170 <+39>: call   0x117c <vuln_func>

   0x0000000000001175 <+44>: mov    eax,0x0

   0x000000000000117a <+49>: leave  

   0x000000000000117b <+50>: ret    

End of assembler dump.

gef➤  

This is the disassembly of our main function. Notice that main calls a function named vuln_func. Let us disassemble that using disass vuln_func.

gef➤  disass vuln_func

Dump of assembler code for function vuln_func:

   0x000000000000117c <+0>: endbr64 

   0x0000000000001180 <+4>: push   rbp

   0x0000000000001181 <+5>: mov    rbp,rsp

   0x0000000000001184 <+8>: sub    rsp,0x110

   0x000000000000118b <+15>: mov    QWORD PTR [rbp-0x108],rdi

   0x0000000000001192 <+22>: mov    rdx,QWORD PTR [rbp-0x108]

   0x0000000000001199 <+29>: lea    rax,[rbp-0x100]

   0x00000000000011a0 <+36>: mov    rsi,rdx

   0x00000000000011a3 <+39>: mov    rdi,rax

   0x00000000000011a6 <+42>: call   0x1050 <strcpy@plt>

   0x00000000000011ab <+47>: nop

   0x00000000000011ac <+48>: leave  

   0x00000000000011ad <+49>: ret    

End of assembler dump.

gef➤ 

In the disassembly of vuln_func, notice the call to strcpy@plt.
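The disassembly also hints at the stack layout. The instruction lea rax,[rbp-0x100] loads the destination passed to strcpy, so buffer appears to start 0x100 (256) bytes below RBP. In a conventional x86-64 frame, the saved RBP sits at [rbp] and the return address at [rbp+8]. The arithmetic can be sketched as below; the function names are hypothetical and the frame layout is an assumption based on the push rbp / mov rbp,rsp prologue shown above.

```c
#include <stddef.h>

/* Size of one saved register slot on x86-64. */
enum { SLOT_SIZE = 8 };

/* Bytes of input that precede the saved RBP slot. */
size_t offset_to_saved_rbp(size_t buffer_to_rbp)
{
    return buffer_to_rbp;
}

/*
 * Bytes of input that precede the saved return address:
 * the distance from the buffer to RBP plus the saved RBP slot itself.
 */
size_t offset_to_return_address(size_t buffer_to_rbp)
{
    return buffer_to_rbp + SLOT_SIZE;
}
```

For 0x100 this gives 256 and 264. A 300-byte input would then leave 300 - 264 = 36 bytes from the return address onward, which is consistent with the 36 As that GEF shows at $rsp when the program is run under gdb below. A debugger session is still the reliable way to confirm the offsets.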

Now run the program by passing the contents of payload1 as input.

gef➤  r  $(cat payload1)

Starting program: /home/dev/x86_64/simple_bof/vulnerable $(cat payload1)

Program received signal SIGSEGV, Segmentation fault.

0x00005555555551ad in vuln_func ()

[ Legend: Modified register | Code | Heap | Stack | String ]

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── registers ────

$rax   : 0x00007fffffffdd00  →  "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"

$rbx   : 0x00005555555551b0  →  <__libc_csu_init+0> endbr64 

$rcx   : 0x20000           

$rdx   : 0x11              

$rsp   : 0x00007fffffffde08  →  "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"

$rbp   : 0x4141414141414141 ("AAAAAAAA"?)

$rsi   : 0x00007fffffffe3a0  →  "AAAAAAAAAAAAAAAAA"

$rdi   : 0x00007fffffffde1b  →  "AAAAAAAAAAAAAAAAA"

$rip   : 0x00005555555551ad  →  <vuln_func+49> ret 

$r8    : 0x0               

$r9    : 0x00007ffff7fe0d50  →   endbr64 

$r10   : 0x0               

$r11   : 0x0               

$r12   : 0x0000555555555060  →  <_start+0> endbr64 

$r13   : 0x00007fffffffdf10  →  0x0000000000000002

$r14   : 0x0               

$r15   : 0x0               

$eflags: [zero carry parity adjust sign trap INTERRUPT direction overflow RESUME virtualx86 identification]

$cs: 0x0033 $ss: 0x002b $ds: 0x0000 $es: 0x0000 $fs: 0x0000 $gs: 0x0000 

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── stack ────

0x00007fffffffde08│+0x0000: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" ← $rsp

0x00007fffffffde10│+0x0008: "AAAAAAAAAAAAAAAAAAAAAAAAAAAA"

0x00007fffffffde18│+0x0010: "AAAAAAAAAAAAAAAAAAAA"

0x00007fffffffde20│+0x0018: "AAAAAAAAAAAA"

0x00007fffffffde28│+0x0020: 0x00007f0041414141 ("AAAA"?)

0x00007fffffffde30│+0x0028: 0x00007ffff7ffc620  →  0x0005042c00000000

0x00007fffffffde38│+0x0030: 0x00007fffffffdf18  →  0x00007fffffffe25a  →  "/home/dev/x86_64/simple_bof/vulnerable"

0x00007fffffffde40│+0x0038: 0x0000000200000000

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── code:x86:64 ────

   0x5555555551a6 <vuln_func+42>   call   0x555555555050 <strcpy@plt>

   0x5555555551ab <vuln_func+47>   nop    

   0x5555555551ac <vuln_func+48>   leave  

 → 0x5555555551ad <vuln_func+49>   ret    

[!] Cannot disassemble from $PC

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────

[#0] Id 1, Name: "vulnerable", stopped 0x5555555551ad in vuln_func (), reason: SIGSEGV

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── trace ────

[#0] 0x5555555551ad → vuln_func()

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

gef➤  

In the current environment, a GDB extension called GEF is installed. It displays the registers, the stack and the surrounding code each time the program stops, much like a debugger with a GUI.

This output matches what we have already seen in the core dump: eight As have overwritten RBP. However, we passed 300 As, and this output alone does not tell us which eight of them ended up in the RBP register.

When exploiting buffer overflows, being able to crash the application is the first step in the process. Using this knowledge, an attacker can work out the exact offset required to overwrite the saved return address, and with it RIP, to control the flow of the program.
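A common way to find such offsets is to replace the run of identical As with a non-repeating pattern, so that whatever lands in a register identifies its own position in the input. The sketch below is one possible C implementation of that idea; the function names pattern_create and pattern_offset are chosen for illustration and are not taken from this article.

```c
#include <stddef.h>
#include <string.h>

/*
 * Fill out[0..len-1] with the non-repeating sequence "Aa0Aa1Aa2...Ab0Ab1..."
 * and NUL-terminate it. `out` must have room for len + 1 bytes. The sequence
 * runs out after 26 * 26 * 10 * 3 = 20280 characters, so longer requests
 * yield a shorter string.
 */
void pattern_create(char *out, size_t len)
{
    static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static const char lower[] = "abcdefghijklmnopqrstuvwxyz";
    static const char digit[] = "0123456789";
    size_t i = 0;

    for (size_t u = 0; u < 26 && i < len; u++)
        for (size_t l = 0; l < 26 && i < len; l++)
            for (size_t d = 0; d < 10 && i < len; d++) {
                out[i++] = upper[u];
                if (i < len) out[i++] = lower[l];
                if (i < len) out[i++] = digit[d];
            }
    out[i] = '\0';
}

/*
 * Return the position of `fragment` inside `pattern`, or -1 if it is absent.
 * `fragment` is the text recovered from a register, in memory order.
 */
long pattern_offset(const char *pattern, const char *fragment)
{
    const char *hit = strstr(pattern, fragment);
    return hit ? (long)(hit - pattern) : -1;
}
```

The pattern would be passed to the program in place of the 300 As, and the eight characters found in RBP after the crash would be looked up with pattern_offset. Because x86-64 is little-endian, the bytes of a register printed as a hexadecimal number appear in reverse order and must be flipped before the lookup.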

Conclusion

In this article, we discussed what buffer overflow vulnerabilities are, their types and how they can be exploited. We also analyzed a vulnerable application to understand how crashing an application generates core dumps, which will in turn be helpful in developing a working exploit. In the next article, we will discuss how we can use this knowledge to exploit a buffer overflow vulnerability.

Srinivas

Srinivas is an Information Security professional with 4 years of industry experience in Web, Mobile and Infrastructure Penetration Testing. He is currently a security researcher at Infosec Institute Inc. He holds the Offensive Security Certified Professional (OSCP) certification. He blogs at www.androidpentesting.com. Email: srini0x00@gmail.com