Secure coding

Introduction to printing and format strings

September 21, 2020 by Srinivas

This article provides an overview of how printing functions work and how format strings are used to format the data being printed. Developers often use print functions for a variety of reasons such as displaying data to the users and printing debug messages. While these print functions appear to be innocent, they can cause serious damage if proper care is not taken while using them.

Let us go over some print function concepts that are foundational to understanding print-related vulnerabilities such as format string vulnerabilities.

Learn Secure Coding

Build your secure coding skills in C/C++, iOS, Java, .NET, Node.js, PHP and other languages.

Format functions

We will look at how format strings are used with format functions later in the article. First, here is a list of commonly used format functions.

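  • fprintf - Writes formatted output to a file stream
  • printf - Writes formatted output to stdout
  • sprintf - Writes formatted output into a string
  • snprintf - Writes formatted output into a string, limited to a given buffer size
  • vfprintf - Writes formatted output to a file stream, taking the arguments as a va_list
  • vprintf - Writes formatted output to stdout, taking the arguments as a va_list
  • vsprintf - Writes formatted output into a string, taking the arguments as a va_list
  • vsnprintf - Writes formatted output into a string, limited to a given buffer size, taking the arguments as a va_list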

Understanding printf

To better understand format string vulnerabilities, let us first look at how the print family of functions works, taking the printf function in C as an example.

Let us begin by considering the following C program as an example. 

test1.c

#include <stdio.h>

int main(void) {
    int a = 100;
    float b = 2.3;
    int *c;

    c = &a;

    /* %p expects a void * argument, hence the cast */
    printf("%d, %f, %p\n", a, b, (void *)c);
    return 0;
}

The above C program contains one printf call with three format specifiers: %d, %f and %p.

When the printf function is executed with a format specifier, it prints data as specified by the format specifier. Let us take the following printf function as an example.

printf("%d", num);

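When this printf call is executed, it prints the value of the variable num as an integer, since %d is for signed integers in decimal. Note that printf knows nothing about the variable itself. It reads the argument from wherever the calling convention places it (the stack on 32-bit x86, registers first on x86-64) and interprets it according to the format specifier. Similarly, if the format specifier is changed to %x, the same value is printed in hexadecimal.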

Following are some of the commonly used format specifiers:

  • %d - used for signed integers in decimal
  • %f - used for floating-point values
  • %c - used for a single character
  • %s - used for printing the string stored at the address passed as the argument
  • %x - used for unsigned integers in hexadecimal
  • %p - used for pointers (addresses)

This should give a basic understanding of how format specifiers are used in the printf function when printing data. Now, let us look at what happens when the printf call in the test1.c program is executed.

When the printf function is executed in that program, the following events occur:

  1. The value of variable a (on the stack) replaces the format specifier %d, and the integer value 100 is printed.
  2. The value of variable b (on the stack) replaces the format specifier %f, and the float value 2.3 is printed.
  3. The value of variable c (on the stack) replaces the format specifier %p. Since c holds the address of variable a, that address is printed.

Let us test our theory by compiling and executing this program. The following gcc command can be used on a Linux machine to compile the program.

$ gcc test1.c -o test1

Let us run the program and we should see the following output.

$ ./test1

100, 2.300000, 0x7ffc721d3278 

$

As expected, a and b are printed as decimal and float values respectively, and c is printed as a pointer, which is the address of a.

Some format specifiers give the programmer granular control over the format of the data being printed. For example, the value of the variable b is displayed as 2.300000 because %f prints six digits after the decimal point by default. If we want to print 2.3 instead, we can add a precision to the format specifier for variable b, as shown in the following code snippet.

test2.c

#include <stdio.h>

int main(void) {
    int a = 100;
    float b = 2.3;
    int *c;

    c = &a;

    /* %1.1f: minimum width of 1, one digit after the decimal point */
    printf("%d, %1.1f, %p\n", a, b, (void *)c);
    return 0;
}

Let us compile the program using the following gcc command.

$ gcc test2.c -o test2

Running the output binary test2 produces the float value 2.3 instead of 2.300000.

$ ./test2

100, 2.3, 0x7ffedefab588

$

Printing hexadecimal instead of decimal

Variable a contains the value 100, which is a decimal value. Format specifiers also let us print its hexadecimal equivalent without explicitly converting it in the program. The following example shows how 0x64 is printed instead of decimal 100 just by changing the format specifier from %d to %#x. (Passing the int to %p also prints 0x64 on many Linux systems, but %p expects a void * argument, so doing that with an int is undefined behavior.)

test3.c

#include <stdio.h>

int main(void) {
    int a = 100;
    float b = 2.3;
    int *c;

    c = &a;

    /* %#x prints an unsigned int in hexadecimal with a 0x prefix */
    printf("%#x, %1.1f, %p\n", (unsigned int)a, b, (void *)c);
    return 0;
}

Let us compile the program using the following gcc command.

$ gcc test3.c -o test3

Running the output binary test3 prints the hexadecimal value 0x64 instead of decimal 100.

$ ./test3

0x64, 2.3, 0x7ffc50a3dae8  

$

Printing strings and their addresses

So far, we have explored how numbers are printed using printf. Now let us add two string variables to our program and understand how strings are printed using the printf function. Following is the program.

test4.c

#include <stdio.h>

int main(void) {
    int a = 100;
    float b = 2.3;
    int *c;

    c = &a;

    char d[] = "demo";
    char *e = d;  /* e points to the first character of d */

    printf("%d, %1.1f, %p, %s, %s\n", a, b, (void *)c, d, e);
    return 0;
}

 

The preceding program contains a character array d, which holds the string demo. We then create another variable named e, which is a pointer to the first character of that array. Both variables therefore refer to the same string and print the same value. As we can see in the printf statement, %s is used as the format specifier to print these string values. Let us compile the program using the following gcc command.

$ gcc test4.c -o test4

Run the output binary test4 and we should see the string values being printed.

$ ./test4

100, 2.3, 0x7ffcbc0692f8, demo, demo 

$

It is also possible to print the addresses of these strings just by changing the format specifier from %s to %p. We can update the C program to print these pointers as shown below.

test5.c

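#include <stdio.h>

int main(void) {
    int a = 100;
    float b = 2.3;
    int *c;

    c = &a;

    char d[] = "demo";
    char *e = d;  /* e points to the first character of d */

    /* %p expects void * arguments, hence the casts */
    printf("%d, %1.1f, %p, %p, %p\n", a, b, (void *)c, (void *)d, (void *)e);
    return 0;
}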

Compile the program using the following gcc command.

$ gcc test5.c -o test5

Run the output binary test5 and we should see the addresses of the two string variables d and e.

$ ./test5

100, 2.3, 0x7ffc21fc8fa8, 0x7ffc21fc8fc3, 0x7ffc21fc8fc3

$

As we can see, the last two addresses are the same, because e points to the first element of the array d.

Conclusion

This article has explained how format strings are used to format the data being printed by making use of format specifiers within the printf function. Understanding the printf function is a foundation for understanding the format string class of vulnerabilities, which will be discussed in upcoming articles.

 

Srinivas

Srinivas is an Information Security professional with 4 years of industry experience in Web, Mobile and Infrastructure Penetration Testing. He is currently a security researcher at Infosec Institute Inc. He holds the Offensive Security Certified Professional (OSCP) certification. He blogs at www.androidpentesting.com. Email: srini0x00@gmail.com