What are Race Conditions?
Introduction to Race Conditions
Parallelized applications have a number of different benefits. They’re able to run faster, support multiple different users, and better make use of the resources available with multi-core processors.
However, the use of parallel processing can also create potential security issues. With a multi-threaded application or one where multiple instances of an application may share some state, there is the potential for race condition vulnerabilities.
The Dining Philosophers Problem
The Dining Philosophers Problem is a thought experiment in computer science. It is intended to demonstrate the potential for race conditions and deadlock within a parallelized application.
In the Dining Philosophers, there are five philosophers eating together as shown in the image above. Each of these philosophers shares a utensil with each of their neighbors at the table, and a philosopher must control both of the adjacent utensils in order to each (think chopsticks).
The problem is that a philosopher with control over a utensil will not relinquish it until they are done eating. This creates the potential for deadlock because it is possible to reach a state where no philosopher at the table is capable of eating. For example, every philosopher may have control over the utensil to their left but cannot eat without access to the utensil on their right (which is controlled by a different philosopher and won’t be given up).
A solution to the Dining Philosophers Problem is an algorithm that ensures that the system cannot fall into deadlock. Doing so requires making the system “thread safe” or secure against race condition vulnerabilities.
Types of Race Conditions
A race condition is anything within an application where the order in which instructions are executed impacts the result. For example, say that thread 1 has instructions A and B and thread 2 has instructions C and D.
While these instructions may be run one after another within a thread, the fact that the application is parallelized means that two adjacent instructions could be separated by one or more instructions from a different threat. If, for example, the sequences A, B, C, D and A, C, B, D produce different end results, then the threads contain a race condition.
These types of race conditions can occur for a number of different reasons within an application. Two examples of common race condition vulnerabilities include time-of-check time-of-use and signal handler race conditions.
A time-of-check time-of-use (TOCTOU) race condition can exist in applications that test a particular condition and then act upon it. For example, an application may test to see if a particular file exists and then attempt to read from or write to it.
While this is best practice from an error-handling perspective, it can create a race condition in a parallelized application. If another thread or application manages to delete the file between the check and the use, then the application will be attempting to read from or write to a file that no longer exists.
An example of a TOCTOU race condition is shown in the image above. This image includes a function called popElement which is designed to check if a vector contains any elements, and, if so, pop one and then remove it from the vector.
The issue here is that, if the application is multi-threaded, it is possible that a vector containing elements at the if statement may be empty at the v.back or v.pop_back statements if another thread of execution popped an element between the if and later statements. This can cause out of bounds errors with the vector.
Applications are generally designed to run synchronously. This means that, after one operation is performed, the next one is predictable (i.e. the next operation in the file or the target of a jump).
However, it is possible to have asynchronous code within an application. For example, signal handlers are designed to watch for particular events to occur, and, if they do, to run certain code. Since these events are unpredictable, it isn’t possible to predict when the signal handling code will run either.
The image above shows an example of code containing signal handling code. If certain signals are raised (SIGHUP or SIGTERM), then the sh function is called. This code frees some global variables, causes the application to sleep, then exits.
This code becomes a problem if both of the signals in question are raised. If this occurs, then the variables freed in the sh function are freed twice, creating a double-free vulnerability.
Safely Implementing Parallelized Applications
The example code shown in this article contains race condition vulnerabilities because it unsafely implements parallel processing. The code assumes that certain operations will be atomic (i.e. have adjacent lines of code run without interruption) and that only one of a set of asynchronous code blocks will be run.
Fixing these vulnerabilities requires the applications to be designed to be threadsafe. This requires either eliminating these assumptions or enforcing them using mutexes or atomic operations.