Databases and the structured query language

A database is designed to store data in an understandable and easily accessible way. Data is organized into tables, rows and columns, making it easy to find a particular value within a database if you know where to look.

The structured query language (SQL) is designed to make data within a database easy to access and modify. SQL queries describe where to look for a particular piece of data, the parameters that identify the particular pieces of data that the user wants and what to do with the data once it is found.

The dangers of untrusted input in SQL

While SQL is a very useful tool, it can also be dangerous. The reason for this is that, in many contexts, an SQL query is designed to use untrusted data provided by a user. For example, a website may store a user’s account data in a database. To access their data, the user provides a username and password, which are compared to a copy stored within the database. If the credentials match, then the user can access other information stored in the database for their account.

The use of untrusted data in SQL queries is dangerous because of how SQL queries are structured and how they are commonly used. Like many languages, SQL queries are designed to intermingle commands and data. For example, the above code sample includes the command SELECT * FROM cust WHERE id=X is an SQL command looking for a record with a particular user identifier. Everything but the value X is provided by the developer to describe the desired action, while a user provides the id value of X to point to their particular record.

This becomes a problem if a user provides an input designed to change the meaning of the command. Consider the case where the user provides an id of 1 OR 1=1. Since SQL commands are often built using string concatenation, this would result in an SQL query of SELECT * FROM user_data WHERE id=1 OR 1=1.

This modified user input changes the meaning of the command completely. The statement 1=1 is a tautology, meaning that it is always true. In SQL, combining any specifier (such as id=1) with a tautology results in a statement that will match every record in the database (since 1=1 is always true). If a malicious user can enter input that changes the meaning of a command, then the confidentiality, integrity and availability of the data in the database is at risk.

Protecting against SQL injection attacks

SQL injection vulnerabilities can put an organization’s systems and the sensitive data of its users at risk. A few options exist for minimizing the probability and impact of an SQL injection vulnerability, including limiting the set of characters that a user’s input can contain and taking advantage of parameterized queries.

Limiting character sets

In SQL, a simple way to separate data and code is to use quotes to surround user-provided data. However, this approach is easily defeated if a malicious user provides an input that results in a properly quoted SQL query.

Developers can protect their applications against this type of attack by using input sanitization to limit the character sets allowed in user input. This can be done by either explicitly allowing certain characters (such as alphanumeric characters) or blocking ones that can be potentially misused (such as single and double quotes).

However, this approach to mitigating SQL injection vulnerabilities is not a perfect solution. Explicitly allowing or blocking certain characters only works if the developer covers all potential exploitation techniques. An oversight may enable an attacker to bypass the protections provided by a blocklist.

Input sanitization also has the problem of false positive detections. The characters that are “dangerous” in SQL may be part of a valid input. For example, blocking single quotation marks (important if using them to delineate SQL commands from data) could make it impossible to access a user record with a name of O’Connor.

Using parameterized queries

SQL injection vulnerabilities exist because most applications build SQL queries through string concatenation. SQL commands and data are intermingled into a single string, which is then sent to the database by the SQL client.

However, this is not the only way to build SQL queries. Many SQL client libraries have the option for parameterized queries, as shown in the code sample above. In this query, the position of the user-provided data is marked in the query string, but the data is sent as a separate argument. This eliminates the potential that part of the data could be interpreted as a command, eliminating the potential for an SQL injection vulnerability.

Using SQL safely

SQL is a useful tool, but it can also be easily exploited. If an SQL query is vulnerable to an injection attack, then an attacker can exploit this vulnerability to read, modify or delete the database.

When using SQL within an application, it is important to do so securely. This means using parameterized queries when possible and performing strong input sanitization if parameterization isn’t possible.

 

Sources

  1. Introduction to SQL, w3schools
  2. MySQL-python 1.2.5, pypi.org
  3. Python MySQL Execute Parameterized Query using Prepared Statement, PYnative

Be Safe

Section Guide

Howard
Poston

View more articles from Howard

As you grow in your cybersecurity career, Infosec Skills is the platform to ensure your skills are scaled to outsmart the latest cyber threats.

Section Guide

Howard
Poston

View more articles from Howard