Saturday, 9 November 2013

SQL Injection In-Depth: Attacks and Prevention Methods

SQL injection attacks are a type of injection attack, in which SQL commands are injected into data-plane input in order to effect the execution of predefined SQL commands (OWASP, 2012). SQL injection attacks pose a serious security threat to Web applications: they allow attackers to obtain unrestricted access to the databases underlying the applications and to the potentially sensitive information these databases contain (Halfond, Viegas, & Alessandro , 2006).
An SQL injection attack consists of inserting or "injecting" some data into an SQL query via the input data from the client to the application. A successful SQL injection exploit can:
  • Read sensitive data from the database
  • Modify database data (Insert/Update/Delete)
  • Execute administration operations on the database (such as shutdown the DBMS)
  • Recover the content of a given file present on the DBMS file system
  • In some cases issue commands to the operating system.
SQL injection errors occur when:
  1. Data enters a program from an untrusted source.
  2. Queries are dynamically constructed
The main consequences are:
  • Confidentiality: Since SQL databases generally hold sensitive data, loss of confidentiality is a frequent problem with SQL Injection vulnerabilities.
  • Authentication: If poor SQL commands are used to check user names and passwords, it may be possible to connect to a system as another user with no previous knowledge of the password.
  • Authorization: If authorization information is held in a SQL database, it may be possible to change this information through the successful exploitation of SQL Injection vulnerabilities.
  • Integrity: Just as it may be possible to read sensitive information, it is also possible to make changes or even delete this information with a SQL Injection attack.

Types of SQL Injection Attacks
In this section, we present and discuss the different kinds of SQL Injection Attacks. The different types of attacks are generally not performed in isolation; many of them are used together or sequentially, depending on the specific goals of the attacker. Note also that there are countless variations of each attack type.
Tautologies
Attack Intent: Bypassing authentication; identifying injectable parameters; extracting data.
Description: The general goal of a tautology-based attack is to inject code in one or more conditional statements so that they always evaluate to true. The most common usages are to bypass authentication pages and extract data. In this type of injection, an attacker exploits an injectable field that is used in a query’s WHERE conditional.
Transforming the conditional into a tautology causes all of the rows in the database table targeted by the query to be returned. In general, for a tautology-based attack to work, an attacker must consider not only the injectable/vulnerable parameters, but also the coding constructs that evaluate the query results. (Halfond, Viegas, & Alessandro , 2006)
Example 1: Bypassing login script.
Query: SELECT name from authors where username = '$_POST[username]’ AND password=’$_POST[password]’;
This query take input from the system user; suppose the user enters:
Username: a’ OR ‘1=1’
Password: a’ OR ‘1=1’
Constructed query: SELECT name from authors where username = ‘a’ OR ‘1=1’ AND password=’a’ OR ‘1=1’
The code injected in the conditional (OR 1=1) transforms the entire WHERE clause into a tautology. The database uses the conditional as the basis for evaluating each row and deciding which to return. Because the condition, the query evaluates to true for each row and returns all of them. This would cause this user to be authenticated as the user whose data is in the first row in the returned result set.
Solution:
    $username = $_POST[username];
    $username = mysqli_real_escape_string ($username);
    mysql_query (SELECT first_name, last_name from authors where username = '$username’);
Illegal/Logically Incorrect Queries
Attack Intent: Identifying injectable parameters; Performing database finger printing; Extracting data.
Description: This attack lets the attacker gather important information about the type and structure of the back-end database of an application. The attack is considered a preliminary, information gathering step for other attacks. The vulnerability leveraged by this attack is that the default error page returned by application servers is often overly descriptive; originally intended to help programmers debug their applications, further helps attackers gain information about the schema of the back-end database. When performing this attack, an attacker tries to inject statements that cause a syntax, type conversion, or logical error into the database. Syntax errors can be used to identify injectable parameters. Type errors can be used to deduce the data types of certain columns or to extract data. Logical errors often reveal the names of the tables and columns that caused the error.
Example 2: Cause a type conversion error that can reveal relevant data.
Password: AND ‘pin: “convert (int, (select top 1 name from sysobjects where xtype=’u’))
Query: SELECT name from authors where username = ‘’ AND password=’’ AND ‘pin = convert (int,(select top 1 name from sysobjects where xtype=’u’))
The query attempts to extract the first user table (xtype=’u’) from the database’s metadata table (assume the application is using Microsoft SQL Server, for which the metadata table is called sysobjects). The query then tries to convert this table name into an integer. Because this is not a legal type conversion, the database throws an error. For Microsoft SQL Server, the default error would be ”Microsoft OLE DB Provider for SQL Server (0x80040E07) Error converting nvarchar value ’CreditCards’ to a column of data type int.”
Two useful pieces of information in this message aids an attacker. First, the attacker can see that the database is an SQL Server database. Second, the error message reveals the value of the string that caused the type conversion to occur. In this case, this value is also the name of the first user-defined table in the database: “CreditCards.” A similar strategy can be used to systematically extract the name and type of each column in the database. Using this information about the schema of the database, an attacker can then create further attacks that target specific pieces of information.
Union Query
Attack Intent: Bypassing Authentication; extracting data.
Description: In union-query attacks, an attacker exploits a vulnerable parameter to change the data set returned for a given query. With this technique, an attacker can trick the application into returning data from a table different than the one that was intended by the developer. Attackers do this by injecting a statement of the form: UNION SELECT <rest of injected query>. Because the attackers completely control the second/injected query, they can use that query to retrieve information from a specified table. The database returns a dataset that is the union of the results of the original first query and the results of the injected second query. One example usage of this multiple attacks is where the attacker uses the logically incorrect query attack to data about a table’s structure then use the union query to get data from this table.
Example 3: Referring to example 2, an attacker could inject the text
Username: ’ UNION SELECT cardNo from CreditCards where acctNo=10032 - -”
Query: SELECT name from authors where username = ‘’ UNION SELECT cardNo from CreditCards where acctNo=10032 -- AND password=’’
Note: It is common technique to force the SQL parser to ignore the rest of the query written by the developer with -- which is the comment sign in SQL.
Assuming that there is no login equal to “”, the original first query returns the null set, whereas the second query returns data from the “CreditCards” table. The database takes the results of these two queries, unions them, and returns them to the application.
Piggy Backed Queries
Attack Intent: Extracting data; Adding or modifying data; Performing DOS; executing remote commands.
Description: In this attack, an attacker tries to inject additional queries into the original query. We distinguish this type from others because, in this case, attackers are not trying to modify the original intended query; instead, they are trying to include new and distinct queries that “piggy-back” on the original query. As a result, the database receives multiple SQL queries which are all executed. This type of attack can be extremely harmful. If successful, attackers can insert virtually any type of SQL command, including stored procedures into the additional queries and have them executed along with the original query. Vulnerability to this type of attack is often dependent on having a database configuration that allows multiple statements to be contained in a single string.
Example 4: The attacker inputs:
Password: “’; drop table users - -”
Query: SELECT name from authors where username = ‘’ AND password=’’ drop table users -- AND pin=123
After completing the first query, the database would recognize the query delimiter (“;”) and execute the injected second query. Dropping the users table would likely destroy valuable information. Other types of queries could insert new users into the database or execute stored procedures. Note that many databases do not require a special character to separate distinct queries, so simply scanning for a query separator is not an effective way to prevent this type of attack.
Solution: Configure the database to block executing multiple statements within a single string.
Stored Procedures
Attack Intent: Performing privilege escalation; performing DOS; Executing remote commands.
Description: SQL Injection Attacks of this type try to execute stored procedures present in the database. Most vendors ship databases with a standard set of stored procedures that extend the functionality of the database and allow for interaction with the operating system. Therefore, once an attacker determines which backend database is in use, SQL Injection Attacks can be crafted to execute stored procedures provided by that specific database. Additionally, because stored procedures are often written in special scripting languages, they can contain other types of vulnerabilities, such as buffer overflows; these vulnerabilities allow attackers to run arbitrary code on the server or escalate their privileges. Here is a stored procedure that checks credentials:
CREATE PROCEDURE DBO.isAuthenticated
@userName varchar2, @pass varchar2, @pin int
AS EXEC ("SELECT accounts FROM users
WHERE login=’" +@userName+ "’ and pass=’" +@password+ "’ and pin=" +@pin);
GO
Example 5: Demonstrates how a parameterized stored procedure can be exploited via an SQL Injection Attack. In the example, we assume that the query string constructed at lines 5, 6 and 7 of our example has been replaced by a call to the stored procedure defined in Figure 2. The stored procedure returns a true/false value to indicate whether the user’s credentials authenticated correctly. To launch an SQL Injection Attack, the attacker simply enters:
Password: ’ ; SHUTDOWN; --
Query: SELECT name from authors where username = ‘Jay’ AND password=’ ’; SHUTDOWN; --
At this point, this attack works like a piggy-back attack. The first query is executed normally, and then the second, malicious query is executed, which results in a database shut down. This example shows that stored procedures can be vulnerable to the same range of attacks as traditional application code.
Inference
Attack Intent: Identifying injectable parameters; Extracting data; Determining database schema.
Description: In this attack, the query is modified to recast it in the form of an action that is executed based on the answer to a true/-false question about data values in the database. In this type of injection, attackers are generally trying to attack a site that has been secured enough so that when an injection has succeeded, there is no usable feedback via database error messages. In this situation, the attacker injects commands into the application and then observes how the application responds. From careful observation, the attacker can deduce not only whether certain parameters are vulnerable, but also additional information about the values in the database. There are two well-known attack techniques that are based on inference:
Blind Injection: Information is inferred from the behavior of the page by asking the server true/-false questions. If the injected statement evaluates to true, the site continues to function normally. If the statement evaluates to false, although there is no descriptive error message, the page differs significantly from the normally-functioning page.
Timing Attacks: A timing attack allows an attacker to gain information from a database by observing timing delays in the response of the database. Attackers structure their injected query in the form of an if/then statement, whose branch predicate corresponds to an unknown about the contents of the database. Along one of the branches, the attacker uses a SQL construct that pause the execution for a known amount of time (e.g. the WAITFOR keyword). By measuring the response time of the database, the attacker can infer which branch was taken in his injection and therefore the answer to the injected question.
Example 6: Identifying injectable parameters using blind injection. Consider two possible injections into the login field.
  • “legalUser’ and 1=0 - -”
  • “legalUser’ and 1=1 - -”
Query 1: SELECT name from authors where username = ’legalUser’ and 1=0 -- ’ AND password=’ ’ AND pin=0;
Query 2: SELECT name from authors where username = ’legalUser’ and 1=1 -- ’ AND password=’ ’ AND pin=0;
Scenario 1: We have a secure application, and the input for login is validated correctly. In this case, both injections would return login error messages, and the attacker would know that the login parameter is not vulnerable.
Scenario 2: We have an insecure application and the login parameter is vulnerable to injection. The attacker submits the first injection and, because it always evaluates to false, the application returns a login error message. The attacker then submits the second query, which always evaluates to true. If in this case there is no login error message, then the attacker knows that the attack went through and that the login parameter is vulnerable to injection.
Example 7:  Using Timing based inference attack to extract a table name from the database.
Username: ‘‘legalusr’ and ASCII(SUBSTRING((select top 1 name from sysobjects),1,1)) > X WAITFOR 5 --’’.
Query:
SELECT name from authors where username = ’legalUser’ ASCII(SUBSTRING((select top 1 name from sysobjects),1,1)) > X WAITFOR 5 -- ’AND password=’ ’ AND pin=0;
Here, the SUBSTRING function extracts the first character of the first table’s name. Using a binary search strategy, the attacker can ask a series of questions about this character. In this case, the attacker is asking if the ASCII value of the character is greater-than or less-than-or-equal-to the value of X. If the value is greater, the attacker knows this by observing an additional 5 second delay in the response of the database. The attacker can then use a binary search by varying the value of X to identify the value of the first character.
Alternate Encodings
Attack Intent: Evading detection.
Description: In this attack, the injected text is modified so as to avoid detection by defensive coding practices and also many automated prevention techniques. This attack type is used in conjunction with other attacks. In other words, alternate encodings do not provide any unique way to attack an application; they are simply an enabling technique that allows attackers to evade detection and prevention techniques and exploit vulnerabilities that might not otherwise be exploitable. These evasion techniques are often necessary because a common defensive coding practice is to scan for certain known “bad characters,” such as single quotes and comment operators.
To evade this defense, attackers have employed alternate methods of encoding their attack strings (e.g., using hexadecimal, ASCII, and Unicode character encoding). Common scanning and detection techniques do not try to evaluate all specially encoded strings, thus allowing these attacks to go undetected. An effective code-based defense against alternate encodings is difficult to implement in practice because it requires developers to consider of all of the possible encodings that could affect a given query string as it passes through the different application layers. Therefore, attackers have been very successful in using alternate encodings to conceal their attack strings.
Example 8: Every type of attack could be represented using an alternate encoding; here we simply provide an example of how mystic an alternatively-encoded attack could appear.
Username: “legalUser’; exec(0x73687574646f776e) - - ”
Query:
SELECT name from authors where username = ’legalUser’; exec(0x73687574646f776e) - - AND password=’ ’;
The stream of numbers in the second part of the injection is the ASCII hexadecimal encoding of the string “SHUTDOWN.” Therefore, when the query is interpreted by the database, it would result in the execution, by the database, of the SHUTDOWN command.
Query: SELECT name from authors where username = ’legalUser’; exec(SHUTDOWN) - - AND password=’ ’;

Preventing SQL Injection Attacks
Techniques to prevent SQL Injection range from development best practices to fully automated frameworks for detecting and preventing SQL Injection Attacks.
Database Design Practices

Limiting Permissions: Limiting Permissions naturally leads to a very effective method of preventing attacks and limiting the damages from possible SQL injection attacks. Some methods to be employed are:

  1. Use database accounts with limited permissions: only give the necessary permissions to each account. Normally an application uses an account to access the database and restrict user operations at the application level. However, if a user uses SQL injection, all of the application level security will be bypassed and the user will gain access to the database with the full privileges of the account the application uses to connect to the database.
  2. Use several database accounts: This would serve to compliment the first precaution taken above. Since many users with different privileges normally uses an application, the account we use to connect to the database would need the necessary access rights for the most privileged user; and if an unprivileged user commits an SQL Injection attack, this user will get full access to the database as the most privileged user. To solve this issue, it is best to have several database accounts with which an application can connect to the database, using different accounts based on the role of the logged in user.
Defensive Coding Practices
The root cause of SQL injection vulnerabilities is insufficient input validation. Therefore, the straightforward solution for eliminating these vulnerabilities is to apply suitable defensive coding practices.
Input type checking: SQL Injection Attacks can be performed by injecting commands into either a string or numeric parameter; a simple check of such inputs can prevent many attacks. For example, in the case of numeric inputs, developers can simply reject any input that contains characters other than numerical digits.
Concealing Error Messages: Injection attacks often depend on the attacker having at least some information about the database schema. As discussed in some of the attacks mentioned previously, an attacker can gain much information through error messages which may tell the attacker quite a lot about the schema. DBMS and Programming Language Database Connectors generally provide clear, informative error messages that are incredibly helpful to programmers, but can also provide information to a malicious user.
After the launch of an application (the application is available to end users), It is best to log errors directly to a table in the database rather than outputting them to end user.
Encoding of inputs: Injection values into a parameter is often accomplished through the use of meta-characters that trick the SQL parser into interpreting user input as SQL tokens. Prohibiting any usage of meta-characters would restrict a non-malicious user’s ability to specify legal inputs that contain such characters. A better solution is to use functions that encode a string in such a way that all meta-characters are specially encoded and interpreted by the database as normal characters. Here is an example in PHP:
$username = $_POST[username];
$username = mysqli_real_escape_string ($username);
mysql_query (SELECT first_name, last_name from authors where username = '$username’);
Positive pattern matching: Developers should establish input validation routines that identify good input as opposed to bad input. This approach is generally called positive validation, as opposed to negative validation, which searches input for forbidden patterns or SQL tokens. Because developers might not be able to envision every type of attack that could be launched against their application, but should be able to specify all the forms of legal input, positive validation is a safer way to check inputs.
Identification of all input sources: Developers must check all inputs to their application; there are many possible sources of input to an application. If used to construct a query, these input sources can be a way for an attacker to introduce an SQL Injection Attack. Simply put, all input sources must be checked.
Although defensive coding practices remain the best way to prevent SQL injection vulnerabilities, their application is problematic in practice. Defensive coding is prone to human error and is not as rigorously and completely applied as automated techniques Moreover, approaches based on defensive coding are weakened by the widespread promotion and acceptance of so-called “pseudoremedies”. We discuss two of the most commonly-proposed pseudo-remedies:
  • The first of such remedies consists of checking user input for SQL keywords, such as “FROM,” “WHERE,” and “SELECT,” and SQL operators, such as the single quote or comment operator. The rationale behind this suggestion is that the presence of such keywords and operators may indicate an attempted SQL Injection Attack. This approach clearly results in a high rate of false positives because in many applications, SQL keywords can be part of a normal text entry, and SQL operators can be used to express formulas or even names (e.g., O’Brian).
  • The second commonly suggested pseudo-remedy is to use stored procedures or prepared statements to prevent SQL Injection Attacks. Unfortunately, stored procedures and prepared statements can also be vulnerable to SQL Injection Attacks unless developers rigorously apply defensive coding guidelines.
Detection and Prevention Techniques
Researchers have proposed a range of techniques to assist developers and compensate for the shortcomings in the application of defensive coding.
Black Box Testing: Huang and colleagues proposed WAVES, a black-box technique for testing Web applications for SQL injection vulnerabilities. The technique uses a Web crawler to identify all points in a Web application that may be vulnerable to SQL Injection Attacks. It then builds attacks that target such points based on a specified list of patterns and attack techniques. WAVES then monitors the application’s response to the attacks and uses machine learning techniques to improve its attack methodology. However, like all black-box and penetration testing techniques, it cannot provide guarantees of completeness.
Static Code Checkers: JDBC-Checker is a technique for statically checking the type correctness of dynamically-generated SQL queries. JDBC-Checker is able to detect one of the root causes of SQLIA vulnerabilities in code - improper type checking of input.
Combined Static and Dynamic Analysis: AMNESIA is a model-based technique that combines static analysis and runtime monitoring. In its static phase, AMNESIA uses static analysis to build models of the different types of queries an application can legally generate at each point of access to the database. In its dynamic phase, AMNESIA intercepts all queries before they are sent to the database and checks each query against the statically built models. Queries that violate the model are identified as SQL Injection Attacks and are prevented from executing on the database. The primary limitation of this technique is that its success is dependent on the accuracy of its static analysis for building query models.
Taint Based Approaches: WebSSARI detects input-validation related errors using information flow analysis. In this approach, static analysis is used to check contaminated flows against preconditions for sensitive functions. The analysis detects the points in which preconditions have not been met and can suggest filters and sanitization functions that can be automatically added to the application to satisfy these preconditions. The WebSSARI system works by considering as sanitized input that has passed through a predefined set of filters. The primary drawbacks of this technique are that it assumes that adequate preconditions for sensitive functions can be accurately expressed using their typing system and that having input passing through certain types of filters is sufficient to consider it not tainted.
New Query Development Paradigms: Two recent approaches, SQL DOM and Safe Query Objects, use encapsulation of database queries to provide a safe and reliable way to access databases. These techniques offer an effective way to avoid the SQL Injection Attack problem by changing the query-building process from an unregulated one that uses string concatenation to a systematic one that uses a type-checked API. Within their API, they are able to systematically apply coding best practices such as input filtering and rigorous type checking of user input. By changing the development paradigm in which SQL queries are created, these techniques eliminate the coding practices that make most SQL Injection Attacks possible. Although effective, these techniques have the drawback that they require developers to learn and use a new programming paradigm or query-development process. Furthermore, because they focus on using a new development process, they do not provide any type of protection or improved security for existing legacy systems.
Intrusion Detection Systems (IDS): A proposed IDS system to prevent SQL Injection is based on a machine learning technique that is trained using a set of typical application queries. The technique builds models of the typical queries and then monitors the application at runtime to identify queries that do not match the model. In the evaluation, it was shown that the system is able to detect attacks with a high rate of success. However, the fundamental limitation of learning based techniques is that they can provide no guarantees about their detection abilities because their success is dependent on the quality of the training set used.
Proxy Filters: Security Gateway is a proxy filtering system that enforces input validation rules on the data flowing to a Web application. Using their Security Policy Descriptor Language (SPDL), developers provide constraints and specify transformations to be applied to application parameters as they flow from the Web page to the application server. Because SPDL is highly expressive, it allows developers considerable freedom in expressing their policies. This approach is human-based and, like defensive programming, requires developers to know not only which data needs to be filtered, but also what patterns and filters to apply to the data.
Instruction Set Randomization: SQLrand is an approach based on instruction-set randomization. SQLrand provides a framework that allows developers to create queries using randomized instructions instead of normal SQL keywords. A proxy filter intercepts queries to the database and de-randomizes the keywords. SQL code injected by an attacker would not have been constructed using the randomized instruction set. Therefore, injected commands would result in a syntactically incorrect query. While this technique can be very effective, it has several practical drawbacks: Firstly, since it uses a secret key to modify instructions, security of the approach is dependent on attackers not being able to discover the key; Secondly, the approach imposes a significant infrastructure overhead because it require the integration of a proxy for the database in the system