SQL injection attacks are a type of injection attack, in which SQL
commands are injected into data-plane input in order to effect the
execution of predefined SQL commands (OWASP, 2012). SQL injection
attacks pose a serious security threat to Web applications: they allow
attackers to obtain unrestricted access to the databases underlying the
applications and to the potentially sensitive information these
databases contain (Halfond, Viegas, & Alessandro , 2006).
An SQL injection attack consists of inserting or "injecting" some
data into an SQL query via the input data from the client to the
application. A successful SQL injection exploit can:
- Read sensitive data from the database
- Modify database data (Insert/Update/Delete)
- Execute administration operations on the database (such as shutdown the DBMS)
- Recover the content of a given file present on the DBMS file system
- In some cases issue commands to the operating system.
SQL injection errors occur when:
- Data enters a program from an untrusted source.
- Queries are dynamically constructed
The main consequences are:
- Confidentiality: Since SQL
databases generally hold sensitive data, loss of confidentiality is a
frequent problem with SQL Injection vulnerabilities.
- Authentication: If poor SQL
commands are used to check user names and passwords, it may be possible
to connect to a system as another user with no previous knowledge of the
password.
- Authorization: If authorization
information is held in a SQL database, it may be possible to change this
information through the successful exploitation of SQL Injection
vulnerabilities.
- Integrity: Just as it may be
possible to read sensitive information, it is also possible to make
changes or even delete this information with a SQL Injection attack.
Types of SQL Injection Attacks
In this section, we present and discuss the different kinds of SQL
Injection Attacks. The different types of attacks are generally not
performed in isolation; many of them are used together or sequentially,
depending on the specific goals of the attacker. Note also that there
are countless variations of each attack type.
Tautologies
Attack Intent: Bypassing authentication; identifying injectable parameters; extracting data.
Description: The general goal of a tautology-based
attack is to inject code in one or more conditional statements so that
they always evaluate to true. The most common usages are to bypass
authentication pages and extract data. In this type of injection, an
attacker exploits an injectable field that is used in a query’s WHERE
conditional.
Transforming the conditional into a tautology causes all of the rows
in the database table targeted by the query to be returned. In general,
for a tautology-based attack to work, an attacker must consider not only
the injectable/vulnerable parameters, but also the coding constructs
that evaluate the query results. (Halfond, Viegas, & Alessandro ,
2006)
Example 1: Bypassing login script.
Query: SELECT name from authors where username = '$_POST[username]’ AND password=’$_POST[password]’;
This query take input from the system user; suppose the user enters:
Username: a’ OR ‘1=1’
Password: a’ OR ‘1=1’
Constructed query:
SELECT name from authors where username = ‘a’ OR ‘1=1’ AND password=’a’ OR ‘1=1’
The code injected in the conditional (OR 1=1) transforms the entire
WHERE clause into a tautology. The database uses the conditional as the
basis for evaluating each row and deciding which to return. Because the
condition, the query evaluates to true for each row and returns all of
them. This would cause this user to be authenticated as the user whose
data is in the first row in the returned result set.
Solution:
$username = $_POST[username];
$username = mysqli_real_escape_string ($username);
mysql_query (SELECT first_name, last_name from authors where username = '$username’);
Illegal/Logically Incorrect Queries
Attack Intent: Identifying injectable parameters; Performing database finger printing; Extracting data.
Description: This attack lets the attacker gather
important information about the type and structure of the back-end
database of an application. The attack is considered a preliminary,
information gathering step for other attacks. The vulnerability
leveraged by this attack is that the default error page returned by
application servers is often overly descriptive; originally intended to
help programmers debug their applications, further helps attackers gain
information about the schema of the back-end database. When performing
this attack, an attacker tries to inject statements that cause a syntax,
type conversion, or logical error into the database. Syntax errors can
be used to identify injectable parameters. Type errors can be used to
deduce the data types of certain columns or to extract data. Logical
errors often reveal the names of the tables and columns that caused the
error.
Example 2: Cause a type conversion error that can reveal relevant data.
Password: AND ‘pin: “convert (int, (select top 1 name from sysobjects where xtype=’u’))
Query: SELECT name from authors where username = ‘’ AND
password=’’ AND ‘pin = convert (int,(select top 1 name from sysobjects
where xtype=’u’))
The query attempts to extract the first user table (xtype=’u’) from
the database’s metadata table (assume the application is using Microsoft
SQL Server, for which the metadata table is called sysobjects). The
query then tries to convert this table name into an integer. Because
this is not a legal type conversion, the database throws an error. For
Microsoft SQL Server, the default error would be ”Microsoft OLE DB
Provider for SQL Server (0x80040E07) Error converting nvarchar value
’CreditCards’ to a column of data type int.”
Two useful pieces of information in this message aids an attacker.
First, the attacker can see that the database is an SQL Server database.
Second, the error message reveals the value of the string that caused
the type conversion to occur. In this case, this value is also the name
of the first user-defined table in the database: “CreditCards.” A
similar strategy can be used to systematically extract the name and type
of each column in the database. Using this information about the schema
of the database, an attacker can then create further attacks that
target specific pieces of information.
Union Query
Attack Intent: Bypassing Authentication; extracting data.
Description: In union-query attacks, an attacker
exploits a vulnerable parameter to change the data set returned for a
given query. With this technique, an attacker can trick the application
into returning data from a table different than the one that was
intended by the developer. Attackers do this by injecting a statement of
the form: UNION SELECT <rest of injected query>. Because the
attackers completely control the second/injected query, they can use
that query to retrieve information from a specified table. The database
returns a dataset that is the union of the results of the original first
query and the results of the injected second query. One example usage
of this multiple attacks is where the attacker uses the logically
incorrect query attack to data about a table’s structure then use the
union query to get data from this table.
Example 3: Referring to example 2, an attacker could inject the text
Username: ’ UNION SELECT cardNo from CreditCards where acctNo=10032 - -”
Query: SELECT name from authors where username = ‘’ UNION SELECT cardNo from CreditCards where acctNo=10032 -- AND password=’’
Note: It is common technique to force the SQL parser to ignore the rest of the query written by the developer with
-- which is the comment sign in SQL.
Assuming that there is no login equal to “”, the original first query
returns the null set, whereas the second query returns data from the
“CreditCards” table. The database takes the results of these two
queries, unions them, and returns them to the application.
Piggy Backed Queries
Attack Intent: Extracting data; Adding or modifying data; Performing DOS; executing remote commands.
Description: In this attack, an attacker tries to
inject additional queries into the original query. We distinguish this
type from others because, in this case, attackers are not trying to
modify the original intended query; instead, they are trying to include
new and distinct queries that “piggy-back” on the original query. As a
result, the database receives multiple SQL queries which are all
executed. This type of attack can be extremely harmful. If successful,
attackers can insert virtually any type of SQL command, including stored
procedures into the additional queries and have them executed along
with the original query. Vulnerability to this type of attack is often
dependent on having a database configuration that allows multiple
statements to be contained in a single string.
Example 4: The attacker inputs:
Password: “’; drop table users - -”
Query: SELECT name from authors where username = ‘’ AND password=’’ drop table users -- AND pin=123
After completing the first query, the database would recognize the
query delimiter (“;”) and execute the injected second query. Dropping
the users table would likely destroy valuable information. Other types
of queries could insert new users into the database or execute stored
procedures. Note that many databases do not require a special character
to separate distinct queries, so simply scanning for a query separator
is not an effective way to prevent this type of attack.
Solution: Configure the database to block executing multiple statements within a single string.
Stored Procedures
Attack Intent: Performing privilege escalation; performing DOS; Executing remote commands.
Description: SQL Injection Attacks of this type try
to execute stored procedures present in the database. Most vendors ship
databases with a standard set of stored procedures that extend the
functionality of the database and allow for interaction with the
operating system. Therefore, once an attacker determines which backend
database is in use, SQL Injection Attacks can be crafted to execute
stored procedures provided by that specific database. Additionally,
because stored procedures are often written in special scripting
languages, they can contain other types of vulnerabilities, such as
buffer overflows; these vulnerabilities allow attackers to run arbitrary
code on the server or escalate their privileges. Here is a stored
procedure that checks credentials:
CREATE PROCEDURE DBO.isAuthenticated
@userName varchar2, @pass varchar2, @pin int
AS EXEC ("SELECT accounts FROM users
WHERE login=’" +@userName+ "’ and pass=’" +@password+ "’ and pin=" +@pin);
GO
Example 5: Demonstrates how a parameterized stored
procedure can be exploited via an SQL Injection Attack. In the example,
we assume that the query string constructed at lines 5, 6 and 7 of our
example has been replaced by a call to the stored procedure defined in
Figure 2. The stored procedure returns a true/false value to indicate
whether the user’s credentials authenticated correctly. To launch an SQL
Injection Attack, the attacker simply enters:
Password: ’ ; SHUTDOWN; --
Query: SELECT name from authors where username = ‘Jay’ AND password=’ ’; SHUTDOWN; --
At this point, this attack works like a piggy-back attack. The first
query is executed normally, and then the second, malicious query is
executed, which results in a database shut down. This example shows that
stored procedures can be vulnerable to the same range of attacks as
traditional application code.
Inference
Attack Intent: Identifying injectable parameters; Extracting data; Determining database schema.
Description: In this attack, the query is modified
to recast it in the form of an action that is executed based on the
answer to a true/-false question about data values in the database. In
this type of injection, attackers are generally trying to attack a site
that has been secured enough so that when an injection has succeeded,
there is no usable feedback via database error messages. In this
situation, the attacker injects commands into the application and then
observes how the application responds. From careful observation, the
attacker can deduce not only whether certain parameters are vulnerable,
but also additional information about the values in the database. There
are two well-known attack techniques that are based on inference:
Blind Injection: Information is inferred from the
behavior of the page by asking the server true/-false questions. If the
injected statement evaluates to true, the site continues to function
normally. If the statement evaluates to false, although there is no
descriptive error message, the page differs significantly from the
normally-functioning page.
Timing Attacks: A timing attack allows an attacker
to gain information from a database by observing timing delays in the
response of the database. Attackers structure their injected query in
the form of an if/then statement, whose branch predicate corresponds to
an unknown about the contents of the database. Along one of the
branches, the attacker uses a SQL construct that pause the execution for
a known amount of time (e.g. the WAITFOR keyword). By measuring the
response time of the database, the attacker can infer which branch was
taken in his injection and therefore the answer to the injected
question.
Example 6: Identifying injectable parameters using blind injection. Consider two possible injections into the login field.
- “legalUser’ and 1=0 - -”
- “legalUser’ and 1=1 - -”
Query 1: SELECT name from authors where username = ’legalUser’ and 1=0 -- ’
AND password=’ ’ AND pin=0;
Query 2: SELECT name from authors where username = ’legalUser’ and 1=1 -- ’
AND password=’ ’ AND pin=0;
Scenario 1: We have a secure application, and the input for
login is validated correctly. In this case, both injections would return
login error messages, and the attacker would know that the login
parameter is not vulnerable.
Scenario 2: We have an insecure application and the login
parameter is vulnerable to injection. The attacker submits the first
injection and, because it always evaluates to false, the application
returns a login error message. The attacker then submits the second
query, which always evaluates to true. If in this case there is no login
error message, then the attacker knows that the attack went through and
that the login parameter is vulnerable to injection.
Example 7: Using Timing based inference attack to extract a table name from the database.
Username: ‘‘legalusr’ and ASCII(SUBSTRING((select top 1 name from sysobjects),1,1)) > X WAITFOR 5 --’’.
Query:
SELECT name from authors where username
= ’legalUser’ ASCII(SUBSTRING((select top 1 name from sysobjects),1,1))
> X WAITFOR 5 -- ’AND password=’ ’ AND pin=0;
Here, the SUBSTRING function extracts the first character of the
first table’s name. Using a binary search strategy, the attacker can ask
a series of questions about this character. In this case, the attacker
is asking if the ASCII value of the character is greater-than or
less-than-or-equal-to the value of X. If the value is greater, the
attacker knows this by observing an additional 5 second delay in the
response of the database. The attacker can then use a binary search by
varying the value of X to identify the value of the first character.
Alternate Encodings
Attack Intent: Evading detection.
Description: In this attack, the injected text is
modified so as to avoid detection by defensive coding practices and also
many automated prevention techniques. This attack type is used in
conjunction with other attacks. In other words, alternate encodings do
not provide any unique way to attack an application; they are simply an
enabling technique that allows attackers to evade detection and
prevention techniques and exploit vulnerabilities that might not
otherwise be exploitable. These evasion techniques are often necessary
because a common defensive coding practice is to scan for certain known
“bad characters,” such as single quotes and comment operators.
To evade this defense, attackers have employed alternate methods of
encoding their attack strings (e.g., using hexadecimal, ASCII, and
Unicode character encoding). Common scanning and detection techniques do
not try to evaluate all specially encoded strings, thus allowing these
attacks to go undetected. An effective code-based defense against
alternate encodings is difficult to implement in practice because it
requires developers to consider of all of the possible encodings that
could affect a given query string as it passes through the different
application layers. Therefore, attackers have been very successful in
using alternate encodings to conceal their attack strings.
Example 8: Every type of attack could be represented
using an alternate encoding; here we simply provide an example of how
mystic an alternatively-encoded attack could appear.
Username: “legalUser’; exec(0x73687574646f776e) - - ”
Query:
SELECT name from authors where username = ’legalUser’; exec(0x73687574646f776e) - - AND password=’ ’;
The stream of numbers in the second part of the injection is the
ASCII hexadecimal encoding of the string “SHUTDOWN.” Therefore, when the
query is interpreted by the database, it would result in the execution,
by the database, of the SHUTDOWN command.
Query: SELECT name from authors where username = ’legalUser’; exec(SHUTDOWN) - - AND password=’ ’;
Preventing SQL Injection Attacks
Techniques to prevent SQL Injection range from development best
practices to fully automated frameworks for detecting and preventing SQL
Injection Attacks.
Database Design Practices
Limiting Permissions: Limiting
Permissions naturally leads to a very effective method of preventing
attacks and limiting the damages from possible SQL injection attacks.
Some methods to be employed are:
- Use database accounts with limited permissions: only give the
necessary permissions to each account. Normally an application uses an
account to access the database and restrict user operations at the
application level. However, if a user uses SQL injection, all of the
application level security will be bypassed and the user will gain
access to the database with the full privileges of the account the
application uses to connect to the database.
- Use several database accounts: This would serve to compliment
the first precaution taken above. Since many users with different
privileges normally uses an application, the account we use to connect
to the database would need the necessary access rights for the most
privileged user; and if an unprivileged user commits an SQL Injection
attack, this user will get full access to the database as the most
privileged user. To solve this issue, it is best to have several
database accounts with which an application can connect to the database,
using different accounts based on the role of the logged in user.
Defensive Coding Practices
The root cause of SQL injection vulnerabilities is insufficient input
validation. Therefore, the straightforward solution for eliminating
these vulnerabilities is to apply suitable defensive coding practices.
Input type checking: SQL Injection Attacks can be
performed by injecting commands into either a string or numeric
parameter; a simple check of such inputs can prevent many attacks. For
example, in the case of numeric inputs, developers can simply reject any
input that contains characters other than numerical digits.
Concealing Error Messages: Injection attacks often
depend on the attacker having at least some information about the
database schema. As discussed in some of the attacks mentioned
previously, an attacker can gain much information through error messages
which may tell the attacker quite a lot about the schema. DBMS and
Programming Language Database Connectors generally provide clear,
informative error messages that are incredibly helpful to programmers,
but can also provide information to a malicious user.
After the launch of an application (the application is available to
end users), It is best to log errors directly to a table in the database
rather than outputting them to end user.
Encoding of inputs: Injection values into a
parameter is often accomplished through the use of meta-characters that
trick the SQL parser into interpreting user input as SQL tokens.
Prohibiting any usage of meta-characters would restrict a non-malicious
user’s ability to specify legal inputs that contain such characters. A
better solution is to use functions that encode a string in such a way
that all meta-characters are specially encoded and interpreted by the
database as normal characters. Here is an example in PHP:
$username = $_POST[username];
$username = mysqli_real_escape_string ($username);
mysql_query (SELECT first_name, last_name from authors where username = '$username’);
Positive pattern matching: Developers should
establish input validation routines that identify good input as opposed
to bad input. This approach is generally called positive validation, as
opposed to negative validation, which searches input for forbidden
patterns or SQL tokens. Because developers might not be able to envision
every type of attack that could be launched against their application,
but should be able to specify all the forms of legal input, positive
validation is a safer way to check inputs.
Identification of all input sources: Developers must
check all inputs to their application; there are many possible sources
of input to an application. If used to construct a query, these input
sources can be a way for an attacker to introduce an SQL Injection
Attack. Simply put, all input sources must be checked.
Although defensive coding practices remain the best way to prevent
SQL injection vulnerabilities, their application is problematic in
practice. Defensive coding is prone to human error and is not as
rigorously and completely applied as automated techniques Moreover,
approaches based on defensive coding are weakened by the widespread
promotion and acceptance of so-called “pseudoremedies”. We discuss two
of the most commonly-proposed pseudo-remedies:
- The first of such remedies consists of checking user input for SQL
keywords, such as “FROM,” “WHERE,” and “SELECT,” and SQL operators, such
as the single quote or comment operator. The rationale behind this
suggestion is that the presence of such keywords and operators may
indicate an attempted SQL Injection Attack. This approach clearly
results in a high rate of false positives because in many applications,
SQL keywords can be part of a normal text entry, and SQL operators can
be used to express formulas or even names (e.g., O’Brian).
- The second commonly suggested pseudo-remedy is to use stored
procedures or prepared statements to prevent SQL Injection Attacks.
Unfortunately, stored procedures and prepared statements can also be
vulnerable to SQL Injection Attacks unless developers rigorously apply
defensive coding guidelines.
Detection and Prevention Techniques
Researchers have proposed a range of techniques to assist developers
and compensate for the shortcomings in the application of defensive
coding.
Black Box Testing: Huang and colleagues proposed
WAVES, a black-box technique for testing Web applications for SQL
injection vulnerabilities. The technique uses a Web crawler to identify
all points in a Web application that may be vulnerable to SQL Injection
Attacks. It then builds attacks that target such points based on a
specified list of patterns and attack techniques. WAVES then monitors
the application’s response to the attacks and uses machine learning
techniques to improve its attack methodology. However, like all
black-box and penetration testing techniques, it cannot provide
guarantees of completeness.
Static Code Checkers: JDBC-Checker is a technique
for statically checking the type correctness of dynamically-generated
SQL queries. JDBC-Checker is able to detect one of the root causes of
SQLIA vulnerabilities in code - improper type checking of input.
Combined Static and Dynamic Analysis: AMNESIA is a
model-based technique that combines static analysis and runtime
monitoring. In its static phase, AMNESIA uses static analysis to build
models of the different types of queries an application can legally
generate at each point of access to the database. In its dynamic phase,
AMNESIA intercepts all queries before they are sent to the database and
checks each query against the statically built models. Queries that
violate the model are identified as SQL Injection Attacks and are
prevented from executing on the database. The primary limitation of this
technique is that its success is dependent on the accuracy of its
static analysis for building query models.
Taint Based Approaches: WebSSARI detects
input-validation related errors using information flow analysis. In this
approach, static analysis is used to check contaminated flows against
preconditions for sensitive functions. The analysis detects the points
in which preconditions have not been met and can suggest filters and
sanitization functions that can be automatically added to the
application to satisfy these preconditions. The WebSSARI system works by
considering as sanitized input that has passed through a predefined set
of filters. The primary drawbacks of this technique are that it assumes
that adequate preconditions for sensitive functions can be accurately
expressed using their typing system and that having input passing
through certain types of filters is sufficient to consider it not
tainted.
New Query Development Paradigms: Two recent approaches, SQL DOM and
Safe Query Objects, use encapsulation of database queries to provide a
safe and reliable way to access databases. These techniques offer an
effective way to avoid the SQL Injection Attack problem by changing the
query-building process from an unregulated one that uses string
concatenation to a systematic one that uses a type-checked API. Within
their API, they are able to systematically apply coding best practices
such as input filtering and rigorous type checking of user input. By
changing the development paradigm in which SQL queries are created,
these techniques eliminate the coding practices that make most SQL
Injection Attacks possible. Although effective, these techniques have
the drawback that they require developers to learn and use a new
programming paradigm or query-development process. Furthermore, because
they focus on using a new development process, they do not provide any
type of protection or improved security for existing legacy systems.
Intrusion Detection Systems (IDS): A proposed IDS
system to prevent SQL Injection is based on a machine learning technique
that is trained using a set of typical application queries. The
technique builds models of the typical queries and then monitors the
application at runtime to identify queries that do not match the model.
In the evaluation, it was shown that the system is able to detect
attacks with a high rate of success. However, the fundamental limitation
of learning based techniques is that they can provide no guarantees
about their detection abilities because their success is dependent on
the quality of the training set used.
Proxy Filters: Security Gateway is a proxy filtering
system that enforces input validation rules on the data flowing to a
Web application. Using their Security Policy Descriptor Language (SPDL),
developers provide constraints and specify transformations to be
applied to application parameters as they flow from the Web page to the
application server. Because SPDL is highly expressive, it allows
developers considerable freedom in expressing their policies. This
approach is human-based and, like defensive programming, requires
developers to know not only which data needs to be filtered, but also
what patterns and filters to apply to the data.
Instruction Set Randomization: SQLrand is an
approach based on instruction-set randomization. SQLrand provides a
framework that allows developers to create queries using randomized
instructions instead of normal SQL keywords. A proxy filter intercepts
queries to the database and de-randomizes the keywords. SQL code
injected by an attacker would not have been constructed using the
randomized instruction set. Therefore, injected commands would result in
a syntactically incorrect query. While this technique can be very
effective, it has several practical drawbacks: Firstly, since it uses a
secret key to modify instructions, security of the approach is dependent
on attackers not being able to discover the key; Secondly, the approach
imposes a significant infrastructure overhead because it require the
integration of a proxy for the database in the system