SQL Injection prevention

Discussions of secure PHP coding. Security in software is important, so don't be afraid to ask. And when answering: be anal. Nitpick. No security vulnerability is too small.

Moderator: General Moderators

User avatar
The Phoenix
Forum Contributor
Posts: 294
Joined: Fri Oct 06, 2006 8:12 pm

SQL Injection prevention

Post by The Phoenix »

I'm looking to contribute to an open source project, and improve their security processes for database calls.

Before I do, I want to be absolutely sure I understand all the potential gotchas before giving bad advice.

So, to the community at large I ask the question:

What php code needs to be run on query text prior to sending to a database to prevent only sql injections.

Notably, the issues of XSS, CSRF, etc are all handled in a very different space. I'm just interested in preventing sql injections.

So, on mysql, is it as simple as:

mysql_real_escape_string($query)

Or is there more that must be done?

Please be specific with examples of items that will not handle properly, and ideally, the correct code/method to handle them properly.

Thanks gang. :)
User avatar
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Post by s.dot »

mysql_real_escape_string() should do it. :)

However, I'm all for validating data before it goes into the database.
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
User avatar
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

I would at minimum filter the incoming data. You can also validate the input and reject requests that are not well formed.
(#10850)
User avatar
The Phoenix
Forum Contributor
Posts: 294
Joined: Fri Oct 06, 2006 8:12 pm

Post by The Phoenix »

arborint wrote:I would at minimum filter the incoming data.
Do you feel that mysql_real_escape_string accomplishes that effectively?
arborint wrote:You can also validate the input and reject requests that are not well formed.
This is for a database library, where the library doesn't have awareness of the input format, or the request types. Those activities would have to take place outside of the library itself, although I agree that they are important and need to be done.
mrkite
Forum Contributor
Posts: 104
Joined: Tue Sep 11, 2007 4:19 am

Post by mrkite »

Use PEAR:DB, PEAR:MDB2, or pdo and use prepared statements.

No need to worry about escaping.
User avatar
Mordred
DevNet Resident
Posts: 1579
Joined: Sun Sep 03, 2006 5:19 am
Location: Sofia, Bulgaria

Post by Mordred »

Nope, mysql_real_escape_string() is not enough, here's why (with plenty of examples):
http://www.webappsec.org/projects/articles/091007.shtml

Generic filtering of input is good, but it works on another level, the DB-aware layers should concentrate on proper syntax and escaping.
User avatar
stereofrog
Forum Contributor
Posts: 386
Joined: Mon Dec 04, 2006 6:10 am

Post by stereofrog »

Mordred wrote:Nope, mysql_real_escape_string() is not enough, here's why (with plenty of examples):
http://www.webappsec.org/projects/articles/091007.shtml
Nice stuff, thanks. Most people don't realize that escaping only protects the value parts of the queries.
matthijs
DevNet Master
Posts: 3360
Joined: Thu Oct 06, 2005 3:57 pm

Post by matthijs »

Indeed. That was an educational read.
User avatar
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Post by superdezign »

I don't believe that you should run in on the entire query, anyway... Then you'd escape everything instead of only the data.
User avatar
The Phoenix
Forum Contributor
Posts: 294
Joined: Fri Oct 06, 2006 8:12 pm

Post by The Phoenix »

Mordred wrote:Nope, mysql_real_escape_string() is not enough, here's why (with plenty of examples):
http://www.webappsec.org/projects/articles/091007.shtml

Generic filtering of input is good, but it works on another level, the DB-aware layers should concentrate on proper syntax and escaping.
Thats an excellent article about the full lifecycle of validation and escaping, but I'm struggling with a few issues.

Starting from #4 down, #4 highlights an issue with validation - not escaping. Further, even you state:
The reader must realise that input validation (in our case making sure that what we expect to be an int is really int) and escaping the parameter before giving it to the query are two different security steps. In this particular case either one will suffice, but in general, for in-depth security, you must always do both. Also, the two tasks will most probably be carried by two different subsystems in your real-life code, so the validating and escaping code will not be adjacent as displayed in this simplified case.
I don't think a database abstraction layer has sufficient awareness to be able to determine whether an item is intended to be a string or an int, and as a result, can't reliably perform the validation. That has to be performed in a different subsystem, correct?

#5 is a variation on the same issue, albeit with a much more harsh bite. Nevertheless, it still puts the requirement on validation, which the db abstraction layer isn't sufficiently informed to do.

#6 is validation prevented as well, this time with hex instead of ints.

#8 is covered, because the db abstraction layer in question prevents multiple queries in a single call.

So that really leaves #7 as an issue that the db abstraction layer can/should effectively work to prevent, is that correct?

#7 is easy for me to understand, but I'm stumbling on understanding the proper prevention. I guess I'm lost because of the line:
Test 3 demonstrates that in our case we don't need to escape the LIKE escape character, as mysql_real_escape_string() already escaped it.
Even though above that it showed mysql_real_escape_string did NOT escape it. I'm sure I'm just not understanding effectively.

So, with validation not something the db abstraction layer can effectively provide, is there more that needs to be done to escape the query than just mysql_real_escape_string (at the db abstraction layer only)?

I suspect from #7 that there is. Perhaps if we detect a % sign (or another wildcard for LIKE), we run addcslashes?

Thanks for the informative post!
User avatar
Mordred
DevNet Resident
Posts: 1579
Joined: Sun Sep 03, 2006 5:19 am
Location: Sofia, Bulgaria

Post by Mordred »

The Phoenix wrote:
Mordred wrote:Nope, mysql_real_escape_string() is not enough, here's why (with plenty of examples):
http://www.webappsec.org/projects/articles/091007.shtml

Generic filtering of input is good, but it works on another level, the DB-aware layers should concentrate on proper syntax and escaping.
Thats an excellent article about the full lifecycle of validation and escaping, but I'm struggling with a few issues.

Starting from #4 down, #4 highlights an issue with validation - not escaping. Further, even you state:
The reader must realise that input validation (in our case making sure that what we expect to be an int is really int) and escaping the parameter before giving it to the query are two different security steps. In this particular case either one will suffice, but in general, for in-depth security, you must always do both. Also, the two tasks will most probably be carried by two different subsystems in your real-life code, so the validating and escaping code will not be adjacent as displayed in this simplified case.
I don't think a database abstraction layer has sufficient awareness to be able to determine whether an item is intended to be a string or an int, and as a result, can't reliably perform the validation. That has to be performed in a different subsystem, correct?
In example 4 the issue is twofold, that's why it can be solved in two different ways. Either the application layer forces the value to be integer (i.e. we rely on validation), and we leave the field unquoted (which is a valid SQL syntax), OR (much better) we leave this to the database layer, by adding quotes (which is also valid SQL syntax, although maybe only for some DB systems, MySQL for sure) and doing escaping. The best ( I called it robust in the article, because it will continue to work even after source modifications) makes use of both techniques at the same time.

So yes, the validation is done on a different level (to emphasize this, I put an elipsis between the validation and escaping). As a side note, though, in the cases where we're talking about ints, the database layer is aware enough of the type of data it needs - after all the column type in the table is known.
The Phoenix wrote: #5 is a variation on the same issue, albeit with a much more harsh bite. Nevertheless, it still puts the requirement on validation, which the db abstraction layer isn't sufficiently informed to do.
On the contrary, see the above comment. The database layer knows where the variable will be put in the query, and it knows that it wants an integer. In a sense, casting to int is equivalent to escaping - all invalid characters (non-digits in our case) are "escaped" by being removed.
The Phoenix wrote: #6 is validation prevented as well, this time with hex instead of ints.
Oh, #6 means chapter 6 ... I thought is it Example 6. Ok.

No, it is not a validation issue, frankly it is the developer being an idiot ;) Alas, I've seen it happen in the wild both with column names and sort directions... Anyway, as in the case of ints above, this is not validation. The systems that handle input shouldn't know such intimate details about the database structure, it is the DB-layer's job to ensure that what it is being given is not invalid names or sorting keywords. Ideally, this shouldn't happen at all. I personally implement ordering instructions by encoding them with a number - $n/2 gives the column number, $n%2 gives ASC/DESC. In that way only valid column names and order keywords go in the dynamic query.

Also the hex encoding is just a way to use the vulnerability without quotes, it goes around the escaping mechanism, not the validation mechanism. An attacker could also use MySQL string functions (CHAR/CONCAT) to construct string values,

Again, the fault is not in the lack of validation! I very much dislike the idea that secure code need validation, but that's a topic for another rant.
The Phoenix wrote: #8 is covered, because the db abstraction layer in question prevents multiple queries in a single call.

So that really leaves #7 as an issue that the db abstraction layer can/should effectively work to prevent, is that correct?

#7 is easy for me to understand, but I'm stumbling on understanding the proper prevention. I guess I'm lost because of the line:
Test 3 demonstrates that in our case we don't need to escape the LIKE escape character, as mysql_real_escape_string() already escaped it.
Even though above that it showed mysql_real_escape_string did NOT escape it. I'm sure I'm just not understanding effectively.

So, with validation not something the db abstraction layer can effectively provide, is there more that needs to be done to escape the query than just mysql_real_escape_string (at the db abstraction layer only)?

I suspect from #7 that there is. Perhaps if we detect a % sign (or another wildcard for LIKE), we run addcslashes?

Thanks for the informative post!
I admit that the wording of that paragraph could be clearer. It describes the highly theoretical scenario of using another escape character for the LIKE statement. The "LIKE escape character" is backslash (\), and mysql_real_escape_string() does escape it. The reason I mention it, is because if we choose another escape character, lets say '*', and we use addcslashes, the attacker could give us a query containing the new escape character:

Code: Select all

$search = '*'; //attack
$search = mysql_real_escape_string($search);
$search = addcslashes($search, "%_");
RunQuery("SELECT * FROM `table` WHERE `column` LIKE '$search*%' ESCAPE '*' "); //we want to search for values ending in '%'.
What would happen is that the injected '*' will escape the constant '*', and the '%' sign will lose its escaping, and will start to act as a wildcard. As I said this is highly theoretical (haven't seen it done in the real world) and it is much milder than being able to inject wildcards in the beginning of a LIKE parameter.


Edit: To reemphasize my answer to that question:
So, with validation not something the db abstraction layer can effectively provide, is there more that needs to be done to escape the query than just mysql_real_escape_string (at the db abstraction layer only)?
All prevention steps are done entirely in the database layer. It knows all information it needs to do so. No fancy-shmancy validation is needed to have secure database code.
User avatar
junix
Forum Newbie
Posts: 5
Joined: Tue Sep 25, 2007 2:10 am
Location: MSU, Philippines
Contact:

just keep the foe...

Post by junix »

just learning here !
User avatar
Josh1billion
Forum Contributor
Posts: 316
Joined: Tue Sep 11, 2007 3:25 pm

Post by Josh1billion »

Very good read, Mondred, it helped me out. :)
User avatar
arpowers
Forum Commoner
Posts: 76
Joined: Sun Oct 14, 2007 10:05 pm
Location: san diego, ca

Post by arpowers »

my mysql_prep function...

please let me know if I need to improve it:P

Code: Select all

function mysql_prep( $value ) {
		$magic_quotes_active = get_magic_quotes_gpc();
		$new_enough_php = function_exists( "mysql_real_escape_string" ); // i.e. PHP >= v4.3.0
		if( $new_enough_php ) { // PHP v4.3.0 or higher
			// undo any magic quote effects so mysql_real_escape_string can do the work
			if( $magic_quotes_active ) { $value = stripslashes( $value ); }
			$value = mysql_real_escape_string( $value );
		} else { // before PHP v4.3.0
			// if magic quotes aren't already on then add slashes manually
			if( !$magic_quotes_active ) { $value = addslashes( $value ); }
			// if magic quotes are active, then the slashes already exist
		}
		return $value;
	}
User avatar
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto
Contact:

Post by John Cartwright »

Code: Select all

function mysql_prep( $value ) {
      if (get_magic_quotes_gpc()) {
         $value = stripslashes($value);
      }

      //post php 4.3
      if (function_exists('mysql_real_escape_string')) {
         return mysql_real_escape_string($value);
      }
      //pre php 4.3
      else {
         return mysql_escape_string($value);
      }
   } 
}
You can change the addslashes() to mysql_escape_string(), generally you want to avoid addslash()'ing data into your db.
Post Reply