Page 1 of 2

Your Thoughts on an Anti-Injection SQL Generating Script.

Posted: Thu Feb 09, 2006 5:12 pm
by daedalus__
Hello.

I've been developing in PHP for around a month now and realised that it would be good to have a script on-hand that can handle all my SQL queries. I started this script around 3 days ago as a means for myself to have injection-free SQL without embedding functions and SQL in my page(s).

I've seen other people write functions similar to this and am dissatisfied with the results. After working on this for a few hours I decided that it would not be a huge effort to create a script that anyone familiar with SQL syntax can use on their website.

My questions are; has this kind of thing been done before, like this? Would anyone be interested in using such a script? What kind of features would you like to see?

The current version is 0.3.1 and I'm busy building support for INSERT, UPDATE, and DELETE.

This was version 0.1.0:

Code: Select all

class dal
{
	public $shoes;
	private $dbhost;
	private $dbuser;
	private $dbpass;

	public function execSQL($fields, $from, $join, $field, $on, $refs, $values)
	{
		// Let's build a SQL string!
		$this->buildSQL($fields, $from, $join, $field, $on, $refs, $values);
		// Connect to MySQL
		mysql_connect($this->dbhost, $this->dbuser, $this->dbpass) or die('<b>Page Error: </b>Could not connect to the mysql server. '.mysql_error());
		mysql_select_db('igal') or die('<b>Page Error: </b>Could not select Database. '.mysql_error());
		// result and return variables
		$result	= mysql_query($this->sql) or die('<b>Page Error: </b>Could not query the database. '.mysql_error().'<br />'.$this->sql);
		$row		= mysql_fetch_array($result, MYSQL_NUM);
		// Free the result and return the value
		mysql_free_result($result);
		return $row;
	}

	// This function builds the sql string and then passes it to the class
	private function buildSQL($fields, $from, $join, $field, $on, $refs, $values)
	{
		// This is all very self-explanatory
		$this->innerJOIN($from, $join, $field, $on);
		$this->getSELECT($fields, $from);
		$this->getWHERE($refs, $values);
		$this->sql	=	$this->select.$this->inner.$this->where;
	}

	// This function builds the SELECT part of the sql string
	private function getSELECT($fields, $from) # array $fields
	{
		// Let's make sure there are no duplicate keys in our array
		$fields		= array_unique($fields);
		// Now let's turn it into a comma-delimited list
		$fields		= implode(", ", $fields);
		// Build the SELECT statement
		$this->select	= 'SELECT '.$fields.' FROM ';
		if ($this->shoes == 'dull') { $this->select .= $from.' '; }
	}

	// This function provides the object with the WHERE part of the sql string
	public function getWHERE($refs, $values) # array $refs, array $values
	{
		// Does the user want a WHERE statement?
		if (empty($refs) && empty($values)) { return; }
		$this->where	=	' WHERE ';
		
		// Firstly and most importantly we are going to hand off the $values array
		// to another function, which will parse the values and then return them
		// to the getWHERE function, this is all to prevent SQL injection attacks
		$values	=	$this->SQLInjectionSucks($values);	

		// Second let's make sure the number of keys in each of our arrays match
		$refCount		=	count($refs);
		$valueCount		=	count($values);
		
		if ($refCount != $valueCount) { die('$refs and $values arrays have different numbers of keys'); } else { $count = $refCount; unset($refCount); unset($valueCount); }
		
		// Now let's add parentheses to our WHERE statement
		$this->where .= '(';
		for ($i = 0; $i < $count; $i++)
		{
			$this->where .= '(('.$refs[$i].')='.$values[$i].')';
			if ($i != $count-1) { $this->where .= ' AND '; }
		}
		$this->where .= ')';
	}

	// This function, which is by far my favorite, provides the object
	// with all the information needed to perform any INNER JOIN's
	private function innerJOIN($from, $join, $field, $on) # string $from, array $join, array $field, array $on
	{
		// We should make sure that the user wants to do an INNER JOIN
		if (empty($join) && empty($field) && empty($on)) { return; }
		// Let's make sure there are no duplicate keys in our arrays
		$join			=	array_unique($join);
		$field		=	array_unique($field);
		$on			=	array_unique($on);
		// Now we should count the number of keys in each array and make sure they match
		$joinCount	=	count($join);
		$fieldCount =	count($field);
		$onCount		=	count($on);
		
		// If they don't match we should probably alert the user
		if ($joinCount != $fieldCount && $fieldCount != $onCount) { die('$join, $field, $on arrays have different numbers of keys, SQL cannot execute.'); } else { $count = $joinCount; unset($joinCount); unset($fieldCount); unset($onCount); }
		
		// This says 'add a parentheses for every join but the last one'
		for ($i = 0; $i < $count-1; $i++) { $this->inner .= '('; }
		// Now we add the table for the FROM statement
		$this->inner	.=	$from.' ';
		// Finally we add the inner join to the string
		for ($i = 0; $i < $count; $i++)
		{
			$this->inner	.=	'INNER JOIN '.$join[$i].' ON '.$field[$i].' = '.$on[$i];
			if ($i != $count-1) { $this->inner	.=	') '; }
		}
		$this->inner	.=	' ';
		$this->shoes	=	'shiny';
	}

	// This function is small, but very important. it will take an array of values
	// then parse them, removing any bad SQL that could be an injection attack
	function SQLInjectionSucks($values)
	{
		$search	= array('/INSERT/', '/UPDATE/', '/DELETE/', '/FROM/', '/VALUES/', '/\*/', '/ /', '/\;/', '/\'/', '/\"/', '/--/');
		$replace	= '';
		return preg_replace($search, $replace, $values);
	}

	// I'm still not quite sure what a construct is, but the class needs one
	// in order for it to hold all the variables it needs to build the sql stirng
	function __construct() {
			$this->shoes	= 'dull';
			$this->dbhost	= cSQLHost;
			$this->dbuser	= cSQLUser;
			$this->dbpass	= cSQLPass;

			$this->inner	=	'';
			$this->select	=	'';
			$this->where	=	'';
			$this->sql		=	'';
	}

}
Sorry for the mess. It's cleaner now than in version 0.1.0.

I would really like input from other developers on this script before I continue it. I'm relatively new to programming and constantly wonder if I'm doing a decent job.

Thanks for your time!

Posted: Thu Feb 09, 2006 5:32 pm
by josh
What's wrong with mysql_real_escape_string() ??

If my last name is O'Reilly I can't be in your database?

Posted: Thu Feb 09, 2006 5:39 pm
by daedalus__
I'm addressing that issue.

mysql_real_escape_string and addslashes will not work for everyone.
The only way to be 100% sure that injection attacks do not work on your script is with a whitelist.

Beyond those statements, I am not going to address that question anymore. There are many answers to it, some right here on this forum.

By the way, that picture rools.

Posted: Thu Feb 09, 2006 5:42 pm
by feyd
Something you missed.. "insert" passes. :)

Posted: Thu Feb 09, 2006 5:45 pm
by josh
Daedalus- wrote:The only way to be 100% sure that injection attacks do not work on your script is with a whitelist.
Simply untrue, a whitelist approach is recommended where possible, for instance if you are accepting the user's age in years check that $age == (int)$age before performing the insert, your script does not do this for me, it simply removes content I might need to store. What happens when I store binary data and your script replaces something instead of escaping it and my files get corrupted?

Also you did not address my question of why you are not escaping the data instead of removing it altogether, in what cases would mysql_real_escape_string not be available? (besides PHP4 < 4.3.0, in which case an upgrade would solve that). Also addslashes() should not be used to escape mysql queries


~~~~~

Also, you're not quoting your values?

Code: Select all

$this->where .= '(('.$refs[$i].')='.$values[$i].')';
What is the point of those parenthesis there?

Posted: Thu Feb 09, 2006 5:46 pm
by daedalus__
feyd wrote:Something you missed.. "insert" passes. :)
what?

I'd like to mention something about the code above. That is the first version of the script. It is designed only to handle SELECT statements. I decided to expand it and a better version is being worked on at this moment. I only wish to have some input from the community.

Posted: Thu Feb 09, 2006 5:48 pm
by feyd
any form of mixed case or lower case passes right by your "whitelist."

Posted: Thu Feb 09, 2006 5:51 pm
by josh
feyd wrote:your "whitelist."
Yup, you've actually designed a blacklist, you need to make it case-insensitive if you want it to work in the way you want it to, something like str_ireplace()

Posted: Thu Feb 09, 2006 8:25 pm
by daedalus__
The code was posted to give you an idea of how the script will work. It was not posted for you to pick apart. All these problems that you two have picked out of my code, which is still in alpha, are being addressed.

If you didn't notice, the code came with the questions. The reason I posted was because I wanted those questions answered. Address them or do not post in the thread.

feyd:

I'm aware that I designed a blacklist. That's why that code is in the trash.

jshpro2:

There are many reasons this kind of a script can be useful, beyond preventing SQL injection. That is only one of the features. I called the thread 'Your Thoughts on an Anti-Injection SQL Generating Script' because that is what I wanted from users who reply to it, also because that is my summary of what this script does.

both:

Try to keep in mind that it is more a concept than a script at this point.

The idea is to have a script that will generate SQL that is secure. It will be easy enough for developers with a very basic knowledge of SQL but advanced enough to fill most developers needs.

My questions are; has this kind of thing been done before? Would anyone be interested in using such a script? What kind of features would you like to see?

Posted: Thu Feb 09, 2006 9:02 pm
by neophyte
The idea of a whitelist is that you know exactly what type "should be coming". So if you know an incoming variable is an integer use intval(), or if you know it's a string it should be escaped (mysql_real_escape_string()). If I was writing the class I'd be thinking about types of data. I might start with a request array of some kind and then assign some types to the /post/get/ variable.

Posted: Thu Feb 09, 2006 9:07 pm
by Roja
Daedalus- wrote:The code was posted to give you an idea of how the script will work. It was not posted for you to pick apart. All these problems that you two have picked out of my code, which is still in alpha, are being addressed.
You specifically stated "I would really like input from other developers on this script before I continue it".

There were no conditions, limitations, or other commentary restricting that feedback.

Developers, especially when asked to find a secure solution, start by breaking down the code presented. If you take that as criticism, that isn't entirely wrong, but you miss the value - it is constructive criticism. By learning what your script doesn't do, and what it isn't, you learn how to make it more secure, and do what you want.
Daedalus- wrote:If you didn't notice, the code came with the questions. The reason I posted was because I wanted those questions answered. Address them or do not post in the thread.
Your tone isn't appropriate. These forums are an open discussion, and by posting here, asking for people's feedback, you need to be friendly and understanding that other people may have a different opinion than yours. *Asking* people to answer is one thing, *demanding* that people do nothing else is a quick method to getting no responses, and potentially, worse.
Daedalus- wrote:I'm aware that I designed a blacklist. That's why that code is in the trash.
So share the new code so that we can help you improve. Thats what we try to do - help people improve their code and development ideas. In some cases, we even try to help solve problems.
Daedalus- wrote:Try to keep in mind that it is more a concept than a script at this point.
Thats a helpful clarification. You never made that statement before, or hinted at it. By including the script in the original post, it made the opposite statement - that you had a script, and needed feedback on the script.

I think if anyone chooses to respond further, the answers will be different in nature now.
Daedalus- wrote:The idea is to have a script that will generate SQL that is secure. It will be easy enough for developers with a very basic knowledge of SQL but advanced enough to fill most developers needs.
The problem is that your criteria is too loose. You specifically mention in a previous post that you feel that "mysql_real_escape_string and addslashes will not work for everyone", but then specifically say that you will not address that question further.

Without understanding the logic and reasoning behind your design criteria (Which right now are best summarized as "Secure beyond mysql_real_escape_string and addslashes), we cannot reasonably be expected to discuss what could fit your non-existant criteria.

Please, educate us so we can help you further. Explain what your criteria are, and what specific cases you are trying to solve beyond mysql_real_escape_string and addslashes. Then we can discuss design ideas.
Daedalus- wrote:My questions are; has this kind of thing been done before?
Of course. One done by a valued contributor to several opensource projects, including Smarty: http://www.phpinsider.com/php/code/SafeSQL/
Daedalus- wrote:Would anyone be interested in using such a script?
To be honest, you haven't explained what your script does in specific enough detail. As a general statement, I would trust a script written with clear purpose, by an established coder, and widely distributed more. Thats not to say you can't fit all three in time, I'm just clarifying *my* criteria for interest, since you asked.
Daedalus- wrote:What kind of features would you like to see?
Honestly, I use adodb for all db access, and with bind variables, there is little risk left for an attacker to exploit. Combine that with decent controls around what input you accept (filtered), and I fail to see much that I need to address with an additional script. But by all means - educate me.

Posted: Thu Feb 09, 2006 9:22 pm
by feyd
Your questions were buried inside your post. They have been ignored by me, because I didn't see them. I barely saw them when you mentioned them again. Your choice of titles was, I guess a bit misleading.

So here are the answers:
  1. Yes, there are many database abstraction layers around. ADOdb is probably the most famous for PHP.
  2. I don't see the point of using your script over the something as feature rich and heavily used ADOdb. Maybe you could shed some light.
  3. I don't particularly like the entire query being generated for me, as it can be mildly unpredictable. So I can't really think of anything at the moment.

Posted: Thu Feb 09, 2006 10:58 pm
by nincha
got a headache reading this thread. :x

Posted: Fri Feb 10, 2006 1:09 am
by Benjamin
OMG I have never seen code that well written in my life!!!!!

Somewhere there is a god, and somewhere else, there is someone who thinks they are a god. I am not going to comment on this post. I have already addressed that issue.

Posted: Fri Feb 10, 2006 4:04 am
by Benjamin
feyd | removed. wasteful post.