$_POST handling and other security

Discussions of secure PHP coding. Security in software is important, so don't be afraid to ask. And when answering: be anal. Nitpick. No security vulnerability is too small.

Moderator: General Moderators

Post Reply
User avatar
Stryks
Forum Regular
Posts: 746
Joined: Wed Jan 14, 2004 5:06 pm

$_POST handling and other security

Post by Stryks »

I've been working on a site for some time now. Kind of a pet project in amongst my other work. I started the site with a sketched in security system and built on from there. After a few revisions and translations into a series of classes (session / database / user) I now have a fairly modular system which serve my purposes quite well.

But now that the site is nearing completion, I've gone back and had a bit of a security audit. I read up all I could find on security and tried to find any holes in my system, and yeah ... I found many. :oops:

So there are a few steps I want to implement to try and overcome it's flaws.

* Have all forms use the post method and check all inputs from $_POST
* Insert a random single use value into every form view and compare to session stored value on submit.
* Regenerate session id on every page view

But what is going to require the most adjustment is the handling of collected post data. At the moment, I collect it, do some cursory checks (is there a value here basically) and then hand it off to the database class to be stored. The database class does do a bit of scrubbing for me (stripslashes / htmlentities / mysql_real_escape_string). At the time I thought this was enough, but then what about the unsafe info before it is written. It would be a simple oversight to display a field value before saving it, especially if a form failed validation and the form was redisplayed with the original values re-displayed as part of an error message or the like. Not to mention forms where the info isn't saved such as search boxes or even login boxes (though the database class does clean the query, I already have a "you searhed for: X" dialog sourced from the $_POST value).

So, I have written a class to handle my form validation ( as in making sure the fields contain expected information ) which is kind of cumbersome to set up, but does what I need without making the code too ugly.

Code: Select all

// Start form validation
	require_once(PATH_CLASS . "class_validform.php");
	$vform = &new validform("form_login");	
	// Configure form validation
	// R = Required : M = AlphaNumeric (Mixed Alpha & Numeric) : ># = Longer(text) / Larger(number) than #
	$vform->add_element("username", "RA");
	$vform->add_element("password", "RA>7");
	// Validate
	$vform->run_test();
I'm just wondering whether this class would be best suited to securing the input as well as validating, and if so, what should be done to the values to make them safe? I'm not expecting any html or markup of any kind, so I dont need to preserve tags. Recommendations?

Also, if I really drill down the form input, what security is required then for database writes? Should I get rid of everything except the mysql_real_escape_string call, or is even that overkill?

Given the above implementations, is there anything I'm overlooking that could do with plugging? (apart from $_GET variables which I'm going to have to have a think about)

Thanks for taking the time to read and for any advice you can offer. :)
User avatar
Ollie Saunders
DevNet Master
Posts: 3179
Joined: Tue May 24, 2005 6:01 pm
Location: UK

Post by Ollie Saunders »

* Have all forms use the post method and check all inputs from $_POST
Good
* Insert a random single use value into every form view and compare to session stored value on submit.
This is to prevent CSRF I guess. Specify a time limit as well for extra security.
* Regenerate session id on every page view
That could interfere with users. Regenerate the session id only when there is a slackening change in permissions. For instance where an admin user moves from the standard user area to the admin area.
The database class does do a bit of scrubbing for me (stripslashes / htmlentities / mysql_real_escape_string)
Disable magic_quotes and you won't need to stripslashes.
form was redisplayed with the original values re-displayed as part of an error message or the like.
I already have a "you searhed for: X" dialog sourced from the $_POST value
These won't be a problem as long as the output is html escaped.
I'm just wondering whether this class would be best suited to securing the input as well as validating, and if so, what should be done to the values to make them safe? I'm not expecting any html or markup of any kind, so I dont need to preserve tags. Recommendations?
Its simple really you html escape everything you output and you mysql escape everything that goes into the database. Do not store html escaped strings in the database as this limits what you can do with them an also lurs you into a false sense of security believing that everything that is in the database is html escaped. I avoid striptags as this limits the user who just happens to want to use < or > in their posts. Imagine if PHPBB did that?!
mysql_real_escape_string call, or is even that overkill?
No, it is essential.
User avatar
Stryks
Forum Regular
Posts: 746
Joined: Wed Jan 14, 2004 5:06 pm

Post by Stryks »

Thanks for the response.

I am going to have a lot more questions on this before I grasp it, I can just feel it.

Firstly, why the timeout check on the embedded form variable? If it is generated when the form is created and either reset (when the form is revisited) or destroyed (when the form is processed) then how will a timestamp add security? I mean, surely the only vulnerability would be if someone had access to the session id and the forms embedded value and either used it before the form was processed or the user was logged out (15 minutes).

Or am I getting confused already. An example of abuse would be great.

Also, while I'm aware I'm double handling a bit here, for the sake of saving me from doing something without thinking, wouldn't it be best to pass all form input through htmlentities and then, where I specifically want html to be allowed, I display it with html_entity_decode? I guess I'm more comfortable with that as there is no place in my site where users would benefit from marking up their input.

More sure to come, but all input is appreicated.

Cheers
User avatar
Stryks
Forum Regular
Posts: 746
Joined: Wed Jan 14, 2004 5:06 pm

Post by Stryks »

Ok .... So I have modified the class to handle both $_GET and $_POST data, so that when the class is loaded, it loops through all elements of each of the arrays and converts values to the result of htmlentities($original_value, ENT_QUOTES, 'utf-8'). As I mentioned earlier, for some this may be the wrong approach as most of the data presented may be designed to contain html, meaning that you would have to convert it back and then run it through an HTML validation routine (like html purifier). My input is not going to allow html so I feel justified in taking this approach.

While writing this I realised it would be a simple thing to just check the input for tags and run it through purifier if required ... something to consider later anyhow.

So, my POST and GET values, while not safe, are at least a little safer. Safe enough to display back without any additional checking though. The class then tests defined values against a range of expected patterns and returns an error code on failure. The data when written to the database is passed with mysql_real_escape_string(trim($value)), so one would hope that is enough to stop any injection attempts.

So that covers input validation and database storage.

My login routine now regenerates the session id when the status changes from guest to user, from user to admin, and from admin back to user. When a user is logged out their session is destroyed and a new session is created on the next page load. Guest permissions are assigned to new sessions, so someone starting a session fixation attack could only get guest access, as the session id will be regenerated when the potential victim logs in. I also fingerprint the user with md5($_SERVER['HTTP_USER_AGENT']) and force a re-login if it changes between requests. It's a small measure which can be fooled, but it can only help.

The class now also generates the random key and timout value for embedding in forms to discourage CSRF attacks.

Apart from that, my brain is fried and I can think of nothing else that isnt going to majorly detract from the users experience of the site.

Can anyone think of anything else I should be covering, or offer any suggestions about how to improve the processes described.

I can post the class if it will help with locking it down, or help anyone else for that matter.

Cheers
User avatar
Maugrim_The_Reaper
DevNet Master
Posts: 2704
Joined: Tue Nov 02, 2004 5:43 am
Location: Ireland

Post by Maugrim_The_Reaper »

Firstly, why the timeout check on the embedded form variable? If it is generated when the form is created and either reset (when the form is revisited) or destroyed (when the form is processed) then how will a timestamp add security? I mean, surely the only vulnerability would be if someone had access to the session id and the forms embedded value and either used it before the form was processed or the user was logged out (15 minutes).
Timeouts aren't usually required - once you are certain the form can only be submitted once (i.e. once submitted delete the session stored id). Timeouts can be useful where there's a perceived risk in leaving a submission outstanding for too long.
Apart from that, my brain is fried and I can think of nothing else that isnt going to majorly detract from the users experience of the site.
If you're using a specific character set, e.g. UTF-8, set the form attribute accept-charset. It's a little added incentive to tell browsers to submit data in the expected character encoding. A few vulnerabilities exist where filtering logic expects one encoding, but get's another, and the filtering logic fails to verify the encoding properly. Such encoding mis-matches can let things slip through your gates...
Also, while I'm aware I'm double handling a bit here, for the sake of saving me from doing something without thinking, wouldn't it be best to pass all form input through htmlentities and then, where I specifically want html to be allowed, I display it with html_entity_decode? I guess I'm more comfortable with that as there is no place in my site where users would benefit from marking up their input.
Okay this is where you need to become aware of "context". Where data is potentially dangerous in a specific form. Take input coming in - right at the beginning. At this time we must (it's not optional ;)) filter it, i.e. check the data is exactly what we expect it to be. If it's not what we expect it to be then it should immediately be considered suspicious and NEVER get anywhere near the database, our application internals, or output as HTML/Other.

There are lots of stuff you might want to do with input once it's considered safe - escaping is the common one. It's a common thought to automatically escape right there and then. But I disagree with the practice. I would delay any form of escaping until the data enters a new context - i.e. you are going to write it to a database or output it as HTML. At those two points implement a feature to handle escaping. For example, MySQL supports Prepared Statements which can escape SQL automatically without calling mysql_real_escape_string a hundred times in an application. A template system like Smarty offers an escape() method for HTML.

One important point here on context. htmlentities() escaped input can't be used outside the HTML output context (with a pinch of salt since html output can go anywhere - into a database, a cache, the browser). To use these values in SQL statements you'd need to unescape them (unless you intend storing escaped html). Same might go for writing XML, outputting to a file for caching, etc. So at the very least offer a method for retrieving unescaped data (which is safe) for those other uses.

On the quoted section - hopefully this explains why default htmlentities() isn't going to work very well. Other contexts will need the original filtered data without the escaping. In most cases, good filtering will already eliminate any values containing unwanted HTML but feel free to bring that up again.
Post Reply