addslashes
Posted: Sat Oct 20, 2007 6:11 pm
by alex.barylski
Most of us are aware by now that this simply doesn't cut the mustard when submitting to a database such as MySQL, so instead we use the mysql_* escape functions...
What about when posting GET data back into HTML however? Like say propagating values across page requests in a link? Is just simple escaping enough to prevent trivial XSS exploits?
Should we validate that data using regex as well? Or is addslashes enough?
Posted: Sat Oct 20, 2007 7:17 pm
by Kieran Huggins
If you're just re-populating input boxes (like a failed record creation) addslashes should be fine. If you're representing data that _has_ been entered into the database, just return the result that you stored.
Posted: Sun Oct 21, 2007 3:35 pm
by Mordred
Depending on the HTML context you put the data in, addslashes() may be far from adequate. At the very least it would render quote characters incorrectly. The correct function to use would be htmlentities() and you still must pay attention to the HTML context. In any case, Kieran is not right, do not "just return the result that you stored", this is how persistent XSS happens.
In fact, I cannot think of ONE useful usage of addslashes, can anyone?
Posted: Sun Oct 21, 2007 4:14 pm
by John Cartwright
Mordred wrote:Depending on the HTML context you put the data in, addslashes() may be far from adequate. At the very least it would render quote characters incorrectly. The correct function to use would be htmlentities() and you still must pay attention to the HTML context. In any case, Kieran is not right, do not "just return the result that you stored", this is how persistent XSS happens.
In fact, I cannot think of ONE useful usage of addslashes, can anyone?
Right on. I personally cannot remember an instance when I used addslashes().
Posted: Sun Oct 21, 2007 7:15 pm
by Kieran Huggins
Mordred is certainly correct that you should always do input cleaning on ALL data either going into the database or for general inclusion in a web page (preview, etc...). Since I make sure that only clean, safe data makes it into the database, I don't worry about it on the way out. This policy is backed by Cal Henderson (of Flickr fame) and seems to be fairly wise as far as I can tell. Most database entries tend to be write once / read many, so this will increase performance while helping to keep you safe when writing your views.
The only exception (that I was trying, albeit poorly, to express) is when a form entry fails to validate for some reason. For instance, if a username field contains unwanted characters and needs to be changed. I can't think of how re-populating the inputs from the $_GET array would be dangerous after addslashes() is used to escape the value attribute quotes.
I could always be wrong, and welcome any case examples of a potential security issue!
Posted: Sun Oct 21, 2007 9:11 pm
by Mordred
Kieran Huggins wrote:The only exception (that I was trying, albeit poorly, to express) is when a form entry fails to validate for some reason. For instance, if a username field contains unwanted characters and needs to be changed. I can't think of how re-populating the inputs from the $_GET array would be dangerous after addslashes() is used to escape the value attribute quotes.
It's wrong. Try to enter O'R"ly and tell me what happens
Kieran Huggins wrote:Since I make sure that only clean, safe data makes it into the database, I don't worry about it on the way out. This policy is backed by Cal Henderson (of Flickr fame) and seems to be fairly wise as far as I can tell. Most database entries tend to be write once / read many, so this will increase performance while helping to keep you safe when writing your views.
I could always be wrong, and welcome any case examples of a potential security issue!
It's good if you can absolutely, positively, 100% guarantee that it's write once / read many to one destination. It's only rarely so in the apps I've done and seen. Look up second order SQL injection for example.
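A minimal sketch of the O'R"ly test, assuming the value is echoed back into a double-quoted value attribute (the field name "username" is only an illustration). HTML does not treat a backslash as an escape character, so the slashed version still ends the attribute at the first double quote, while the entity-encoded version stays inside it:

```php
<?php
// The value suggested above for the form field.
$input = "O'R\"ly";

// addslashes() targets SQL-style string literals; HTML ignores the backslashes.
$slashed = addslashes($input);
$broken  = '<input type="text" name="username" value="' . $slashed . '">';

// htmlspecialchars() with ENT_QUOTES turns both quote characters into entities.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
$safe    = '<input type="text" name="username" value="' . $escaped . '">';

echo $broken, "\n"; // the browser sees the value end right after O\'R\
echo $safe, "\n";   // the browser shows O'R"ly in the field
```

In the first tag the leftover ly" is parsed as a stray attribute, which is exactly the opening an attacker needs to add one of their own.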
Posted: Sun Oct 21, 2007 9:19 pm
by shiflett
There is some dangerous information being presented here. Mordred's advice is sound, but I'd like to elaborate, just for clarification.
First, addslashes() is a function intended to escape data being used in SQL. Even for this use, it is inadequate. (See my post on addslashes() for more information.) Using addslashes() to escape for any other context is wrong.
To escape for HTML, use htmlentities() or htmlspecialchars(), and be sure to match the character encoding you use in the Content-Type header, else bad things can happen. Under no circumstances should you ever fail to escape a value before using it in HTML.
XSS is not an input filtering problem.
Cal's advice about preparing data for display prior to storing it in a database is for performance. If you can avoid escaping whenever you read from the database, and only do so before writing, you save yourself a lot of work. There are two things to keep in mind:
1. When you take this approach, escape for HTML first, then escape for SQL. This way, the data you read from the database has already been escaped for HTML. Failing to do this results in XSS vulnerabilities.
2. If you take this approach, the security of your application is tied to that of your database. (A vulnerability in the database can compromise your application.) This isn't an enormous risk, but it is a risk, and we should always know where we're placing our trust.
I hope this helps.
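One way to keep the escaping charset and the Content-Type charset from drifting apart is to define the charset once and use it in both places. This is only a sketch: the constant PAGE_CHARSET and the helper e() are hypothetical names, and UTF-8 is an assumption about the page.

```php
<?php
// Assumption: the page is served as UTF-8. PAGE_CHARSET and e() are
// hypothetical names used only for this illustration.
define('PAGE_CHARSET', 'UTF-8');

// Announce the same charset that the escaping helper below will use.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=' . PAGE_CHARSET);
}

/** Escape a value for HTML text or a quoted attribute value. */
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, PAGE_CHARSET);
}

$comment = '<script>alert("xss")</script>';
echo '<p>', e($comment), "</p>\n";
```

Every value that reaches the page goes through e(), whether it came from the request or from the database.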
Posted: Mon Oct 22, 2007 12:02 am
by alex.barylski
So instead of using the mysql_* escape functions and/or input filtering, could I not simply call htmlentities() on every bit of data, assuming none of it is required to be sent to the screen as HTML later?
My particular problem right now is directly echoing $_GET data back into URLs to propagate things such as page indexes, etc...
Until now I have filtered that data using strict principle of least privilege regex. It would be a lot easier if I could just htmlentities that data.
Likewise, if all I had to do was htmlentities the data going to the database, that would save me a lot of effort as well. For most applications I could get by with supporting a bbcode style of markup and just converting bbcode to HTML at display time, all the while storing the harmful HTML characters as HTML entities instead.
I never even considered using htmlentities...until now...is it a solid way of securing data?
Posted: Mon Oct 22, 2007 12:54 am
by Kieran Huggins
Mordred is indeed correct. Nice!
@Hockey: htmlentities() won't protect you from SQL injection (as far as I know).
Posted: Mon Oct 22, 2007 12:35 pm
by Mordred
@shiflett: thanks for the elaboration, it's hard to be detailed when typing with one hand and a baby in the other

I must add two more things one should be extra careful with:
htmlentities()/htmlspecialchars() must specify not only the correct encoding, but the correct quote style. I find it extremely silly that the default quote flag, ENT_COMPAT, will translate the double quotes and not the single ones. For PHP developers it's extremely easy to do:
$sUserName = htmlentities($_GET['username']); //ATTN: insecure!
echo "<img src='avatars/{$sUserName}.jpg'>"; //silly example but you get the idea -- it's easiest to use single quotes.
At least when specifying an encoding, the quote parameter is mandatory, so one must take care to use ENT_QUOTES.
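To make the difference concrete, here is a small sketch with the flags passed explicitly. The payload is a hypothetical attacker-supplied username. Passing the flag explicitly also matters because the default is version-dependent: PHP 8.1 changed it from ENT_COMPAT to ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401.

```php
<?php
// Hypothetical attacker-supplied value aimed at a single-quoted attribute.
$payload = "x' onerror='alert(1)";

// ENT_COMPAT encodes double quotes only, so the single quotes survive.
$compat = htmlentities($payload, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES encodes both kinds of quote.
$quotes = htmlentities($payload, ENT_QUOTES, 'UTF-8');

echo "<img src='avatars/{$compat}.jpg'>\n"; // src is closed early and an onerror attribute is injected
echo "<img src='avatars/{$quotes}.jpg'>\n"; // the payload stays inside the src value
```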
-----
To be continued...
Posted: Mon Oct 22, 2007 1:37 pm
by Christopher
Mordred wrote:At least when specifying an encoding, the quote parameter is mandatory, so one must take care to use ENT_QUOTES.
This is a great discussion for learning the finer points of this subject. Thanks.
I have always used ENT_NOQUOTES. What is the security reason to also escape the quotes in HTML?
Posted: Mon Oct 22, 2007 2:22 pm
by Mordred
... continued
(@Kieran Huggins)
The second thing is that in reality you don't have write once / read many data. Maybe the admin also has an interface for editing posts? Or the data is displayed in many contexts: in HTML, in the RSS feed and in the user-friendly mod_rewrite URLs. Take extra care when you make such design decisions, and by all means do this only for a limited portion of the data.
@Hockey:
Nope, every data usage calls for its own escaping. Be doubly cautious with data that goes several ways (db | html | url | whatever).
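For the case of a $_GET value propagated in a link, the data passes through two contexts, so it gets two escapes in order: URL encoding for the query string, then HTML escaping for the attribute. A sketch, assuming a hypothetical list.php page and hard-coded values standing in for $_GET:

```php
<?php
// Hypothetical request values; in a real page these would come from $_GET.
$params = [
    'page'  => '2&sort=<script>',
    'order' => 'name',
];

// Step 1: encode each value for the URL context (query string).
// The separator is passed explicitly so the result does not depend on php.ini.
$query = http_build_query($params, '', '&');

// Step 2: escape the finished URL for the HTML attribute it is placed in.
$href = htmlspecialchars('list.php?' . $query, ENT_QUOTES, 'UTF-8');

echo '<a href="' . $href . '">Next</a>', "\n";
```

Neither step replaces the other: URL encoding neutralizes the & and < inside the value, and the HTML escape turns the separator & into &amp;amp; for the attribute.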
@arborint:
You definitely need to take care of quotes when the user data goes into an HTML attribute. Even if the tag brackets are escaped, so the current tag cannot be terminated, an unescaped quote can terminate the current attribute and add another one, such as an ECMAScript on* event handler.
Posted: Mon Oct 22, 2007 7:52 pm
by Ambush Commander
I used to think that the " quote had to be escaped when used outside of an attribute value context, but it turns out this is not the case. Here's the rub with quote escaping:
If you're outputting text outside of attributes, ENT_NOQUOTES is sufficient and most readable. If you're outputting text inside attributes, you must escape single quotes or double quotes, depending on which delimiter is being used for the attribute. ENT_QUOTES is the safest choice: it covers both delimiters. The single-quote escape sequence is pretty ugly though.
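A quick side-by-side of the three flags on the same sample string (the string is only an illustration) shows which characters each one touches:

```php
<?php
$text = 'He said "it\'s <b>fine</b>"';

// Text outside attributes: only the tag brackets and & need encoding.
$noQuotes = htmlspecialchars($text, ENT_NOQUOTES, 'UTF-8');

// Double-quoted attribute values: also encode the double quote.
$compat = htmlspecialchars($text, ENT_COMPAT, 'UTF-8');

// Either delimiter: encode both kinds of quote.
$quotes = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

echo $noQuotes, "\n"; // He said "it's &lt;b&gt;fine&lt;/b&gt;"
echo $compat, "\n";   // He said &quot;it's &lt;b&gt;fine&lt;/b&gt;&quot;
echo $quotes, "\n";   // He said &quot;it&#039;s &lt;b&gt;fine&lt;/b&gt;&quot;
```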
Posted: Tue Oct 23, 2007 9:31 pm
by alex.barylski
Assuming all my data has been filtered, or at best cast to an integer value, is it still required to call: mysql_real_escape_string on integer data or just character data?
Posted: Tue Oct 23, 2007 9:32 pm
by Ambush Commander
This is a point of contention. Those who believe in defense in depth would escape it anyway. In theory, however, there is no direct security risk in passing integers through without escaping, provided the value really has been cast or validated as an integer.
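The argument only holds if the value is guaranteed to be an integer by the time it reaches the query. A sketch of one way to enforce that with filter_var(); the helper pageIndex(), the range limits, and the posts table are hypothetical:

```php
<?php
/**
 * Hypothetical helper: returns a validated page number, or $default
 * when the input is not an integer within the allowed range.
 */
function pageIndex($raw, int $default = 1): int
{
    $value = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1, 'max_range' => 100000],
    ]);

    return $value === false ? $default : $value;
}

// '7' stands in for $_GET['page'].
$page   = pageIndex('7');
$offset = ($page - 1) * 20;

// $offset is an int produced by arithmetic, so it carries no quotes or SQL syntax.
$sql = 'SELECT id, title FROM posts ORDER BY id LIMIT 20 OFFSET ' . $offset;
echo $sql, "\n";
```

Anything that is not a clean integer, such as '1 OR 1=1', falls back to the default instead of reaching the query.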