addslashes
Moderator: General Moderators
-
alex.barylski
- DevNet Evangelist
- Posts: 6267
- Joined: Tue Dec 21, 2004 5:00 pm
- Location: Winnipeg
addslashes
Most of us by now are aware that this simply doesn't cut the mustard when submitting to a database such as MySQL, so instead we use the mysql_* escape functions...
What about when posting GET data back into HTML, however? Say, propagating values across page requests in a link? Is simple escaping enough to prevent trivial XSS exploits?
Should we validate that data using regex as well? Or is addslashes enough?
- Kieran Huggins
- DevNet Master
- Posts: 3635
- Joined: Wed Dec 06, 2006 4:14 pm
- Location: Toronto, Canada
- Contact:
Depending on the HTML context you put the data in, addslashes() may be far from adequate. At the very least it would render quote characters incorrectly. The correct function to use would be htmlentities() and you still must pay attention to the HTML context. In any case, Kieran is not right, do not "just return the result that you stored", this is how persistent XSS happens.
In fact, I cannot think of ONE useful usage of addslashes, can anyone?
- John Cartwright
- Site Admin
- Posts: 11470
- Joined: Tue Dec 23, 2003 2:10 am
- Location: Toronto
- Contact:
Mordred wrote:Depending on the HTML context you put the data in, addslashes() may be far from adequate. At the very least it would render quote characters incorrectly. The correct function to use would be htmlentities() and you still must pay attention to the HTML context. In any case, Kieran is not right, do not "just return the result that you stored", this is how persistent XSS happens.
In fact, I cannot think of ONE useful usage of addslashes, can anyone?
Right on. I personally cannot remember an instance when I used addslashes().
- Kieran Huggins
- DevNet Master
- Posts: 3635
- Joined: Wed Dec 06, 2006 4:14 pm
- Location: Toronto, Canada
- Contact:
Mordred is certainly correct that you should always do input cleaning on ALL data either going into the database or for general inclusion in a web page (preview, etc...). Since I make sure that only clean, safe data makes it into the database, I don't worry about it on the way out. This policy is backed by Cal Henderson (of Flickr fame) and seems to be fairly wise as far as I can tell. Most database entries tend to be write once / read many, so this will increase performance while helping to keep you safe when writing your views.
The only exception (that I was trying, albeit poorly, to express) is when a form entry fails to validate for some reason. For instance, if a username field contains unwanted characters and needs to be changed. I can't think of how re-populating the inputs from the $_GET array would be dangerous after addslashes() is used to escape the value attribute quotes.
I could always be wrong, and welcome any case examples of a potential security issue!
Kieran Huggins wrote:The only exception (that I was trying, albeit poorly, to express) is when a form entry fails to validate for some reason. For instance, if a username field contains unwanted characters and needs to be changed. I can't think of how re-populating the inputs from the $_GET array would be dangerous after addslashes() is used to escape the value attribute quotes.
I could always be wrong, and welcome any case examples of a potential security issue!
It's wrong. Try to enter O'R"ly and tell me what happens.
Kieran Huggins wrote:Since I make sure that only clean, safe data makes it into the database, I don't worry about it on the way out. This policy is backed by Cal Henderson (of Flickr fame) and seems to be fairly wise as far as I can tell. Most database entries tend to be write once / read many, so this will increase performance while helping to keep you safe when writing your views.
It's good if you can absolutely, positively, 100% guarantee that it's write once / read many to one destination. It's only rarely so in the apps I've done and seen. Look up second order SQL injection for example.
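To make the O'R"ly test concrete, here is a minimal sketch of what ends up in the markup. The `$input` variable is a stand-in for `$_GET['username']`, and the form field is hypothetical.

```php
<?php
// Hypothetical input, standing in for $_GET['username'].
$input = 'O\'R"ly';

// addslashes() only puts a backslash in front of each quote. HTML does not
// treat a backslash as an escape character, so the " still ends the attribute.
$unsafe = '<input name="username" value="' . addslashes($input) . '">';

// htmlspecialchars() with ENT_QUOTES turns both quote characters into entities.
$safe = '<input name="username" value="'
      . htmlspecialchars($input, ENT_QUOTES, 'UTF-8') . '">';

echo $unsafe, "\n"; // <input name="username" value="O\'R\"ly">
echo $safe, "\n";   // <input name="username" value="O&#039;R&quot;ly">
```

In the first line of output the browser reads the value as `O\'R\` and treats the rest as stray attribute text, which is the opening an attacker needs.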
There is some dangerous information being presented here. Mordred's advice is sound, but I'd like to elaborate, just for clarification.
First, addslashes() is a function intended to escape data being used in SQL. Even for this use, it is inadequate. (See my post on addslashes() for more information.) Using addslashes() to escape for any other context is wrong.
To escape for HTML, use htmlentities() or htmlspecialchars(), and be sure to match the character encoding you use in the Content-Type header, else bad things can happen. Under no circumstances should you ever fail to escape a value before using it in HTML. XSS is not an input filtering problem.
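As a rough sketch of the point about matching encodings: the helper name `e()` and the `CHARSET` constant below are made up for illustration, and UTF-8 is only an assumption about how the page is served. The idea is that one constant feeds both the Content-Type header and the escaping call, so the two cannot drift apart.

```php
<?php
// Assumption: the page is served as UTF-8. Header and escaping share this value.
const CHARSET = 'UTF-8';

// Hypothetical helper: escape a value for HTML text or a quoted attribute.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, CHARSET);
}

// Announce the same charset that e() escapes for.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=' . CHARSET);
}

echo '<p>Hello, ', e('<script>alert(1)</script>'), '</p>', "\n";
// <p>Hello, &lt;script&gt;alert(1)&lt;/script&gt;</p>
```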
Cal's advice about preparing data for display prior to storing it in a database is for performance. If you can avoid escaping whenever you read from the database, and only do so before writing, you save yourself a lot of work. There are two things to keep in mind:
1. When you take this approach, escape for HTML first, then escape for SQL. This way, the data you read from the database has already been escaped for HTML. Failing to do this results in XSS vulnerabilities.
2. If you take this approach, the security of your application is tied to that of your database. (A vulnerability in the database can compromise your application.) This isn't an enormous risk, but it is a risk, and we should always know where we're placing our trust.
I hope this helps.
-
alex.barylski
- DevNet Evangelist
- Posts: 6267
- Joined: Tue Dec 21, 2004 5:00 pm
- Location: Winnipeg
So instead of using the mysql_* escape functions and/or input filtering, could I not simply call htmlentities() on every bit of data, assuming none of it needs to be sent to the screen as HTML later?
My particular problem right now is directly echoing $_GET data back into URLs to propagate things such as page indexes, etc...
Until now I have filtered that data using a strict, least-privilege regex. It would be a lot easier if I could just htmlentities that data.
Likewise, if all I had to do was htmlentities the data going to the database, that would save me a lot of effort as well. For most applications I could get by with supporting a bbcode style of markup and just convert bbcode to HTML at display time, all the while storing the HTML-harmful characters as HTML entities instead.
I never even considered using htmlentities...until now...is it a solid way of securing data?
- Kieran Huggins
- DevNet Master
- Posts: 3635
- Joined: Wed Dec 06, 2006 4:14 pm
- Location: Toronto, Canada
- Contact:
@shiflett: thanks for the elaboration, it's hard to be detailed when typing with one hand and a baby in the other 
I must add two more things one should be extra careful with:
htmlentities()/htmlspecialchars() must specify not only the correct encoding, but also the correct quote style. I find it extremely silly that the default quote flag, ENT_COMPAT, will translate the double quotes and not the single ones. For PHP developers it's extremely easy to do:
Code:
$sUserName = htmlentities($_GET['username']); //ATTN: insecure!
echo "<img src='avatars/{$sUserName}.jpg'>"; //silly example but you get the idea -- it's easiest to use single quotes.
At least when specifying an encoding, the quote parameter is mandatory, so one must take care to use ENT_QUOTES.
-----
To be continued...
- Christopher
- Site Administrator
- Posts: 13596
- Joined: Wed Aug 25, 2004 7:54 pm
- Location: New York, NY, US
This is a great discussion for learning the finer points of this subject. Thanks.
Mordred wrote:At least when specifying an encoding, the quote parameter is mandatory, so one must take care to use ENT_QUOTES.
I have always used ENT_NOQUOTES. What is the security reason to escape the quotes as well in HTML?
(#10850)
... continued
(@Kieran Huggins)
The second thing is that in reality you rarely have write once / read many data. Maybe the admin also has an interface for editing posts? Or the data is displayed in many contexts: in HTML, in the RSS feed and in the user-friendly mod_rewrite URLs. Take extra care when you make such design decisions, and by all means do this only for a limited portion of the data.
@Hockey:
Nope, every data usage calls for its own escaping. Be doubly cautious with data that goes several ways (db | html | url | whatever).
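For the specific case of echoing GET values back into a link, a minimal sketch of "one escaping step per context" could look like this. The `list.php` script and the parameter names are invented for the example, and `$page` stands in for `$_GET['page']`.

```php
<?php
// Hypothetical value standing in for $_GET['page'].
$page = '2"><script>alert(1)</script>';

// Step 1 (URL context): http_build_query() URL-encodes every value.
$query = http_build_query(['page' => $page, 'sort' => 'date'], '', '&');

// Step 2 (HTML context): the finished URL lands inside an attribute,
// so it is HTML-escaped as a whole, which also turns & into &amp;.
$href = htmlspecialchars('list.php?' . $query, ENT_QUOTES, 'UTF-8');

echo '<a href="', $href, '">Next</a>', "\n";
```

For a value that is supposed to be a page index, casting it to an integer before use is an additional check, not a replacement for the escaping.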
@arborint:
You definitely need to take care of quotes when the user data goes in an HTML attribute. Otherwise, even if the tag brackets are escaped, so the current tag cannot be terminated, quotes can still be used to terminate the current attribute and add another, such as an ECMAScript on* handler.
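A small sketch of that attribute breakout, using an invented payload that contains no angle brackets at all:

```php
<?php
// Hypothetical attacker input: only quotes, no < or >.
$input = "x' onmouseover='alert(1)";

// ENT_NOQUOTES leaves both quote characters untouched.
$noQuotes = htmlspecialchars($input, ENT_NOQUOTES, 'UTF-8');

// ENT_QUOTES converts single and double quotes to entities.
$quotes = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo "<img alt='{$noQuotes}' src='a.jpg'>\n";
// <img alt='x' onmouseover='alert(1)' src='a.jpg'>   (a new handler attribute)

echo "<img alt='{$quotes}' src='a.jpg'>\n";
// <img alt='x&#039; onmouseover=&#039;alert(1)' src='a.jpg'>   (still one attribute)
```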
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
I used to think that the " quote had to be escaped when used outside of an attribute value context, but it turns out this is not the case. Here's the rub with quote escaping:
If you're outputting text outside of attributes, ENT_NOQUOTES is sufficient and most readable. If you're outputting text inside attributes, you must use single quote escaping or double quote escaping, dependent on which delimiter is being used for the attribute. Using ENT_QUOTES is safest: it will always be safe. The single-quote escape sequence is pretty ugly though.
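To compare the three quote flags side by side, here is a short sketch that passes each one explicitly. Passing the flag explicitly matters because the default depends on the PHP version: it was ENT_COMPAT for a long time, and since PHP 8.1 the default includes ENT_QUOTES.

```php
<?php
// A string containing one single quote and one double quote.
$s = '\'"';

$none   = htmlspecialchars($s, ENT_NOQUOTES, 'UTF-8'); // neither quote converted
$compat = htmlspecialchars($s, ENT_COMPAT, 'UTF-8');   // only the double quote
$both   = htmlspecialchars($s, ENT_QUOTES, 'UTF-8');   // both quotes

echo $none, "\n";   // '"
echo $compat, "\n"; // '&quot;
echo $both, "\n";   // &#039;&quot;
```

The `&#039;` in the last line is the "pretty ugly" single-quote sequence mentioned above.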
-
alex.barylski
- DevNet Evangelist
- Posts: 6267
- Joined: Tue Dec 21, 2004 5:00 pm
- Location: Winnipeg
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US