Lugubriousness

Discussions of secure PHP coding. Security in software is important, so don't be afraid to ask. And when answering: be anal. Nitpick. No security vulnerability is too small.

Moderator: General Moderators

kaisellgren
DevNet Resident
Posts: 1675
Joined: Sat Jan 07, 2006 5:52 am
Location: Lahti, Finland.

Re: Lugubriousness

Post by kaisellgren »

arborint wrote:I don't think you can say that \n \r \t \77 \xFF are not transformations.
Well, I think I can. Here's my explanation. \n (literal string) -> 0x0a (byte) is a transformation... but what happens behind the interface of mysql_real_escape_string(), for instance, is not actually a transformation. You give the parser the Input, which stays the same through the entire process, and as it moves along, the parser copies data from the Input to the output buffer. Thus there are never transformations, just copying/creation of data in a separate buffer. MySQL literally loops through the bytes, and if it meets 0x27 ('), it writes the bytes 0x5c (\) and 0x27 (') into the (separate) output buffer. At the end of the escaping process, the input stream is still the same (untouched).

Usually none of this matters to programmers, but it's interesting to discuss.
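To make the "copying into a separate buffer" idea concrete, here is a rough sketch. escapeQuotes() is a hypothetical helper written for illustration, not the real mysql_real_escape_string(), and it only handles 0x27 and 0x5c (the backslash case is my addition so the output stays unambiguous).

```php
<?php
// Illustrative sketch only: escapeQuotes() is a hypothetical helper,
// not the real mysql_real_escape_string(). It handles just 0x27 and 0x5c.
function escapeQuotes(string $input): string
{
    $output = ''; // separate output buffer; $input itself is never modified
    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        $byte = $input[$i];
        if ($byte === "\x27" || $byte === "\x5c") {
            $output .= "\x5c"; // create a backslash in the output buffer
        }
        $output .= $byte; // copy the original byte across
    }
    return $output;
}

$input = "O'Reilly";
$escaped = escapeQuotes($input);
// $escaped now holds O\'Reilly, while $input still holds O'Reilly
```

The input string is only ever read; every byte of the result is created in $output.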
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Lugubriousness

Post by Christopher »

kaisellgren wrote:Well, I think I can. Here's my explanation. \n (literal string) -> 0x0a (byte) is a transformation... but what happens behind the interface of mysql_real_escape_string(), for instance, is not actually a transformation. You give the parser the Input, which stays the same through the entire process, and as it moves along, the parser copies data from the Input to the output buffer. Thus there are never transformations, just copying/creation of data in a separate buffer. MySQL literally loops through the bytes, and if it meets 0x27 ('), it writes the bytes 0x5c (\) and 0x27 (') into the (separate) output buffer. At the end of the escaping process, the input stream is still the same (untouched).
As bizarre as it may sound, I think we agree that it "is a transformation" but it "is not actually a transformation." ;)
kaisellgren wrote:Usually none of this matters to programmers, but it's interesting to discuss.
I think it clarifies some of the nuance of the discussion. I think this understanding will help get to the bottom of the problem you state in your first post.
(#10850)
kaisellgren
DevNet Resident
Posts: 1675
Joined: Sat Jan 07, 2006 5:52 am
Location: Lahti, Finland.

Re: Lugubriousness

Post by kaisellgren »

arborint wrote:I think it clarifies some of the nuance of the discussion. I think this understanding will help get to the bottom of the problem you state in your first post.
Indeed. I asked a couple of experienced programmers, who all agreed with the terms. I think we have covered pretty well what those common terms mean, and I'm sure most people can agree with them or at least understand our perspective.
arborint wrote:As bizarre as it may sound, I think we agree that it "is a transformation" but it "is not actually a transformation." ;)
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Lugubriousness

Post by Christopher »

Next is to similarly clarify the types of attacks. For example:

- XSS Cross Site Scripting
- SQL Injection
(#10850)
kaisellgren
DevNet Resident
Posts: 1675
Joined: Sat Jan 07, 2006 5:52 am
Location: Lahti, Finland.

Re: Lugubriousness

Post by kaisellgren »

A successful SQLi usually requires getting 0x27 into the query intact, unless the place where the input ends up is not enclosed in quotes (ORDER BY, LIMIT, or simply the mistake of not putting quotes around values). As for XSS, that's way too broad... usually we want to make the JavaScript parser execute some code for us.

Basically, when you are using Encoding as your protection, things are much more complicated. If you are Escaping, then you usually have to take care of a few meta characters.
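A quick sketch of why the quotes matter. The users table is made up, and addslashes() is used here only as a stand-in for a quote-escaping function so the example runs without a database; it is not a recommendation for real queries.

```php
<?php
// addslashes() is only a stand-in for a quote-escaping function here.
// The table and column names are hypothetical.

// Quoted context: the 0x27 bytes get escaped, so the payload stays data.
$name = "x' OR '1'='1";
$quoted = "SELECT * FROM users WHERE name = '" . addslashes($name) . "'";

// Unquoted context: the payload needs no 0x27 at all,
// so escaping changes nothing and the input is parsed as SQL.
$id = "0 OR 1=1";
$unquoted = "SELECT * FROM users WHERE id = " . addslashes($id);

// One option for numeric contexts: cast instead of escaping.
$safeId = (int) $id;
$cast = "SELECT * FROM users WHERE id = " . $safeId;
```

In the second query the escaping function has nothing to act on, which is why unquoted places need a different defense.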
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Lugubriousness

Post by Christopher »

kaisellgren wrote:A successful SQLi usually requires getting 0x27 into the query intact, unless the place where the input ends up is not enclosed in quotes (ORDER BY, LIMIT, or simply the mistake of not putting quotes around values).
So if you always quote and always escape you are safe from SQL injection attacks?
kaisellgren wrote:As for XSS, that's way too broad... usually we want to make the JavaScript parser execute some code for us.

Basically, when you are using Encoding as your protection, things are much more complicated. If you are Escaping, then you usually have to take care of a few meta characters.
Leaving the Escaping/Encoding question aside ;) what are the values that you need to quote?
(#10850)
kaisellgren
DevNet Resident
Posts: 1675
Joined: Sat Jan 07, 2006 5:52 am
Location: Lahti, Finland.

Re: Lugubriousness

Post by kaisellgren »

If only you could just quote... imagine having a dynamic ORDER BY clause: you can't quote the column name with 0x27 or it will not work. That's where prepared statements become useless - placeholders can only bind data values, not identifiers such as column names.
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Lugubriousness

Post by Christopher »

So the only way to use user input in SQL for something that is not a quoted value would be to filter it with a whitelist character set? The other option is to check it against acceptable values or use it to select among acceptable values.
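The last option, using the input to select among acceptable values, could look something like this. orderByClause(), the sort keys and the column names are all hypothetical, and the fallback to 'popular' is just one possible choice.

```php
<?php
// Hypothetical mapping from user-facing sort keys to SQL fragments.
function orderByClause(string $userKey): string
{
    $columns = [
        'newest'  => 'created_at DESC',
        'popular' => 'downloads DESC',
        'id'      => 'id ASC',
    ];
    // The user input only picks an entry; it is never copied into the SQL.
    $fragment = $columns[$userKey] ?? $columns['popular'];
    return 'ORDER BY ' . $fragment;
}

$clause = orderByClause('newest');
```

Since nothing the user typed reaches the query string, there is nothing to escape or filter.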
(#10850)
matthijs
DevNet Master
Posts: 3360
Joined: Thu Oct 06, 2005 3:57 pm

Re: Lugubriousness

Post by matthijs »

Another type of attack could be mail injection. Still a more complex topic than you'd think at first sight.
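For the simplest form of mail injection (smuggling extra headers in through line breaks), a minimal sketch of a check might be the following. assertSafeHeaderValue() is a hypothetical helper, and it covers only CR/LF in header values, not every mail-related problem.

```php
<?php
// Hypothetical helper: rejects any header value containing CR or LF,
// the bytes that would start a new header line.
function assertSafeHeaderValue(string $value): string
{
    if (preg_match('/[\r\n]/', $value)) {
        throw new InvalidArgumentException('Line breaks are not allowed in header values');
    }
    return $value;
}

$from = assertSafeHeaderValue('visitor@example.com');
$headers = 'From: ' . $from;
```

A value such as "a@example.com\r\nBcc: b@example.com" would be rejected instead of turning into a second header.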
kaisellgren
DevNet Resident
Posts: 1675
Joined: Sat Jan 07, 2006 5:52 am
Location: Lahti, Finland.

Re: Lugubriousness

Post by kaisellgren »

arborint wrote:So the only way to use user input in SQL for something that is not a quoted value would be to filter it with a whitelist character set? The other option is to check it against acceptable values or use it to select among acceptable values.
Yeah, either filter

Code: Select all

// No ^...$ anchors: strip every character outside a-z, 0-9 and _, wherever it occurs
$column = preg_replace('#[^a-z0-9_]+#i', '', $column);
or use white listing

Code: Select all

$allowedColumns = array('id', 'downloads');
if (!in_array($column, $allowedColumns, true)) { // strict comparison, no type juggling
    $column = 'downloads'; // you could default to something
}
Both of those filter the column name. You could also use a validation approach.
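A validation version of the same check might look like this: instead of repairing or replacing the value, it refuses it. validateColumn() is a hypothetical helper, and throwing an exception is just one way to reject.

```php
<?php
// Hypothetical helper: accepts the column name unchanged or rejects it outright.
function validateColumn(string $column, array $allowed): string
{
    if (!in_array($column, $allowed, true)) {
        throw new InvalidArgumentException('Unknown sort column');
    }
    return $column; // returned exactly as given, nothing is altered
}

$column = validateColumn('downloads', array('id', 'downloads'));
```

The difference from filtering is that bad input never gets turned into something usable; the request simply fails.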
matthijs wrote:Another type of attack could be mail injection. Still a more complex topic than you'd think at first sight.
SQLi is simple, but yeah injection as the subject isn't.
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Lugubriousness

Post by Christopher »

kaisellgren wrote:Yeah, either filter

Code: Select all

// No ^...$ anchors: strip every character outside a-z, 0-9 and _, wherever it occurs
$column = preg_replace('#[^a-z0-9_]+#i', '', $column);
or use white listing

Code: Select all

$allowedColumns = array('id', 'downloads');
if (!in_array($column, $allowedColumns, true)) { // strict comparison, no type juggling
    $column = 'downloads'; // you could default to something
}
Both of those filter the column name. You could also use a validation approach.
Those are really both white listing. We might call the first Character White Listing and the second Value White Listing.
(#10850)
kaisellgren
DevNet Resident
Posts: 1675
Joined: Sat Jan 07, 2006 5:52 am
Location: Lahti, Finland.

Re: Lugubriousness

Post by kaisellgren »

arborint wrote:Those are really both white listing. We might call the first Character White Listing and the second Value White Listing.
Imo, white listing and black listing are not methods in themselves, but approaches. Filtering, validating, escaping and many others can each be categorized as using a white list or a black list. For example, mysql_real_escape_string() is black listing. The second example I gave could be said to be validating (it rejects non-allowed values) using a white list. Maybe we could define "white lists" and "black lists", too.

What about:

Code: Select all

/^[\x{0000}-\x{007f}]+$/D
which tries to match bytes in a range of 0-127? And what about

Code: Select all

/^[^\x{0000}-\x{007f}]+$/D
which tries to match bytes that are not in a range of 0-127?

Are these white lists or black lists? Which is which?
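To see how the two patterns behave, here is a small sketch. It writes the range as \x00-\x7f, which covers the same bytes as \x{0000}-\x{007f}; the sample strings are made up.

```php
<?php
$asciiOnly    = '/^[\x00-\x7f]+$/D';  // every byte must be in 0-127
$nonAsciiOnly = '/^[^\x00-\x7f]+$/D'; // every byte must be outside 0-127

$samples = [
    'ascii'    => "abc",
    'nonAscii' => "\xC3\xA9",    // "é" as two UTF-8 bytes
    'mixed'    => "abc\xC3\xA9",
];

$results = [];
foreach ($samples as $label => $sample) {
    // Each entry holds [matches first pattern, matches second pattern]
    $results[$label] = [
        preg_match($asciiOnly, $sample),
        preg_match($nonAsciiOnly, $sample),
    ];
}
```

Note that the mixed string matches neither pattern, so the two are not simple opposites of each other.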

Essentially, a filter

Code: Select all

/[!"#]+/
that removes those three dangerous characters (in an imaginary situation) would be using black listing. All other potentially dangerous characters, such as 0x0a, 0x00, etc., are not filtered. On the contrary,

Code: Select all

/[^a-z0-9]+/
would be using white listing. It filters out every character that is not in the needed character set.
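Here is a small side-by-side sketch of the two approaches used as filters with preg_replace(). It uses unanchored forms of the two character classes so that matching characters are stripped wherever they occur; the sample input is made up.

```php
<?php
// Sample input containing the three listed characters plus 0x0a and 0x00.
$input = "ab!\"#\n\0cd";

// Black list: removes only the characters we thought of.
$blacklisted = preg_replace('/[!"#]+/', '', $input);

// White list: removes everything we did not explicitly allow.
$whitelisted = preg_replace('/[^a-z0-9]+/', '', $input);
```

The black list lets 0x0a and 0x00 through because nobody listed them, while the white list drops them without having to know about them.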

Again, I don't think there are any standard terms for these, but this is the way I think of these. Comments? Arguments? Nods?
matthijs
DevNet Master
Posts: 3360
Joined: Thu Oct 06, 2005 3:57 pm

Re: Lugubriousness

Post by matthijs »

kaisellgren wrote:Again, I don't think there are any standard terms for these, but this is the way I think of these. Comments? Arguments? Nods?
Excellent explanation. I think it's important to really think about things like this. The difference between using one regular expression (whitelisting) and another (blacklisting) can be pretty fundamental, as you show.