Page 1 of 2

Library level security and UTF-8 support

Posted: Thu Aug 24, 2006 9:13 am
by Ollie Saunders
Everybody knows that you have to call htmlentities() on any previous input that is going to be output to avoid XSS.
But how many people do this?:

Code: Select all

$safeVar = 'mainbox';
$output = '<div class="'. htmlentities($safeVar) . '">Foo</div>';
Not many I bet, because $safeVar is safe; not input.

What about the case when you are writing a library. Lets take the most simple example possible:

Code: Select all

function divWithClass($classText)
{
    return '<div class="'. $classText . '">Foo</div>';
}
should you htmlentities() there? or leave it to call time then the user only uses it if necessary:

Code: Select all

divWithClass('safeVar');
divWithClass(htmlentities($_GET['tainted']));
Fact is you don't know how divWithClass() might be used. Considering this is the class attribute which isn't typically used as input is it not fair to require the function user to call htmlentities() themselves.

I've gone from being certain that htmlentities is library user's responsibility to being certain that its library writer's responibility. And now I don't have a clue what I think.
Let me give one last more contexual example. I have an object OF_Help that has the following public properties:
  • content (an array of strings converted to separate paragraphs)
  • class
  • javascript events (all of them)
  • title
At the moment you could assign something like '"><foo /><bar attrib="' to any of these and it would not be altered and would be output just like that. This is particularly a problem for javascript events because you aren't allowed to use double-quoted strings.

Thoughts?

Posted: Thu Aug 24, 2006 9:33 am
by feyd
Library writers responsiblity to reasonably protect the user from shooting themselves in the foot.

Posted: Thu Aug 24, 2006 12:23 pm
by Ollie Saunders
Hmm yes I guess you are right. A confirmation from feyd is a confirmation indeed.

No i've got to add a bizaillon htmlentities calls everywhere. No wait, I don't because I have this:

Code: Select all

final static protected function _makeAttrib($name, $value)
{
	if ($value === null or $value === false) {
		return '';
	}
	if ($value === true) { // for attributes like disabled
	    $value = $name;
	}
	return ' ' . $name . '="' . OF_Filter::escape($value) . '"'; // escaping for the application in one call yay!
}
I have saved myself hours of my life! Woooooooooo!

Posted: Thu Aug 24, 2006 12:37 pm
by Oren
Change this:

Code: Select all

final static protected
to this:

Code: Select all

final protected static
:wink:

Posted: Thu Aug 24, 2006 2:30 pm
by Ambush Commander
I would recommend using htmlspecialchars() instead. Also, make sure you get the character encoding right.

Posted: Thu Aug 24, 2006 2:53 pm
by Ollie Saunders
Oren wrote:Change this:

Code: Select all

final static protected
to this:

Code: Select all

final protected static
:wink:
Sorry but you're going to have to provide a good reason why i should do that.
I would recommend using htmlspecialchars() instead. Also, make sure you get the character encoding right.
I am setting the encoding on htmlentities or rather the library user can, in an application wide sense. And again you'll need to tell me why I should change to specialchars() please, because i'm using entities at shiflett's instruction (although its possible that may be out of date).

I appreicate the comments guys.

Posted: Thu Aug 24, 2006 2:58 pm
by Oren
ole wrote:
Oren wrote:Change this:

Code: Select all

final static protected
to this:

Code: Select all

final protected static
:wink:
Sorry but you're going to have to provide a good reason why i should do that.
http://us2.php.net/manual/en/language.oop5.static.php

Posted: Thu Aug 24, 2006 2:59 pm
by Ambush Commander
htmlspecialchars has better support for UTF-8 and other multibyte functions earlier than htmlentities. Combined with a UTF-8 well-formedness check and an SGML codepoint check (I can give you code for that too), it's guaranteed to work.

Posted: Thu Aug 24, 2006 4:32 pm
by feyd
Hate to break it to you, but it works in any location.

Code: Select all

[feyd@home]>php -r "class a{final public static function foo($b){return $b;}} var_dump(a::foo(mt_rand(1,100)));"
int(28)

[feyd@home]>php -r "class a{final static public function foo($b){return $b;}} var_dump(a::foo(mt_rand(1,100)));"
int(10)

[feyd@home]>php -r "class a{static final public function foo($b){return $b;}} var_dump(a::foo(mt_rand(1,100)));"
int(99)

[feyd@home]>php -r "class a{static public final function foo($b){return $b;}} var_dump(a::foo(mt_rand(1,100)));"
int(57)

[feyd@home]>php -r "class a{public static final function foo($b){return $b;}} var_dump(a::foo(mt_rand(1,100)));"
int(82)

[feyd@home]>php -r "class a{public final static function foo($b){return $b;}} var_dump(a::foo(mt_rand(1,100)));"
int(60)

Posted: Thu Aug 24, 2006 4:36 pm
by Oren
feyd wrote:
Hate to break it to you, but it works in any location.
This also works:

Code: Select all

<b><i>Some text...</b></i>
Do you use it?

Posted: Thu Aug 24, 2006 5:10 pm
by feyd
Oren wrote:This also works:

Code: Select all

<b><i>Some text...</b></i>
Do you use it?
That is irrelevant. "No, it's not." Yes, it is. The above breaks validation. The code I posted does not break PHP's validation pass.

The documentation cited in your link is mistaken. PHP does not care what order you declare the modifiers. It would be silly if it did.

Posted: Thu Aug 24, 2006 5:17 pm
by Ollie Saunders
Oren wrote:
feyd wrote:
Hate to break it to you, but it works in any location.
This also works:

Code: Select all

<b><i>Some text...</b></i>
Do you use it?
No I don't, because the official syntax implementation, namely W3C's validator, tells me otherwise. In the case of...
The PHP Manual wrote:The static declaration must be after the visibility declaration
...I don't care because the official syntax implementation does not tell me otherwise. In fact I am willing to assert that quote from the manual is incorrect.
feyd wrote:Hate to break it to you, but it works in any location.
Thanks Feyd.
AC wrote:htmlspecialchars has better support for UTF-8 and other multibyte functions earlier than htmlentities. Combined with a UTF-8 well-formedness check and an SGML codepoint check (I can give you code for that too), it's guaranteed to work.
OK, but htmlentities is moreexhaustive, so I have been using it in that premise. In fact at the moment OsisForms does not support Unicode, I have made no attempt for it to do so considering that it would make the mb_string extension a requirement and PHP 6 will solve this anyway. Are you saying that even if I specify the character encoding in htmlentities() I am still vunerible to XSS?

Posted: Thu Aug 24, 2006 5:19 pm
by Ollie Saunders
feyd wrote:That is irrelevant. "No, it's not." Yes, it is. The above breaks validation. The code I posted does not break PHP's validation pass.

The documentation cited in your link is mistaken. PHP does not care what order you declare the modifiers. It would be silly if it did.
Great minds, eh feyd? :D

I promise I didn't look at your post, I just spent ages starring at the manual pages on htmlspeci... and entities so i missed your post.

Posted: Thu Aug 24, 2006 5:41 pm
by Oren
Well, where should I learn PHP from if not from the manual? :?

Posted: Thu Aug 24, 2006 5:51 pm
by feyd
Me.

Image

Couldn't resist.

I really don't know. The manual certainly isn't infallible, it's written by humans (less or more evolved than others, but humans none-the-less.)