Posted: Tue Jul 04, 2006 9:05 pm
by mabufo
Couldn't you just ban the use of greater-than and less-than signs? As far as I know, that would eliminate any possibility of something malicious.
Posted: Tue Jul 04, 2006 9:18 pm
by Nathaniel
But people could still slip bad stuff into other things.
Like, in a website field, enter:
http://goodwebsite.com" onclick="window.location='http://badwebsite.com'
Then, the user thinks they are visiting goodwebsite.com, and it sends them to badwebsite.com. Just one of many examples.
Posted: Tue Jul 04, 2006 9:51 pm
by bdlang
mabufo wrote: Couldn't you just ban the use of greater-than and less-than signs? As far as I know, that would eliminate any possibility of something malicious.
Well, you could simply use htmlentities() on all user input / output, but giving users some flexibility in formatting their entries is the goal.
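For what it's worth, here is a minimal sketch of what escaping does to the website-field trick from earlier in the thread. The function name escape_for_html and the sample value are made up for illustration; htmlentities() and ENT_QUOTES are the real PHP API. With ENT_QUOTES both kinds of quote are converted to entities, so the value should not be able to close the href attribute and start a new one.

```php
<?php
// Hypothetical helper name; it only wraps PHP's own htmlentities().
function escape_for_html(string $value): string
{
    // ENT_QUOTES converts both double and single quotes to entities,
    // so the value cannot break out of a quoted attribute.
    return htmlentities($value, ENT_QUOTES, 'UTF-8');
}

// Made-up input, as it might arrive from a "website" form field.
$website = 'http://goodwebsite.com" onclick="window.location=\'http://badwebsite.com\'';

echo '<a href="' . escape_for_html($website) . '">my site</a>', "\n";
```

The whole string ends up inside the href value, so the link is broken but the onclick never becomes a real attribute.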
Posted: Wed Jul 05, 2006 2:00 am
by John Cartwright
bdlang wrote:
mabufo wrote: Couldn't you just ban the use of greater-than and less-than signs? As far as I know, that would eliminate any possibility of something malicious.
Well, you could simply use htmlentities() on all user input / output, but giving users some flexibility in formatting their entries is the goal.
It is generally easier to use your own formatting markup, such as BBCode, to avoid this whole issue.
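A rough sketch of that idea, assuming a tiny made-up subset of BBCode ([b], [i], [u]) and a hypothetical function name bbcode_to_html: escape everything first, then convert only the markup you define yourself. Real BBCode parsers handle nesting, URLs and much more; this just shows the principle.

```php
<?php
// Hypothetical helper: converts a tiny BBCode subset to HTML.
function bbcode_to_html(string $text): string
{
    // Escape everything first so no user-supplied HTML survives.
    $html = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Then turn a fixed set of paired tags into their HTML equivalents.
    $rules = [
        '/\[b\](.*?)\[\/b\]/is' => '<b>$1</b>',
        '/\[i\](.*?)\[\/i\]/is' => '<i>$1</i>',
        '/\[u\](.*?)\[\/u\]/is' => '<u>$1</u>',
    ];

    return preg_replace(array_keys($rules), array_values($rules), $html);
}

echo bbcode_to_html('[b]bold[/b] <b onmouseover="alert(1)">x</b>'), "\n";
```

Since the only HTML in the output is what the replacement rules emit, there is no place for a user to attach an event attribute.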
Posted: Wed Jul 05, 2006 6:14 am
by Bigun
The point of this post is to allow some HTML to go through. Eliminating all HTML tags with htmlentities would defeat that purpose. But isn't there a regex that could find all of the HTML tags by looking for strings starting with a "<" and ending with a ">", and then run them through a whitelist?
Posted: Wed Jul 05, 2006 6:39 am
by Weirdan
Bigun wrote: But isn't there a regex that could find all of the HTML tags by looking for strings starting with a "<" and ending with a ">", and then run them through a whitelist?
There is the strip_tags function. It does exactly that.
Posted: Wed Jul 05, 2006 6:50 am
by Bigun
Reading up on that function, it seems that if you set the 'allowable_tags' parameter you can specify that only certain tags are allowed.
Code: Select all
strip_tags ( string str [, string allowable_tags] )
example:
Code: Select all
$string = strip_tags($string, '<a><b><i><u>');
Perfect... so yeah... we can start whitelisting with ease...
Posted: Wed Jul 05, 2006 7:04 am
by Jenk
Not got a parser at hand... does that disallow any tags that have event attributes?
e.g.:
Code: Select all
<b onmouseover="alert('boo!');">Text..</b>
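For anyone who wants to check this themselves, a small sketch (the sample input is made up, strip_tags() is the real function). As far as I know strip_tags() leaves the attributes of allowed tags untouched, and it removes disallowed tags but not the text between them.

```php
<?php
// Made-up input: one allowed tag carrying an event attribute,
// plus one tag that is not on the whitelist.
$input = '<b onmouseover="alert(\'boo!\');">Text..</b><script>evil()</script>';

// Same whitelist as in the post above.
$output = strip_tags($input, '<a><b><i><u>');

echo $output, "\n";
```

So the whitelist keeps <script> out, but the onmouseover on the allowed <b> comes straight through.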
Posted: Wed Jul 05, 2006 9:09 am
by Bigun
Are tables harmless?
Posted: Wed Jul 05, 2006 9:37 am
by Jenk
Nothing is harmless.
There is no harmless HTML because all tags can be bound to events, which is what causes problems.
If you are going to allow HTML rather than use a version of BBCode, you will have to filter all input very carefully. This is why BBCode was created in the first place: it's far, far easier.
Posted: Wed Jul 05, 2006 12:00 pm
by basdog22
I agree. There are ways to bypass htmlentities. Sometimes you need to do:
Code: Select all
// "\xC0\xBC" is an overlong (invalid) UTF-8 encoding of "<"; turn it into a real "<" before filtering
$doc = str_replace("\xC0\xBC", "<", $doc);
And I think I remember a list of known XSS attack types. Can't remember the source, though.
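A different way to deal with byte tricks like that one, offered only as a sketch: instead of replacing individual sequences, refuse any input that is not well-formed UTF-8 before filtering it. This is not the str_replace approach above but a swapped-in alternative. The helper name is_valid_utf8 is made up; the check relies on preg_match() returning false when the /u modifier is given a malformed subject.

```php
<?php
// Hypothetical helper: true only if $text is well-formed UTF-8.
function is_valid_utf8(string $text): bool
{
    // With the /u modifier PCRE validates the subject as UTF-8;
    // preg_match() returns false (not 0) when validation fails.
    return preg_match('//u', $text) === 1;
}

var_dump(is_valid_utf8('plain <b>text</b>'));
var_dump(is_valid_utf8("\xC0\xBCscript>"));
```

Overlong sequences are invalid UTF-8, so they should fail this check along with any other malformed bytes.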
Posted: Wed Jul 05, 2006 2:43 pm
by Luke
I would love to use BBCode, but I like using the WYSIWYG editor TinyMCE, so I have to allow some HTML.
Posted: Wed Jul 05, 2006 3:25 pm
by John Cartwright
Check over at phpclasses for some scripts. I recall seeing one that allowed you to define a whitelist of tags along with appropriate attributes and such. It wouldn't be that hard to write one yourself, either.
Posted: Wed Jul 05, 2006 3:28 pm
by RobertGonzalez
The Ninja Space Goat wrote:I would love to use bbcode, but I like using the wysiwyg editor, tinyMCE. So, I have to allow some html.
You know, there are only a few tags in TinyMCE that allow event attributes to be added to them through the WYSIWYG feature. You could probably create a list of allowed complete tags in an array, then use that array to scan the posted value of the textarea. If the posted text contains tags that do not exactly match your specification, deny the post.
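Something along these lines, as a sketch only. The tag list and the function name post_is_acceptable are made up, and the list would have to match whatever your editor configuration really emits. It assumes the editor sends literal "<" characters typed by the user as &lt;, so any leftover "<" is treated as suspicious.

```php
<?php
// Hypothetical list of complete tags the editor is allowed to produce.
$allowedTags = ['<b>', '</b>', '<i>', '</i>', '<u>', '</u>', '<p>', '</p>', '<br />'];

function post_is_acceptable(string $html, array $allowedTags): bool
{
    // Collect everything that looks like a tag: "<" up to the next ">".
    preg_match_all('/<[^>]*>/', $html, $matches);

    foreach ($matches[0] as $tag) {
        // Exact match only, so "<b onmouseover=...>" is refused.
        if (!in_array(strtolower($tag), $allowedTags, true)) {
            return false;
        }
    }

    // Refuse a stray "<" that never closes; text typed by the user
    // is assumed to arrive entity-encoded (&lt;) from the editor.
    $rest = preg_replace('/<[^>]*>/', '', $html);
    return strpos($rest, '<') === false;
}

var_dump(post_is_acceptable('<p><b>hi</b></p>', $allowedTags));
var_dump(post_is_acceptable('<b onmouseover="alert(1)">x</b>', $allowedTags));
```

Denying the whole post is blunt, but it avoids trying to repair hostile markup.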
Posted: Wed Jul 05, 2006 5:05 pm
by Luke
What if a user disables JavaScript and just enters whatever they want in the text box?
EDIT: Scratch that question... I misread your post.