Page 1 of 2

BBcode +function

Posted: Mon Sep 11, 2006 3:16 am
by bob_the _builder
Hi,

Is the following code a secure way of converting html to BBcode and clean user data ready to be inserted into the db. Also using the same variables if needed, to call on in queries
The second function to call data out of the db and echo to screen converting BBcode back to html and stripslashes.

Code: Select all

function ValidateInput($value) { 

	$BBCode = array( 
	"<b>" => "[b]", 
	"</b>" => "[/b]", 
	"<u>" => "[u]", 
	"</u>" => "[/u]", 
); 
	$value = str_replace(array_keys($BBCode), array_values($BBCode), $value); 
	$value = mysql_real_escape_string(trim(strip_tags($value))); 
	return $value; 
} 

function ValidateOutput($value) { 

	$BBCode = array( 
	"[b]" => "<b>", 
	"[/b]" => "</b>", 
	"[u]" => "<u>", 
	"[/u]" => "</u>", 
); 
	$value = str_replace(array_keys($BBCode), array_values($BBCode), $value); 
	return $value; 
} 
	$data = ValidateInput('<b><u>some data here</u></b>'); 
	
// Insert into database here

// Echo out of the database converting bbcode back to html

	$FinalOutput = stripslashes(ValidateOutput($data)); 
	
	echo "$FinalOutput";

Thanks

Posted: Mon Sep 11, 2006 2:48 pm
by feyd
Your function(s) don't block a user from posting other html tags.

From what I understand of your post, this functionality have no effect.. so I'm a but confused.

Posted: Mon Sep 11, 2006 3:45 pm
by bob_the _builder
My thoughts on what the code does;

Code: Select all

function ValidateInput($value) { 

        $BBCode = array( 
        "<b>" => "[b]", 
        "</b>" => "[/b]", 
        "<u>" => "[u]", 
        "</u>" => "[/u]", 
); 
        $value = str_replace(array_keys($BBCode), array_values($BBCode), $value); 
        $value = mysql_real_escape_string(trim(strip_tags($value))); 
        return $value; 
} 

// Converts <b> </b> and <u> </u> to bbcode  and 
// addslashes to \x00, \n, \r, \, ', ", 
// trims spaces from the beginning and the end of the variable
// strips any html tags left behind
The question; Is the above enough code to stop attacks (hacking) on a site, also is it enough cleaning of user input before insering into the database.

The following code used to call data from a database:

Code: Select all

function ValidateOutput($value) { 

        $BBCode = array( 
        "[b]" => "<b>", 
        "[/b]" => "</b>", 
        "[u]" => "<u>", 
        "[/u]" => "</u>", 
); 
        $value = str_replace(array_keys($BBCode), array_values($BBCode), $value);
        $value = stripslashes($value);
        return $value; 
}

// converts [/b] [/b] and  back to html <b> </b> and <u> </u>
// stripslashes
// echo back to screen with html formating

So .. Is it enough code for site security with data input and is the BBcode the correct way to use it;

ie are you ment to store it in the db as BBcode then convert back to html when printing to the screen?


Thanks

Posted: Mon Sep 11, 2006 9:02 pm
by Ambush Commander
Precisely: it doesn't filter out other html tags. Try passing <a onload="javascript:alert('boo');">asdf</a> to it and see what happens.

If you're looking for secure HTML filtering, I'd recommend my own library: HTML Purifier

Posted: Mon Sep 11, 2006 10:11 pm
by bob_the _builder
Hi,

If I parse this thru the function ValidateInput

Code: Select all

$data = ValidateInput('<a onload="javascript:alert(\'boo\');">asdf</a>');
obviously escaping alert(\'boo\')

and echo $data it prints: asdf

So im :? at :
Precisely: it doesn't filter out other html tags. Try passing <a onload="javascript:alert('boo');">asdf</a> to it and see what happens.
Looks like it filtered out the tags around 'asdf' ?

Am I missunderstanding the tags part?

Thanks

Posted: Tue Sep 12, 2006 10:51 am
by RobertGonzalez
View the output source and see what was actually sent to the browser. If you are only replacing HTML tags that have no event handlers in them you are putting yourself at risk. The reason is that <a> and <a href"...."> will not be treated the same by the function put up. Unless you plan on str_replacing every possible passed HTML tag combination you are putting yourself at risk by depending on that function entirely.

Posted: Tue Sep 12, 2006 2:46 pm
by bob_the _builder
Hi,

Im really not following the answers to the post, it seems that its converting the asked tags to bbcode, taking out html tags, cleaning spaces at the begining and end of the string, adding slashes to the string .. bringing it back plain text before its inserted in the database .

Maybe a clearer explaination of what its missing in the sence of cleaning the user data before placing into the database.

From what I read people use addslashes at the minimum?

Also with bbcode, is it converted from <b> </b> to then inserted into the database, once called out of the database convert back to <b> </b> prior to been echoed to the screen?


Thanks

Posted: Tue Sep 12, 2006 3:10 pm
by RobertGonzalez
Typically bbCode is added by the user and stored with the content in the database. When another user requests that content the content is converted for display, but the actual content in the database is not converted to anything. The only change happens to what is displayed.

Posted: Tue Sep 12, 2006 3:50 pm
by bob_the _builder
Hi,

Yep, but just to clean up any html tags that may have been posted I convert it to bbcode. When displaying I meant call out of the database and convert on the fly like below, not altering the database record.

$data = ValidateOutput($row['data']);

So the bbcode concept am I on the right track?

As for cleaning user input im a bit lost, like I said in the above post, alot of ppl say use addslashes at the least.


Thanks

Posted: Tue Sep 12, 2006 4:26 pm
by John Cartwright
you can always pass your string through htmlentities() and then do your bbtag replaces.

Posted: Tue Sep 12, 2006 6:19 pm
by RobertGonzalez
Which tags are you allowing your users to submit?

Posted: Tue Sep 12, 2006 8:25 pm
by bob_the _builder
Hi,

<b></b>
<i></i>
<u></u>
<quote></quote>
<a href="">link</a>
<img src="">

But have them convert to bbcode if they arnt input as bbcode allready, then everything else that could be a threat taken out or turned to plain text.

Thanks

Posted: Tue Sep 12, 2006 9:03 pm
by RobertGonzalez
The first three you can do with str_replace(). The last two will need to be regex'ed.

Posted: Tue Sep 12, 2006 10:31 pm
by bob_the _builder
Hi,

I thought I could just add:

"[img]" => "<img src='",
"[/img]" => "'>",

"" => "<a a href='",
"
" => "'>",

etc into the BBcode array.

The basic question was ..

Is the ValidateInput function enough to clean user input, convert any html tags to the given bbcode and remove any other distructive code.

Also is the way I am going about storing bbcode into the db, then converting it on the fly back to html when been called outa the databse.


Thanks

Posted: Tue Sep 12, 2006 10:46 pm
by feyd
Both image tags and link tags are potential XSS attack vectors, among other things. The URL given must to properly sanitized. This is often done with regular expressions. The bbcode parsers I've seen generally do an htmlentities() as previously noted then transform inline HTML to the bbcode forms after confirming they are indeed valid, then finally transform all the bbcode tags into actual HTML as needed.