Removing script from text

AGISB
Forum Contributor
Posts: 422
Joined: Fri Jul 09, 2004 1:23 am

Removing script from text

Post by AGISB »

As this validation issue is security-related, I am posting it here.

In my contact form, people can send me their comments. I want to write these directly to the database because I don't want to deal with emails.

All fields are easy to validate except the textarea field.

I want to remove any dangerous code from the textarea input. As code could also be written using entities, those have to be removed as well.

As I am not very experienced with strip_tags(), htmlentities() and their limitations, I need some input on how best to achieve this.

I was thinking along the lines of:

1. convert all entities into their chars
2. remove all dangerous chars like <>&;$

Has anyone got working code for this?
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Running it through htmlentities() alone should be sufficient, as the "code" they write will, for the most part, not be able to run. The only other avenue to look out for is binary data, which is easily detected by running through the bytes of the submission like this:

Code:

$c = 0;
$j = strlen($submittedText);
for ($i = 0; $i < $j; $i++) {
    // ord() yields the byte value; without it PHP casts the character to an integer
    $c += (ord($submittedText[$i]) & 0x80) ? 1 : 0;
}

// Guard against division by zero on an empty submission
$percentage = $j > 0 ? $c / $j * 100 : 0.0;

echo round($percentage, 2) . '% of the text was binary data.';
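
For the htmlentities() part, a minimal sketch might look like the following. The function name escapeComment is made up for illustration, and the 'ISO-8859-1' argument is an assumption about the page's encoding (recent PHP versions default to UTF-8). The escaping is applied when the comment is output as HTML.

```php
<?php
// Hypothetical helper: escape a stored comment for display in an HTML page.
// Assumes the text is ISO-8859-1 encoded; pass 'UTF-8' if your page uses that.
function escapeComment(string $text): string
{
    // ENT_QUOTES also converts single and double quotes, not just < > &
    return htmlentities($text, ENT_QUOTES, 'ISO-8859-1');
}

$raw = '<script>alert("x")</script> & more';
echo escapeComment($raw), "\n";
// prints: &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt; &amp; more
```

The browser shows the markup as literal text instead of running it.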
AGISB
Forum Contributor
Posts: 422
Joined: Fri Jul 09, 2004 1:23 am

Post by AGISB »

So what percentage is acceptable? 0?
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

The closer to zero, the better. I'd say anything above 50% would definitely be a binary file. It depends on the character set used for the page, as some browsers will send UTF-8 data or entity-encoded text.
AGISB
Forum Contributor
Posts: 422
Joined: Fri Jul 09, 2004 1:23 am

Post by AGISB »

The charset is ISO-8859-1, as I am coding a German application.

I probably have to do some testing with this.
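
A test along these lines could be a starting point. It is only a sketch: the function name highBitPercentage and the sample word are assumptions chosen for illustration. It counts bytes with the high bit (0x80) set, so German umlauts count as "binary" in both ISO-8859-1 and UTF-8.

```php
<?php
// Hypothetical helper: percentage of bytes that have the high bit (0x80) set.
function highBitPercentage(string $text): float
{
    $length = strlen($text);
    if ($length === 0) {
        return 0.0; // nothing to measure, and avoids division by zero
    }
    $count = 0;
    for ($i = 0; $i < $length; $i++) {
        if (ord($text[$i]) & 0x80) {
            $count++;
        }
    }
    return $count / $length * 100;
}

$latin1 = "Gr\xFC\xDFe";         // "Grüße" encoded as ISO-8859-1 (5 bytes)
$utf8   = "Gr\xC3\xBC\xC3\x9Fe"; // "Grüße" encoded as UTF-8 (7 bytes)

echo round(highBitPercentage($latin1), 2), "\n"; // 40
echo round(highBitPercentage($utf8), 2), "\n";   // 57.14
```

A short German word can already exceed 50% when sent as UTF-8, so a fixed threshold probably only makes sense for longer messages.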


I think limiting the messages to 1024-2048 chars will also help to keep binary out.
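
The length limit could be enforced with a check like the sketch below. The constant name, the function name, and the choice of 2048 (the upper end of the range above) are assumptions. strlen() counts bytes, which equals characters for a single-byte charset such as ISO-8859-1.

```php
<?php
// Assumed upper limit for a comment, in bytes.
const MAX_COMMENT_LENGTH = 2048;

// Hypothetical helper: accept only non-empty comments within the limit.
function isAcceptableLength(string $text, int $max = MAX_COMMENT_LENGTH): bool
{
    $length = strlen($text);
    return $length > 0 && $length <= $max;
}

var_dump(isAcceptableLength('Hallo'));                // bool(true)
var_dump(isAcceptableLength(str_repeat('x', 5000)));  // bool(false)
```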

Another problem is that I have to find a way to decode the entities in Visual Basic, as that is the admin back end of the application.

Maybe I will have to resort to my initial idea.
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

If you have the source to the VB back end, you should be able to recompile it as Unicode, which will solve most character set issues. It's also possible the application already supports Unicode.
Jutboy
Forum Newbie
Posts: 4
Joined: Sun Jul 01, 2007 4:04 pm

Post by Jutboy »

I was going to contact you via PM, feyd, instead of opening up a new thread, but your sig says you frown on that, so...

I want to know more about this binary attack. Surely you cannot make a string be read as binary on the server? I'd really like to learn more about it.

I take a lot of precautions already, but not this one...

Code:

for($i = 0, $j = strlen($var); $i &lt; $j; $i++)
$c += intval( (bool)($var{$i} &amp; 0x80) );

$percentage = floatval( $c ) / $j * 100;
//Percentage should be 0
if ($percentage > 2){}

I keep getting: Parse error: syntax error, unexpected ';', expecting ')' in XXX
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Post by superdezign »

Jutboy wrote: I keep getting: Parse error: syntax error, unexpected ';', expecting ')' in XXX
The unexpected ';' comes from the "&lt;" and "&amp;" in your code. Those are the HTML entities for "<" and "&", and PHP is not HTML.
Jutboy
Forum Newbie
Posts: 4
Joined: Sun Jul 01, 2007 4:04 pm

Post by Jutboy »

I don't get it. Why would feyd put HTML into a PHP for condition?

What should it be instead?
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Post by superdezign »

Jutboy wrote: I don't get it. Why would feyd put HTML into a PHP for condition?

What should it be instead?
I didn't even notice that. His [code] tags weren't parsed either. I saw a lot of that in some of the older tutorials posted here... Maybe it's a database entry error. DevNet gets some database errors every now and then.

Anyway, just convert those HTML-encoded characters to their regular counterparts. If you need to see what they are, paste the code without surrounding it in <?php ?> tags, and then copy it from the browser.
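
If you'd rather not go through the browser, PHP can do the conversion itself with html_entity_decode(). This is just a sketch: the mangled line is held in a string for illustration, and the 'UTF-8' argument is an assumption (these particular entities decode to plain ASCII either way).

```php
<?php
// The entity-mangled loop, kept in a single-quoted string so nothing is interpolated.
$mangled = 'for($i = 0, $j = strlen($var); $i &lt; $j; $i++) '
         . '$c += intval( (bool)($var&#123;$i&#125; &amp; 0x80) );';

// Named entities (&lt;, &amp;) and numeric ones (&#123;, &#125;) are both decoded.
$decoded = html_entity_decode($mangled, ENT_QUOTES, 'UTF-8');

echo $decoded, "\n";
// prints: for($i = 0, $j = strlen($var); $i < $j; $i++) $c += intval( (bool)($var{$i} & 0x80) );
```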
Jutboy
Forum Newbie
Posts: 4
Joined: Sun Jul 01, 2007 4:04 pm

Post by Jutboy »

Thanks a lot, that worked great.

For other people who might read this:

Code:

function XXXXXXX($submittedText){
    // Binary percentage checker: counts bytes with the high bit (0x80) set
    $c = 0;
    $j = strlen($submittedText);
    if ($j === 0) {
        return 0.0; // avoid division by zero on empty input
    }
    for ($i = 0; $i < $j; $i++) {
        $c += (ord($submittedText[$i]) & 0x80) ? 1 : 0;
    }
    return $c / $j * 100;
}

Now if someone could just clue me in on how this attack works... Thanks.
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Post by VladSun »

Jutboy wrote: Now if someone could just clue me on how this attack works....thanks
http://ha.ckers.org/xss.html
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

The entities et al. come from a database change that a later repair didn't fully fix.

In the future, please create a new thread instead of waking up a long dead one, but please do reference the dead thread so you don't have to explain too much.

HTML Purifier may be more in line with what you want to fix/prevent. It's far smarter than my excessively simple snippet above.
Jutboy
Forum Newbie
Posts: 4
Joined: Sun Jul 01, 2007 4:04 pm

Post by Jutboy »

Excellent response guys....thank you so much!