strip_tags
Posted: Sun Nov 09, 2008 5:05 pm
Just a concept idea:
If I were to run strip_tags() over all GPC (GET/POST/COOKIE) data, assuming no input is meant to be HTML, this would in theory prevent most XSS exploits (types 1 and 2), I believe? I guess DOM injection would still be possible, but that likely needs to be addressed on the client...
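To make the idea concrete, here is a minimal sketch of what I mean. The helper name strip_tags_deep() is my own invention (it is not a PHP built-in), and I am assuming the input arrays may be nested, as with form fields named like foo[]:

```php
<?php
// Hypothetical helper (not part of PHP): recursively apply strip_tags()
// to every string found in a possibly nested input array.
function strip_tags_deep($value)
{
    if (is_array($value)) {
        // array_map() with a single array preserves the keys.
        return array_map('strip_tags_deep', $value);
    }
    // Leave non-string values (e.g. null) untouched.
    return is_string($value) ? strip_tags($value) : $value;
}

// Filter all GPC data once, early in the request.
$_GET    = strip_tags_deep($_GET);
$_POST   = strip_tags_deep($_POST);
$_COOKIE = strip_tags_deep($_COOKIE);
```

One caveat I can already see: this only removes tags. A value such as `" onmouseover="alert(1)` contains no tag at all and passes through unchanged, so anything echoed into an attribute would presumably still need htmlspecialchars() on output.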
Anyway, I am aware of HTML_Purifier and its ability to filter HTML in accordance with UTF-8. However, calling HTML_Purifier on *every* incoming variable would be overkill and introduce a big performance hit.
So I figure strip_tags() would likely suffice if all I want to do is remove any likelihood of HTML sneaking into the GPC data. My concern is localization. Is strip_tags() safe to use in all instances?
Are HTML tags always encoded as the ASCII < and >, or can they be some other Unicode code point and still be interpreted by a browser as HTML code? I assume strip_tags() will only remove the ASCII versions, and while it makes sense that tags can only be ASCII characters, I cannot be certain, hence the security concern.
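Here is a small sketch of the case I am worried about, assuming PHP 7 or later for the "\u{...}" escapes and UTF-8 strings. It compares ASCII angle brackets with the fullwidth look-alikes U+FF1C and U+FF1E; the variable names are just for illustration:

```php
<?php
// ASCII angle brackets (U+003C / U+003E).
$ascii = '<script>alert(1)</script>';

// Fullwidth look-alikes (U+FF1C / U+FF1E), encoded as UTF-8.
$fullwidth = "\u{FF1C}script\u{FF1E}alert(1)\u{FF1C}/script\u{FF1E}";

// strip_tags() removes the ASCII tags but keeps the text between them.
var_dump(strip_tags($ascii));

// The fullwidth version contains no 0x3C byte, so nothing is stripped.
var_dump(strip_tags($fullwidth) === $fullwidth);
```

This only shows what strip_tags() itself does with the bytes. It says nothing about how a browser would treat those characters, which depends on the encoding the page is served in.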
What say you? Assuming all I want to do is remove all HTML tags (no exceptions), is strip_tags() a safe bet?
Cheers,
Alex