Discussions of secure PHP coding. Security in software is important, so don't be afraid to ask. And when answering: be anal. Nitpick. No security vulnerability is too small.
This is something I'd be very interested in as well. I'm building an article management app, and I want to let users post just about anything they want without giving them the ability to damage the website or server.
I tried to post this yesterday, but the DevNet server was not responding. It was still open in my window, so I am posting it now.
Watch out for <iframe>, <object>, and <embed>, and anything else that can reach out, grab content from another site, and execute it on yours. This is a very common way for attackers to take over your site in order to spread malware. Keep in mind that these tags can also be written by JavaScript as part of XSS attacks and the like, but not allowing them in your posted content is a step in a more secure direction.
EDIT | Wow, this thread really grew overnight! When it comes to harmful HTML, there are only a few tags that can truly cause you problems. I forgot to mention the <img> tag: since folks can disguise malicious files as images now, you may want to watch out for that. Basically, anything that has a 'src' attribute could be harmful, because nothing limits that element from reaching outside of your domain.
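To make the 'src' concern concrete, here is a minimal sketch of one way to check that a 'src' value stays on your own domain. The function name src_is_local() and the host example.com are made up for illustration, and this is only a rough filter under those assumptions, not a complete defense.

```php
<?php
// Hypothetical helper: returns true only if $src is a relative URL or an
// http(s) URL whose host matches $ownHost. Everything else is rejected.
function src_is_local(string $src, string $ownHost): bool
{
    $src = trim($src);

    // Reject backslashes, whitespace, and control characters, which some
    // browsers normalize in ways parse_url() does not.
    if ($src === '' || preg_match('/[\x00-\x20\\\\]/', $src)) {
        return false;
    }

    $parts = parse_url($src);
    if ($parts === false) {
        return false; // seriously malformed URL
    }

    // Only allow http/https, or no scheme at all (relative URL).
    // This rejects javascript:, data:, and similar schemes.
    if (isset($parts['scheme'])
        && !in_array(strtolower($parts['scheme']), ['http', 'https'], true)) {
        return false;
    }

    // A host is present for absolute and protocol-relative (//host/...) URLs.
    if (isset($parts['host'])) {
        return strcasecmp($parts['host'], $ownHost) === 0;
    }

    return true; // relative path on the current site
}
```

For example, src_is_local('/img/a.png', 'example.com') is accepted, while src_is_local('http://evil.test/a.png', 'example.com') and src_is_local('javascript:alert(1)', 'example.com') are rejected.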
A section of my code will allow some HTML to go through for customization, but I need to know what can become potentially harmful.
I'm no regular expressions guru, but it seems to me a better idea to whitelist the things you want to let through rather than try to eliminate the things you don't. (I disagree with allowing <br />, by the way: you should control the spacing of the output yourself rather than let someone create 100 new lines with 100 <br /> elements.) If all you want is to let the user use a few specific elements, then let through only those, such as <b>, <i>, <em>, and <strong>.
This is the list of (X)HTML tags. You are safer allowing a certain group of tags rather than disallowing others. This is because, if you only check for disallowed tags, anyone can add just about any tag they want (for example, the <thisismytag> tag). Custom tags and other attempts to sabotage your pages are essentially eliminated if you give your script a list of allowed tags.
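A rough sketch of the whitelist idea, assuming you only want bare formatting tags with no attributes: escape everything first, then turn back on only the tags in your list. The function name allow_basic_tags() and the default tag list are my own choices for illustration. Note that PHP's built-in strip_tags() with an allowed-tags argument leaves attributes such as onclick in place on the tags it keeps, which is why this sketch escapes first instead.

```php
<?php
// Hypothetical helper: escape all markup, then restore only bare
// opening/closing tags whose names are in the whitelist.
function allow_basic_tags(
    string $input,
    array $allowed = ['b', 'i', 'em', 'strong']
): string {
    // Step 1: neutralize every tag, quote, and ampersand.
    $escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

    if ($allowed === []) {
        return $escaped; // nothing is whitelisted
    }

    // Step 2: build an alternation of the allowed tag names.
    $names = implode('|', array_map(
        fn ($tag) => preg_quote($tag, '/'),
        $allowed
    ));

    // Step 3: re-enable only exact tags like <b> or </b>. Anything with
    // attributes (e.g. <b onclick="...">) does not match and stays escaped.
    return preg_replace(
        '/&lt;(\/?)(' . $names . ')&gt;/i',
        '<$1$2>',
        $escaped
    );
}

echo allow_basic_tags('<b>hi</b> <thisismytag>x</thisismytag>');
```

With this approach, unknown tags like <thisismytag> and dangerous ones like <script> come out as visible escaped text rather than markup, so there is no blacklist to keep up to date.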
Yeah. I was thinking about it, and after coming to the conclusion that a user could feasibly get really stupid and put any tag they want in there, it makes more sense to provide a whitelist as opposed to a blacklist.
PS If I said otherwise before, then I am changing gears faster than a trucker who sees his wife with a biker.