What is 'harmful' HTML?

Bigun
Forum Contributor
Posts: 237
Joined: Tue Jun 13, 2006 10:50 am

Post by Bigun »

Safe HTML:

<b></b>
<br>
<i></i>
<font></font>
*EDIT* -- Putting List Above Here
Suffice it to say, I don't know the entire body of the HTML language.

But is there a list of unharmful HTML that can be allowed?

examples:

<b></b>
<font></font>
<br>
A section of my code will allow some HTML to go through for customization, but I need to know what can become potentially harmful.
Last edited by Bigun on Wed Jul 05, 2006 9:06 am, edited 4 times in total.
shiznatix
DevNet Master
Posts: 2745
Joined: Tue Dec 28, 2004 5:57 pm
Location: Tallinn, Estonia

Post by shiznatix »

<iframe src="www.badsite9000.com"></iframe>
<a href="www.killyourcomputer.com">w000t</a>
<img src="www.miningtroll.com" />
...not to mention javascript
MrPotatoes
Forum Regular
Posts: 617
Joined: Wed May 24, 2006 6:42 am

Post by MrPotatoes »

I've never done security, so I don't know how I would stop something like that. How would I stop that?
Bigun
Forum Contributor
Posts: 237
Joined: Tue Jun 13, 2006 10:50 am

Post by Bigun »

shiznatix wrote:

<iframe src="www.badsite9000.com"></iframe>
<a href="www.killyourcomputer.com">w000t</a>
<img src="www.miningtroll.com" />
...not to mention javascript
I'd like to allow href and img, people will be linking to other sites and posting images...

But yeah, block javascript and iframe...

Any other harmful HTML?
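
For links and images, one approach that fits this is to check the URL itself before it ever reaches an href or src attribute. The sketch below is only an illustration: `is_safe_url()` and `render_link()` are made-up names, and the rule "absolute http or https URLs only" is an assumption about what you want to allow. It rejects `javascript:` and `data:` URLs as well as scheme-less strings, and escapes whatever it does print.

```php
<?php

// Hypothetical helper: true only for absolute http:// or https:// URLs.
function is_safe_url($url)
{
    $parts = parse_url(trim($url));
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return false; // unparsable, relative, or scheme-only (e.g. javascript:...)
    }
    return in_array(strtolower($parts['scheme']), array('http', 'https'), true);
}

// Hypothetical helper: build a link, or fall back to plain escaped text.
function render_link($url, $label)
{
    $safeLabel = htmlspecialchars($label, ENT_QUOTES, 'UTF-8');
    if (!is_safe_url($url)) {
        return $safeLabel; // drop the link, keep the text
    }
    $safeUrl = htmlspecialchars(trim($url), ENT_QUOTES, 'UTF-8');
    return '<a href="' . $safeUrl . '">' . $safeLabel . '</a>';
}
```

The same `is_safe_url()` check could be applied to an image's src before printing an `<img>` tag.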
Luke
The Ninja Space Mod
Posts: 6424
Joined: Fri Aug 05, 2005 1:53 pm
Location: Paradise, CA

Post by Luke »

this is something I'd be very interested in as well... I'm building an article management app, and I want to allow them to post just about anything they want, without allowing them the ability to destroy the website or server.
Bigun
Forum Contributor
Posts: 237
Joined: Tue Jun 13, 2006 10:50 am

Post by Bigun »

It doesn't seem anyone has a list...

So perhaps we are breaking new ground with this?

If so, which would be quicker... listing and allowing safe HTML?

Or filtering out bad HTML?
hawleyjr
BeerMod
Posts: 2170
Joined: Tue Jan 13, 2004 4:58 pm
Location: Jax FL & Spokane WA USA

Post by hawleyjr »

Bigun wrote:It doesn't seem anyone has a list...

So perhaps we are breaking new ground with this?

If so, which would be quicker... listing and allowing safe HTML?

Or filtering out bad HTML?
Always list the good. You never know what could be bad :lol: :lol:

"The enemy you know is much better than the enemy you don't"
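
A rough sketch of "listing the good" could look like the following. The function name `allow_basic_tags()` and the default tag list are assumptions for illustration, not anything from this thread. The idea is to escape everything first, then turn back on only bare tags with no attributes. Note that PHP's built-in `strip_tags()` with an allowed-tags list leaves attributes on the tags it keeps, so something like onclick would survive it.

```php
<?php

// Hypothetical helper: escape all markup, then re-enable a short list of
// attribute-free tags. Anything not on the list stays visible as text.
function allow_basic_tags($input, array $allowed = array('b', 'i', 'em', 'strong'))
{
    $escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
    $names = implode('|', array_map('preg_quote', $allowed));
    // Matches only the escaped form of <tag> or </tag>, with no attributes.
    return preg_replace('/&lt;(\/?)(' . $names . ')&gt;/i', '<$1$2>', $escaped);
}
```

This does not balance tags, so an unclosed `<b>` would still run on through the rest of the page unless you handle that separately.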
Jenk
DevNet Master
Posts: 3587
Joined: Mon Sep 19, 2005 6:24 am
Location: London

Post by Jenk »

dynamic variable images can be just as harmful.

First thing to filter would be the javascript.. don't allow any of it.

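<?php

// Coarse first check only, not a complete XSS filter (it can reject harmless text too).
// Rejects <script ...> tags, on*= event-handler attributes inside a tag, and
// javascript: URLs. Matching a bare "on" inside any tag would also reject <font>.
$pattern = '/<\s*\/?\s*script\b|<[^>]*[\s"\'\/]on\w+\s*=|javascript\s*:/i';
if (preg_match($pattern, $input)) {
    die('Dirty javascript, out out out!');
}

?>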
Bigun
Forum Contributor
Posts: 237
Joined: Tue Jun 13, 2006 10:50 am

Post by Bigun »

Perhaps I can start a list of unharmful html, can-be harmful html, and harmful html and ways to filter the last two.

*EDIT*

Can someone remove that dirty lil' programmer tag from my name, I'm nowhere near close to the skill level required to be called that.
Roja
Tutorials Group
Posts: 2692
Joined: Sun Jan 04, 2004 10:30 pm

Post by Roja »

Bigun wrote:Can someone remove that dirty lil' programmer tag from my name, I'm nowhere near close to the skill level required to be called that.
It's automatic, based on the number of posts you've made in the forums here.
RobertGonzalez
Site Administrator
Posts: 14293
Joined: Tue Sep 09, 2003 6:04 pm
Location: Fremont, CA, USA

Post by RobertGonzalez »

I TRIED TO POST THIS YESTERDAY BUT THE DEVNET SERVER WAS NOT RESPONDING. IT WAS OPEN IN MY WINDOW SO I AM POSTING NOW.
<iframe>, <object> and <embed>, and anything else that can reach out, grab content from another site and execute it on yours. This is a very common way for attackers to take over your site in order to spread malware. Keep in mind that these tags can also be written by JavaScript as part of XSS attacks and the like, but not allowing the tags in your posted content is a step in a more secure direction.

EDIT | WOW, this thread really grew overnight! When it comes to harmful HTML, there are only a few tags that can truly cause you problems. I forgot to mention the <img> tag: malformed image files have been used to exploit bugs in image-handling code, so you may want to watch out for that. Basically, anything that has a 'src' attribute could be harmful, because nothing limits that element from reaching outside of your domain.
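
If you do want to limit where a 'src' can point, one possible sketch is to compare the URL's host against a list you control. Everything here is an assumption for illustration: the function name `is_allowed_image_src()`, the idea of a fixed host list, and the example host names in the usage note.

```php
<?php

// Hypothetical policy check: an image may only be loaded over http(s)
// from a host that appears on the caller's list (lowercase host names).
function is_allowed_image_src($src, array $allowedHosts)
{
    $parts = parse_url(trim($src));
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return false; // relative or malformed URLs are refused outright
    }
    if (!in_array(strtolower($parts['scheme']), array('http', 'https'), true)) {
        return false;
    }
    // Compare the whole host, so "images.example.com.evil.test" does not pass.
    return in_array(strtolower($parts['host']), $allowedHosts, true);
}
```

For example, with `array('images.example.com')` as the list, `http://images.example.com/a.png` would pass and `http://www.miningtroll.com/x.png` would not.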
bdlang
Forum Contributor
Posts: 395
Joined: Tue May 16, 2006 8:46 pm
Location: Ventura, CA US

Post by bdlang »

Bigun wrote: But is there a list of unharmful HTML that can be allowed?

examples:

<b></b>
<font></font>
<br>
A section of my code will allow some HTML to go through for customization, but I need to know what can become potentially harmful.
I'm no regular expressions guru, but it seems to me a better idea to whitelist the things you want to let through rather than try to eliminate the things you don't. If all you want is to let users use those specific elements, then let through only elements like <b>, <i>, <em>, <strong>, etc. I'd disagree with <br />, though: you should control the spacing of the output yourself rather than let someone create 100 new lines with 100 <br /> elements.
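
If <br> does stay on the allowed list, one way to deal with the 100-line-breaks problem is to cap how many can appear in a row. This is just a sketch: the name `limit_line_breaks()` and the default cap of 2 are assumptions, and it only recognises plain `<br>`, `<br/>` and `<br />` with no attributes.

```php
<?php

// Hypothetical helper: collapse any run of more than $max consecutive
// <br> tags (optionally separated by whitespace) down to exactly $max.
function limit_line_breaks($html, $max = 2)
{
    $max = max(1, (int) $max);
    // A run is $max + 1 or more <br> tags in a row.
    $pattern = '/(?:<br\s*\/?>\s*){' . ($max + 1) . ',}/i';
    return preg_replace($pattern, str_repeat('<br />', $max), $html);
}
```

Runs at or below the cap are left exactly as the user typed them.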
RobertGonzalez
Site Administrator
Posts: 14293
Joined: Tue Sep 09, 2003 6:04 pm
Location: Fremont, CA, USA

Post by RobertGonzalez »

This is the list of (X)HTML tags. You are safer allowing a certain group of tags rather than disallowing others. This is because anyone can add in just about any tag they want if you are checking for disallowed tags (for example, the <thisismytag> tag). Custom tags or other means of trying to sabotage your pages would be essentially eliminated if you provide your script a list of allowed tags.
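
To make the `<thisismytag>` point concrete, here is a small comparison. The two function names and the banned list are made up for the example. The blacklist version only removes tags someone thought of in advance, while the whitelist version (here just PHP's `strip_tags()` with an allowed-tags string) removes everything not named.

```php
<?php

$input = '<thisismytag>hello</thisismytag> <b>bold</b>';

// Blacklist (hypothetical): remove only the tags we thought of in advance.
function strip_blacklisted($html, array $banned)
{
    $names = implode('|', array_map('preg_quote', $banned));
    return preg_replace('/<\/?(?:' . $names . ')\b[^>]*>/i', '', $html);
}

// Whitelist (hypothetical wrapper): remove every tag except the ones named.
function keep_whitelisted($html, $allowedTags)
{
    return strip_tags($html, $allowedTags);
}

// The unknown tag is not on the banned list, so it passes through untouched.
$afterBlacklist = strip_blacklisted($input, array('script', 'iframe', 'object', 'embed'));

// Only <b> and <i> are allowed, so the unknown tag is removed.
$afterWhitelist = keep_whitelisted($input, '<b><i>');
```

Here `$afterBlacklist` is identical to `$input`, while `$afterWhitelist` is `hello <b>bold</b>`. One caveat: `strip_tags()` keeps any attributes on the tags it allows, so it is not a complete filter on its own.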
bdlang
Forum Contributor
Posts: 395
Joined: Tue May 16, 2006 8:46 pm
Location: Ventura, CA US

Post by bdlang »

Everah wrote:This is the list of (X)HTML tags. You are safer allowing a certain group of tags rather than disallowing others....
So you're in agreement with using a whitelist as I proposed?
RobertGonzalez
Site Administrator
Posts: 14293
Joined: Tue Sep 09, 2003 6:04 pm
Location: Fremont, CA, USA

Post by RobertGonzalez »

Yeah. I was thinking about it, and after coming to the conclusion that a user could feasibly get really stupid and put any tag they want in there, it makes more sense to provide a whitelist as opposed to a blacklist.

PS If I said otherwise before, then I am changing gears faster than a trucker who sees his wife with a biker.