
HTML Encoding

Posted: Tue Apr 06, 2010 7:47 am
by lilalfyalien
Hi,

I have a scenario where I am expecting a user to enter HTML in a box. I then want to process it so that any special characters they have typed (e.g. &, ", ') are correctly encoded. I cannot work out how to do this: there is an htmlentities() function, but it also converts the "<" and ">" of the HTML tags themselves, e.g. in "<span>". How do I encode only the text between the HTML tags?

My ultimate goal is to allow users to copy and paste from MS Word into an editable iframe. At the moment MS Word inserts its usual junk markup when users copy and paste, and I need it cleaned up.

Thanks in advance,

Lilalfyalien

Re: HTML Encoding

Posted: Tue Apr 06, 2010 7:49 am
by lilalfyalien
Oh, I've just spotted the double_encode parameter... I'll give that a go!

Re: HTML Encoding

Posted: Tue Apr 06, 2010 8:05 am
by lilalfyalien
Oh no double_encode didn't work :(

It converted "<br />" into "&lt;br /&gt;" - not what I wanted!
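
For reference, double_encode only controls whether entities that already exist (such as &amp;) get encoded a second time; it does not exempt tags, so "<" and ">" are still converted. One possible way to get what the first post asks for is to split the input on anything that looks like a tag and run htmlentities() only on the pieces in between. The sketch below is a rough illustration, not a full HTML parser: the function name encode_text_outside_tags() is made up for this example, UTF-8 input is assumed, and the simple tag pattern will misfire on a stray "<" in the text or a ">" inside an attribute value.

```php
<?php
// Hypothetical helper, not part of PHP or of any library mentioned in this thread.
// Encodes special characters in the text between tags and leaves the tags alone.
function encode_text_outside_tags(string $html): string
{
    // Split on anything shaped like <...>. The capture group together with
    // PREG_SPLIT_DELIM_CAPTURE keeps the tags in the result, so the array
    // alternates: text, tag, text, tag, ..., text.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            continue; // odd indexes are the captured tags: leave untouched
        }
        // ENT_QUOTES also encodes single quotes; the final "false" is the
        // double_encode argument, so an existing &amp; is not turned into &amp;amp;.
        $parts[$i] = htmlentities($part, ENT_QUOTES, 'UTF-8', false);
    }

    return implode('', $parts);
}

echo encode_text_outside_tags('<span class="note">Fish & "chips" &amp; peas</span><br />');
// prints: <span class="note">Fish &amp; &quot;chips&quot; &amp; peas</span><br />
```

This only handles the encoding part of the question. It does nothing about the extra markup that Word adds when pasting, which needs a real HTML cleaner.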

Re: HTML Encoding

Posted: Tue Apr 06, 2010 8:13 am
by omniuni
Pasting from Word is a terrible, terrible idea, but if you must have it, check out:

http://www.bioinformatics.org/phplabwar ... /index.php

It's a script called htmLawed, and it has a lot of useful functions, including options to TRY to clean up Word's mess.

Re: HTML Encoding

Posted: Tue Apr 06, 2010 8:26 am
by lilalfyalien
I know! I'm trying to integrate it into a CMS and a lot of people insist on typing it up in Word first!

This was just what I was looking for, thanks!

Re: HTML Encoding

Posted: Wed Apr 07, 2010 7:18 am
by roders