Encode/Decode HTML

Posted: Wed Aug 19, 2009 5:24 pm
by tecktalkcm0391
Hello,

I am having trouble figuring out which functions I should use to save HTML data to a database and then retrieve it, turning it back into HTML.

Thanks!

Re: Encode/Decode HTML

Posted: Wed Aug 19, 2009 5:51 pm
by infolock
I usually use serialize to store HTML in a table field. Others have their own methods too, though.

Re: Encode/Decode HTML

Posted: Wed Aug 19, 2009 6:01 pm
by Eran
Why serialize HTML? What does that achieve?

Basically, if you trust the input, you can store HTML directly (just remember to escape it with the proper database functions). If it's user input and you need to protect against XSS attacks, you can use a filtering library such as HTML Purifier - http://htmlpurifier.org/

Re: Encode/Decode HTML

Posted: Wed Aug 19, 2009 8:03 pm
by tecktalkcm0391
Okay, which functions should I use for the "to mysql" and "from mysql" steps, cause my HTML keeps getting messed up when it's saved.
Thanks for the htmlpurifier.org link.

Re: Encode/Decode HTML

Posted: Wed Aug 19, 2009 8:16 pm
by Eran
"cause my HTML keeps getting messed up when it's saved."
Can you elaborate on what "messed up" means in this context and also give code examples of how you insert / retrieve data from the database?

Re: Encode/Decode HTML

Posted: Wed Aug 19, 2009 9:18 pm
by tecktalkcm0391
htmlentities is used to put the data in the database, and html_entity_decode is used before it's displayed in the FCKeditor.

If the HTML is

Code:

<strong>PAGE FAILED TO LOAD</strong>
I get back

Code:

<strong>PAGE&nbsp;FAILED&nbsp;TO&nbsp;LOAD</strong>
when it comes back from the database
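
For what it's worth, here is a minimal sketch of that round trip. It assumes UTF-8 input and passes ENT_QUOTES and 'UTF-8' explicitly (neither appears in the post above), and the variable names are made up for illustration. It suggests that htmlentities leaves ordinary spaces alone and only emits &nbsp; when the input already contains a non-breaking space (U+00A0), which a WYSIWYG editor might produce (an assumption, not something confirmed in this thread).

```php
<?php
// Hypothetical input with ordinary ASCII spaces.
$plain = '<strong>PAGE FAILED TO LOAD</strong>';

// Same text, but with non-breaking spaces (U+00A0, UTF-8 bytes C2 A0).
$withNbsp = "<strong>PAGE\xC2\xA0FAILED\xC2\xA0TO\xC2\xA0LOAD</strong>";

// Encode both strings the way they would be encoded before saving.
$encodedPlain = htmlentities($plain, ENT_QUOTES, 'UTF-8');
$encodedNbsp  = htmlentities($withNbsp, ENT_QUOTES, 'UTF-8');

// Decode the first one again, as would happen before display.
$decodedPlain = html_entity_decode($encodedPlain, ENT_QUOTES, 'UTF-8');

echo $encodedPlain, "\n"; // ordinary spaces survive untouched
echo $encodedNbsp, "\n";  // non-breaking spaces show up as &nbsp;
echo $decodedPlain, "\n"; // identical to $plain
```

If the second line of output is the one that matches what comes back, the non-breaking spaces were probably in the data before htmlentities ever saw it.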

Re: Encode/Decode HTML

Posted: Wed Aug 19, 2009 10:35 pm
by AlanG
Information in your database should be stored in a neutral manner and not biased towards a certain medium. Basically HTML is used for displaying information and shouldn't be stored along with the information, but rather the info should be encased in HTML only when it is being viewed by hypertext media (such as a browser).

serialize and unserialize are functions for transforming a value (such as an array or object) into a string and back, while still maintaining its structure. They're handy for storing arrays and objects in databases, files, etc., but they add a lot of overhead. They won't suit your needs in this case.
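
To illustrate, a rough sketch of what serialize does (the array contents and variable names are invented for the example): an array round-trips with its structure intact, while a plain HTML string just gets wrapped in a type and length prefix, which adds nothing useful.

```php
<?php
// Hypothetical structured data: this is the kind of value serialize is meant for.
$settings = array('title' => 'Home', 'tags' => array('php', 'html'));

$storedSettings   = serialize($settings);          // string representation of the array
$restoredSettings = unserialize($storedSettings);  // back to an equivalent array

// A plain HTML string is already a string; serialize only adds a wrapper around it.
$html       = '<strong>PAGE FAILED TO LOAD</strong>';
$storedHtml = serialize($html);

echo $storedSettings, "\n";
echo $storedHtml, "\n";
var_dump($restoredSettings === $settings);
```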

htmlentities and html_entity_decode are what you need. Unfortunately, a non-breaking space is also converted to its HTML entity (&nbsp;); ordinary spaces are left alone. May I suggest a str_replace call to solve that. :) A custom function would be ideal.

Code:

<?php
// Encode HTML special characters as entities, then turn any &nbsp;
// entities back into ordinary spaces.
function html_encode($str) {
    $str = htmlentities($str);
    $str = str_replace('&nbsp;', ' ', $str); // replace every &nbsp; with a single space

    return $str;
}
?>

Re: Encode/Decode HTML

Posted: Thu Aug 20, 2009 2:48 am
by Eran
What's the point of using htmlentities to store data in the database if you intend to decode it back? The database doesn't care. The only reason to use htmlentities before storing the HTML would be if it will always be output in that format and never as plain HTML.

Re: Encode/Decode HTML

Posted: Thu Aug 20, 2009 2:57 am
by VladSun
I agree with pytrin.
Escape your string by using a DB-specific escape function (e.g. mysql_real_escape_string()) and store it in the DB.
Later, you may (or may not) display it as HTML or as plain text (i.e. by using htmlentities()).
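
A minimal sketch of that approach, under a few assumptions: it swaps in a PDO prepared statement for mysql_real_escape_string() (the mysql_* extension was removed in PHP 7), and it uses an in-memory SQLite database instead of MySQL so that it runs on its own. The table name, column name, and variables are hypothetical.

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL so the sketch is self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE pages (id INTEGER PRIMARY KEY, body TEXT)'); // hypothetical table

$html = '<strong>PAGE FAILED TO LOAD</strong>';

// Store the HTML as-is. The bound parameter keeps the value out of the SQL text,
// so no htmlentities() call is needed on the way in.
$insert = $pdo->prepare('INSERT INTO pages (body) VALUES (:body)');
$insert->execute(array(':body' => $html));

// Read it back; it should be byte-for-byte what was stored.
$stored = $pdo->query('SELECT body FROM pages LIMIT 1')->fetchColumn();

// Decide at output time how to present it.
$asHtml = $stored;                                    // rendered as markup
$asText = htmlentities($stored, ENT_QUOTES, 'UTF-8'); // shown as literal text

echo $asHtml, "\n";
echo $asText, "\n";
```

With MySQL, only the DSN and credentials passed to the PDO constructor would change; the store-raw, encode-on-output pattern stays the same.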