Anyone have any info on this charset and entities issue?
Thanks
Ben
How to store character entities in a database - best practice
Which issue? The MySQL injection issue?
Search around http://shiflett.org/ and you'll find that one there somewhere...
Hi guys
I've got the MySQL injection thing sorted now. The final thing is converting various HTML entities into their characters, and storing those in MySQL.
I'm using...
Code: Select all
html_entity_decode()
and
function numericentitieshtml($str) {
    return utf8_encode(preg_replace('/&#(\d+);/e', 'chr(str_replace(";", "", str_replace("&#","","$0")))', $str));
}
To change HTML alpha entities and HTML numeric entities, respectively, into their proper characters.
The problem is that the numericentitieshtml() function (from the PHP manual somewhere) is converting the HTML numeric entities, but into the wrong characters.
So running the following entities through my numericentitieshtml() function (without the periods obviously)...
Code: Select all
&.#8482; becomes a " quote mark whereas it should be a TM symbol
&.#8364; becomes a ¬ character whereas it should be a euro symbol
I'm guessing this is because the TM and euro symbols aren't in the UTF-8 charset.
So what charset do I need to use instead to get these chars into my database properly?
The columns in my database tables all currently have their collation set to latin1_swedish_ci.
Hope this makes things a bit clearer!
Ben
Last edited by batfastad on Fri Oct 13, 2006 5:14 am, edited 1 time in total.
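A possible explanation for the wrong characters, offered as a sketch and not a definitive answer: chr() only ever returns a single byte, so a code point above 255 wraps around. 8482 mod 256 is 34 (the " quote mark) and 8364 mod 256 is 172 (the ¬ character in Latin-1), which matches the output described above. UTF-8 can represent both the TM and euro symbols, so the charset itself is probably not the culprit. The snippet below assumes a PHP version whose html_entity_decode() accepts 'UTF-8' as its charset argument. The function name decode_entities_utf8() is made up for illustration and is not from the original post.
Code: Select all
```php
<?php
// Hypothetical replacement for numericentitieshtml().
// html_entity_decode() handles both named entities (&euro;) and
// numeric entities (&#8364;) and, with 'UTF-8' as the charset,
// returns multi-byte UTF-8 instead of a single wrapped-around byte.
function decode_entities_utf8($str) {
    return html_entity_decode($str, ENT_QUOTES, 'UTF-8');
}

// Why the original function goes wrong: chr() works on one byte only.
$wrapped_tm   = 8482 % 256; // 34, the code for the " quote mark
$wrapped_euro = 8364 % 256; // 172, the code for the ¬ character in Latin-1

$tm   = decode_entities_utf8('&#8482;');
$euro = decode_entities_utf8('&#8364;');

// Print the raw bytes as hex so the result does not depend on
// the terminal or browser encoding.
echo bin2hex($tm), "\n";   // e284a2 (UTF-8 for U+2122, the TM symbol)
echo bin2hex($euro), "\n"; // e282ac (UTF-8 for U+20AC, the euro symbol)
```
For the decoded strings to survive the trip into MySQL, the connection and the columns would also need to agree on the character set, for example a utf8 connection (SET NAMES utf8) together with utf8 columns. That part is an assumption about the setup, since the thread only mentions the latin1_swedish_ci collation.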