UTF-8 MySQL tables, HTML output and more...
Posted: Tue Aug 17, 2010 10:17 am
I have a very confusing issue that I hope someone with more experience dealing with UTF-8 can clear up for me.
I have some legacy data that was entered using Access. I exported this data into MySQL tables and never touched the charset of those tables, so I assume they remained latin1.
Some of the characters entered in Access were special characters, such as ° and ®, among others.
When I query this data from the MySQL table, all is well, but I need to convert the result into JSON, and the default JSON functions (json_encode/json_decode) deal with UTF-8 only. So before passing the array to these functions I iterate over it and apply htmlentities() or, more recently, utf8_encode() to each value, which seems to satisfy the JSON functions. Without this step, json_encode() chokes on arrays that have fields containing those special characters.
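For reference, here is a minimal sketch of the workaround described above. The helper name encode_rows_as_utf8() and the sample rows are made up for illustration, and it assumes the values really arrive from the query as latin1 (ISO-8859-1) bytes. It uses mb_convert_encoding(), which needs the mbstring extension and performs the same latin1-to-UTF-8 conversion as utf8_encode().

```php
<?php
// Hypothetical helper (not part of PHP or any library): converts every
// string value in a result array from ISO-8859-1 (latin1) to UTF-8 so
// that json_encode() will accept it.
function encode_rows_as_utf8(array $rows): array
{
    array_walk_recursive($rows, function (&$value) {
        if (is_string($value)) {
            // Same conversion utf8_encode() performs: ISO-8859-1 -> UTF-8.
            $value = mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
        }
    });
    return $rows;
}

// Made-up sample rows holding raw latin1 bytes:
// 0xB0 is the degree sign, 0xAE is the registered-trademark sign.
$rows = [
    ['id' => 1, 'label' => "90\xB0 elbow"],
    ['id' => 2, 'label' => "Acme\xAE"],
];

// Passing $rows to json_encode() directly fails on these bytes because
// they are not valid UTF-8 (the exact failure depends on the PHP version).
$json = json_encode(encode_rows_as_utf8($rows));
echo $json, "\n";
```

Note that this only works if the input is latin1. If the strings are already UTF-8, running them through the same conversion double-encodes them, so the conversion should happen in exactly one place.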
Then I got to thinking that I could probably avoid manually converting the charset each time I return an array by simply changing the charset of the fields from latin1 to UTF-8. So I exported the old data, changed the charset to UTF-8 on all the required fields, and re-imported the data into the new table.
The problem is, I still need to iterate over the array and utf8_encode() each text field before passing the array to json_encode(). So I am wondering whether I need a script (SQL or PHP) that will convert all the characters in the DB to UTF-8 as well. Is it not enough to convert just the field? I assumed that would convert the existing characters at import.
Any ideas? Input? Insight?
Cheers,
Alex