Every once in a while I run into the same problem. Unicode/UTF-8. Not so uni for me
So now I have this webpage in which everything works fine. Content displays well, both on the front and in the backend. Headers sent:
Content-Type: text/html; charset=UTF-8
Meta tag
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
However, when I look in phpMyAdmin, some of the characters are messed up. The same happens when I do a dump of the db from phpMyAdmin.
Apollo, thanks for the answer. You say
é (e with acute accent) = C3 A9
‘ (left single quotation mark) = E2 80 98
may I ask, where do you find this? Is there a good table of the characters somewhere that you use? Is "C3 A9" a standard notation for certain characters?
By the way, on the website with the problems I just did an SQL dump using a plugin for WordPress. And now the characters do display fine when I open the .sql file in my text editor. So it seems to me it really is a problem in phpMyAdmin.
matthijs wrote:may I ask, where do you find this? Is there a good table of the characters somewhere that you use? Is "C3 A9" a standard notation for certain characters?
It's just hex notation of the bytes involved.
I got them by looking at the hex view of my UTF-8 compatible editor. In practical situations you don't enter binary UTF-8 data by hand though; you typically get it from user input (in an online form or whatever, on a page that specifies UTF-8 encoding in its HTML header).
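If you don't have an editor with a hex view handy, here is a minimal Python sketch that prints the same byte values. The helper name `utf8_hex` is made up for this example; it only shows that "C3 A9" is simply the UTF-8 bytes of the character written in hexadecimal.

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of `text` as space-separated uppercase hex."""
    return " ".join(f"{b:02X}" for b in text.encode("utf-8"))


print(utf8_hex("\u00e9"))  # é (e with acute accent)      -> C3 A9
print(utf8_hex("\u2018"))  # ‘ (left single quotation mark) -> E2 80 98
print(utf8_hex("e"))       # plain ASCII e                 -> 65
```

Plain ASCII characters stay one byte in UTF-8, which is why only the accented and "curly" characters tend to get mangled.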
Website in browser with UTF-8 headers displays, correctly:
ö and ô
phpMyAdmin and SQL dump files display:
Ã¶ Ã´
What is what? Are the characters in the first line ( ö and ô ) UTF-8? Or Latin-1? And the messed up characters in the second line?
This is so annoying. The characters display fine on the website. However, when I do a backup, I want to be sure those sql files are correct and don't contain all those messy characters.
If I ever meet the one having invented character sets, I'm not sure we'll have a very pleasant conversation
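On the "what is what" question, a small Python sketch of one likely explanation. It assumes the stored bytes are correct UTF-8 and that the tool displaying them decodes them as Latin-1; that is a guess about what happens here, not something confirmed in this thread. The function names are made up for the example.

```python
def misread_as_latin1(text: str) -> str:
    """Encode as UTF-8, then (wrongly) decode those same bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Undo the misreading: recover the raw bytes, then decode them as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


for ch in ("\u00f6", "\u00f4"):  # ö and ô
    garbled = misread_as_latin1(ch)
    # Each two-byte UTF-8 character shows up as two Latin-1 characters.
    print(ch, "->", garbled, "->", repair(garbled))
```

Under that assumption the data itself is fine and only the display is wrong: ö is the two bytes C3 B6, and a Latin-1 reader shows those as Ã followed by ¶.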
matthijs wrote:If I ever meet the one having invented character sets, I'm not sure we'll have a very pleasant conversation
There will always be some kind of encoding / charset as long as text is stored as a bit sequence.
Actually it's a requirement to specify the encoding when transmitting or storing text data. Failing to do so (or indicating the wrong encoding) is the application developer's fault. Multiple character sets (and encodings thereof) exist for legacy and efficiency reasons. Given the price of storage 40 years ago, choosing a limited but compact encoding over an all-encompassing but wasteful one was a no-brainer. And this is still an issue for various small devices today.
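To put rough numbers on "compact" versus "all-encompassing", here is a small Python sketch comparing how many bytes one character takes in three encodings. The choice of Latin-1, UTF-8 and UTF-32 is mine, purely for illustration, and `encoded_size` is a made-up helper.

```python
from typing import Optional


def encoded_size(ch: str, encoding: str) -> Optional[int]:
    """Bytes needed for `ch` in `encoding`, or None if it cannot be represented."""
    try:
        return len(ch.encode(encoding))
    except UnicodeEncodeError:
        return None  # the character does not exist in this character set


for ch in ("e", "\u00e9", "\u2018"):  # e, é, ‘
    sizes = {enc: encoded_size(ch, enc) for enc in ("latin-1", "utf-8", "utf-32-be")}
    print(repr(ch), sizes)
```

Latin-1 always needs one byte but simply cannot store ‘, UTF-32 can store everything at four bytes each, and UTF-8 sits in between with one to four bytes per character.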
Yes, it's phpMyAdmin that has it backwards. If you can't change the ini, then the SET NAMES command achieves a similar result (only it requires a query for each request). You might need also
Ok just created a quick script with headers set to utf-8 and retrieving the content from the db displays the characters fine. So it's indeed phpMyAdmin, as pytrin said
jayshields wrote:It might be to do with multibyte strings - does phpMyAdmin warn you that this extension is not enabled?
I don't see any warnings. Where could I see that?
If you don't see any warnings (and your version of phpMyAdmin is new enough) then you've got it installed already. You can check using phpinfo() if you need to (just search the resulting page for mbstring).