Character encoding
- shiznatix
- DevNet Master
- Posts: 2745
- Joined: Tue Dec 28, 2004 5:57 pm
- Location: Tallinn, Estonia
- Contact:
OK, I have this strange problem. I have a username saved in my database that has this word in it: Söze
Now, when I view this record in phpMyAdmin all is well: it shows the ö with no problem. But when I pull it out of the database and display it on the website, it won't show the ö and instead shows the 'unknown character' symbol. If I copy it from phpMyAdmin and put it on the site as static text, it shows the ö with no problem.
So what is happening between the database and the site that's screwing it up so much?
Are you using htmlentities() on the data? If so, are you passing the correct character set? It defaults to ISO-8859-1, so if you didn't specify one, this could be the problem.
Also what's the collation of your database table?
If I'm correct, that collation is for Western European languages. I actually have the same problem on one of my web sites (same collation), with Unicode characters showing improperly. I never dug into it to solve it, though. I wonder whether setting the collation to one of the utf8_* collations would solve the problem.
- shiznatix
My charset is such:
DEFAULT CHARSET=latin1
I am sure this is on all my tables because it's just the default. But what I don't understand is why phpMyAdmin can extract the data and show it with no problems at all, while plain, simple MySQL queries on my website mess it up. It's the same server and everything, so I don't understand.
I also have a problem when generating RSS. I have Chinese characters stored in the database, and I pull the content (in Chinese) out of the database and write it into the RSS feed. But when I view the XML file, the characters show up as '????' instead of the encoded text. I use html_entity_decode() on the result from the database. What is the right character set setting for this?
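One common way to end up with literal '????' is that the text passed through a character set that cannot represent it somewhere between the form, the connection, and the table. The sketch below only illustrates that effect at the byte level; it is written in Python rather than PHP so it runs without a database, the helper name `store_in_charset` is made up for the example, and it is an assumption that this is what happened in the RSS case.

```python
def store_in_charset(text: str, charset: str) -> bytes:
    """Encode text for storage, substituting '?' for anything the
    charset cannot represent (a stand-in for a lossy conversion)."""
    return text.encode(charset, errors="replace")


title = "中文标题"

# Latin-1 has no Chinese characters, so each one collapses to '?'.
print(store_in_charset(title, "latin-1"))  # b'????'

# UTF-8 can represent every character, so the text survives a round trip.
print(store_in_charset(title, "utf-8").decode("utf-8") == title)  # True
```

Once the characters have been replaced by '?', nothing on the output side can bring them back, so the data has to be stored again through a UTF-8 path.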
shiznatix wrote:
My charset is such:
DEFAULT CHARSET=latin1
I am sure this is on all my tables because it's just the default. But what I don't understand is why phpMyAdmin can extract the data and show it with no problems at all, while plain, simple MySQL queries on my website mess it up. It's the same server and everything, so I don't understand.

The best way to be sure it'll work is to make sure everything matches. This means you need to check:
The character set of the database table.
The character set of the database client (e.g. PHP: you can set it with mysql_query("SET NAMES utf8"); or use CONVERT() in each SQL statement).
The character set of the HTML page (set with header("Content-Type: text/html; charset=utf-8"); or a meta tag, or both).
The character set of the browser (it defaults to autodetect, which follows the header/meta character set, but if you've changed it to a manual setting things won't display right).
Plus, the character set of the incoming form data needs to be correct, or else you'll end up trying to put wrongly encoded characters into the database (it should be the same as the HTML page unless your form has a language setting).
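To see why a mismatch anywhere in that chain produces the 'unknown character' symbol, here is a minimal byte-level sketch. It is Python rather than PHP so it runs without a database, the `render` helper is hypothetical, and it assumes the connection hands back latin1 bytes while the page is declared as UTF-8.

```python
def render(raw: bytes, page_charset: str) -> str:
    """Decode bytes the way a browser would for the declared page charset,
    showing U+FFFD (the 'unknown character' symbol) for invalid sequences."""
    return raw.decode(page_charset, errors="replace")


# What a latin1 connection would send for the username.
latin1_bytes = "Söze".encode("latin-1")
print(latin1_bytes)  # b'S\xf6ze'

# Page declared as UTF-8: the lone 0xF6 byte is not valid UTF-8.
print(render(latin1_bytes, "utf-8") == "S\ufffdze")  # True

# Page charset matches the bytes: the ö displays correctly.
print(render(latin1_bytes, "latin-1"))  # Söze
```

This would also fit the phpMyAdmin observation, since a tool that sets its connection and page character sets consistently shows the same stored bytes correctly.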
- shiznatix
Whoops, spoke too soon.
That worked for getting the guy's username out of the database without problems, but I have a lot of serialized arrays stored as text in the database, and now pulling those out just screws everything up, with letters turning into crazy symbols. Is there any way around this, or a way to convert everything stored in the database to utf8?
- shiznatix
Weirdan wrote:
What field type are you using to store that data? If it's TEXT, switch to BLOB; it shouldn't autoconvert the data the way TEXT does.

And that was it. Thanks very much. With that and the SET NAMES, everything is working all happy dandy.
By the way, what is the difference between TEXT and BLOB?
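On the TEXT versus BLOB question: in MySQL a TEXT column has a character set, so its contents are converted between the column charset and the connection charset, while a BLOB column is a binary string that is returned byte for byte. The sketch below mimics that difference in Python; both helper functions are hypothetical, and it assumes the serialized data was originally written as UTF-8 bytes into a latin1 column, which may not match every row.

```python
def read_text_column(stored: bytes, column_charset: str,
                     connection_charset: str) -> bytes:
    """Mimic reading a TEXT column: interpret the bytes in the column
    charset, then re-encode them for the connection charset."""
    return stored.decode(column_charset).encode(connection_charset)


def read_blob_column(stored: bytes) -> bytes:
    """Mimic reading a BLOB column: bytes come back untouched."""
    return stored


# Assumption: UTF-8 bytes were stored unchanged in a latin1 column.
stored = "Söze".encode("utf-8")

# TEXT read after SET NAMES utf8: each byte of ö is converted separately.
text_read = read_text_column(stored, "latin-1", "utf-8")
print(text_read.decode("utf-8"))  # SÃ¶ze

# BLOB read: no conversion, so the original text is intact.
print(read_blob_column(stored).decode("utf-8"))  # Söze

# PHP's serialize() records string lengths in bytes, so a conversion
# that changes the byte count also invalidates those length prefixes.
print(len(stored), len(text_read))  # 5 7
```

The trade-off is that BLOB columns have no character set or collation, so comparisons and sorting work on raw byte values rather than on characters.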