MySQL Collation

Posted: Sat Nov 08, 2008 11:46 pm
by alex.barylski
EDIT | It seems collation can only be changed at the server level, not per connection? How are you supposed to support multiple users from different countries if you cannot change this value at the connection level? :(

I'm reading up on internationalizing applications...WOW! What a mouthful. :P

If everyone just assimilated with the American way of life things would be much easier. :P I'm Canadian so I'm allowed to say that...

Anyways...collation...I finally understand what that option means in phpMyAdmin...

At least I assume it means that MySQL will handle collation for my users automatically?

But how does MySQL know to sort the results according to locale? For instance, apparently identical lists will be sorted differently by someone in Sweden than by someone in Germany. I'm not sure which language it was, but as an English speaker I would expect to see an accented "A" among the A's, and some cultures do not work this way.

Does using MySQL take care of this caveat? How does it know what locale my user is in? How do I tell it what locale my user is in?

Re: MySQL Collation

Posted: Sun Nov 09, 2008 4:03 am
by Eran
If you are storing UTF-8 data, it is only reasonable to use a UTF-8 collation. Again, you should only use other collations if you have a good reason. I use utf8_unicode_ci (ci stands for case-insensitive). There are some caveats, so you should read the MySQL manual on it.

You can change the charset and collation per connection. There are a couple of commands that affect those settings: SET NAMES and SET CHARACTER SET (read the manual for more information - http://dev.mysql.com/doc/refman/5.0/en/ ... ction.html). Personally, I force the connection to utf8 so it won't be overridden by the client, using a couple of settings in my configuration file:

default-character-set=utf8
skip-character-set-client-handshake
 
This ensures the data arrives as utf8.
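
For the per-connection route, SET NAMES also accepts an optional COLLATE clause. Here is a rough Python sketch that only builds the statement text. The function name `set_names_statement` is made up for illustration, and you would still need to execute the resulting string through whatever MySQL client library you use, right after connecting.

```python
import re

# Character set and collation names get interpolated into SQL below,
# so only plain identifier characters are accepted.
_IDENT = re.compile(r"[A-Za-z0-9_]+")


def set_names_statement(charset="utf8", collation="utf8_unicode_ci"):
    """Build a SET NAMES statement for the current connection.

    Hypothetical helper: it returns the SQL text and does not talk to
    the server itself.
    """
    for name in (charset, collation):
        if not _IDENT.fullmatch(name):
            raise ValueError("unexpected charset/collation name: %r" % name)
    return "SET NAMES '%s' COLLATE '%s'" % (charset, collation)


if __name__ == "__main__":
    print(set_names_statement())
    print(set_names_statement("utf8", "utf8_swedish_ci"))
```

Note that this sets the connection collation, which affects comparisons between literals sent by the client. A column keeps the collation it was defined with unless the query overrides it.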

Re: MySQL Collation

Posted: Sun Nov 09, 2008 7:48 am
by alex.barylski
Yes, I assume I will be using UTF-8 data exclusively, so the collation will be UTF-8 as well...

Although after reading all those articles on Unicode, etc...I'm stuck wondering...

Swedish and German are the examples given, but they require different sort orders in order to be locale-aware.

Does using a UTF-8 collation on a database table field ensure that, regardless of the user's locale (i.e. German or Swedish), the names would be sorted properly? What about the Names versus Phonebook comparison under MySQL?

I figured you had to somehow dynamically change the collation for a table, or notify MySQL of it, via the locale or something similar? I'm extremely tired right now, having been at this Unicode thing for well over 20 hours with no sleep, so pardon my lack of something. :P
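
One possible approach to the Swedish-versus-German question above: MySQL does not know the user's locale, but a query can override the column's collation with a COLLATE clause in ORDER BY, so the application can pick the collation per request. The sketch below is only an illustration under assumptions: the locale table, the `users` table, the `name` column, and the function names are all hypothetical, and the column is assumed to be stored as utf8.

```python
import re

# Hypothetical mapping from a language code to a MySQL utf8 collation.
# Extend it with whatever language-specific collations your server offers.
COLLATION_BY_LANGUAGE = {
    "sv": "utf8_swedish_ci",
    "es": "utf8_spanish_ci",
}

# Fallback for languages without a dedicated entry (German included here).
DEFAULT_COLLATION = "utf8_unicode_ci"

_IDENT = re.compile(r"[A-Za-z0-9_]+")


def collation_for(locale):
    """Pick a collation from a locale string such as 'sv_SE' or 'de-DE'."""
    language = locale.replace("-", "_").split("_")[0].lower()
    return COLLATION_BY_LANGUAGE.get(language, DEFAULT_COLLATION)


def sorted_names_query(locale, table="users", column="name"):
    """Build a SELECT that sorts one column using the locale's collation."""
    collation = collation_for(locale)
    for ident in (table, column, collation):
        if not _IDENT.fullmatch(ident):
            raise ValueError("unexpected identifier: %r" % ident)
    return "SELECT {c} FROM {t} ORDER BY {c} COLLATE {coll}".format(
        c=column, t=table, coll=collation
    )


if __name__ == "__main__":
    print(sorted_names_query("sv_SE"))
    print(sorted_names_query("de_DE"))
```

The trade-off is that an ORDER BY with an overriding COLLATE generally cannot use an index built with the column's own collation, so this suits modest result sets better than very large ones. A phonebook-style German ordering would need its own entry in the table, if your MySQL version provides such a collation for utf8.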