Language/locale indicators
- alex.barylski
- DevNet Evangelist
- Posts: 6267
- Joined: Tue Dec 21, 2004 5:00 pm
- Location: Winnipeg
Language/locale indicators
I'm considering adding a multi-language feature to my CMS. I have seen many solutions rely on cookies, but that is obviously not a good thing for SEO/SEF, etc. The URI seems a logical place; maybe even the extension could be used to indicate the desired language.
I'm curious: if you were building a CMS that supported multi-language features, how would you persist that requirement across page requests? Sub-domain?
I've always thought that having a site translated would warrant a different web site, and likely the purchase of a new domain, like domain.fr (French) or domain.cn (Chinese), etc. At least this appears to be how big corporations do it, and it makes sense IMHO, but having this functionality built in is a very common feature request for small/medium CMS builders.
What say you? New installation and database, etc.? Or implement this functionality inline with the other languages?
- kaisellgren
- DevNet Resident
- Posts: 1675
- Joined: Sat Jan 07, 2006 5:52 am
- Location: Lahti, Finland.
Re: Language/locale indicators
I've been planning to do the same. I have a few thoughts:
First of all, let the admin specify a default language. Then, if the site is accessed directly (e.g. site.com), look at the ACCEPT_LANGUAGE header for the user's preferred language and, if that language is available, use it. Otherwise, use the default language specified by the admin. However, if a language is given in the URI (e.g. site.com/en-US/; let the admin specify this URI key for each language), use it regardless.
And my reasoning: ACCEPT_LANGUAGE is given to you by the browser, which typically takes it from its own or the operating system's language settings. So the value stays reasonably accurate even if I'm on a holiday trip in Italy, which leads me to my second point: don't target languages based on IPs. It just won't work. However, if the visitor wants to view a page in English, or arrives from a Google search because he found the content he was looking for, trust the URI and use the language specified there.
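To make the order above concrete, here is a minimal Python sketch of it (URI key first, then ACCEPT_LANGUAGE, then the admin default). The URI keys, language tags and function names are all made up for illustration, and the matching of browser tags against available languages (exact, then by primary subtag) is my own assumption, not something a real CMS necessarily does this way.

```python
from typing import List, Optional

# Hypothetical admin configuration: URI key -> language tag.
URI_KEYS = {"en-US": "en-US", "fi": "fi-FI", "sv": "sv-SE"}
AVAILABLE = list(URI_KEYS.values())
DEFAULT_LANGUAGE = "en-US"


def parse_accept_language(header: str) -> List[str]:
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        tag, _, params = part.strip().partition(";")
        tag = tag.strip()
        if not tag or tag == "*":
            continue
        quality = 1.0  # a tag without a q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def language_from_path(path: str) -> Optional[str]:
    """Look up the first path segment among the admin-defined URI keys."""
    first_segment = path.strip("/").split("/", 1)[0]
    return URI_KEYS.get(first_segment)


def match_available(tag: str) -> Optional[str]:
    """Match a browser tag: exact match first, then same primary subtag."""
    wanted = tag.lower()
    for language in AVAILABLE:
        if language.lower() == wanted:
            return language
    primary = wanted.split("-", 1)[0]
    for language in AVAILABLE:
        if language.lower().split("-", 1)[0] == primary:
            return language
    return None


def resolve_language(path: str, accept_language: Optional[str]) -> str:
    """URI key wins, then the header preference, then the admin default."""
    from_uri = language_from_path(path)
    if from_uri is not None:
        return from_uri
    for tag in parse_accept_language(accept_language or ""):
        matched = match_available(tag)
        if matched is not None:
            return matched
    return DEFAULT_LANGUAGE
```

With this, resolve_language("/fi/products", "en-US,en;q=0.9") picks Finnish because the URI says so, while a request for "/" falls back to the header and finally to the default.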
In my CMS, all links are built through an object, so it's easy for me to pass in some URI data if needed. As for these language domains, I'm currently working on a crap project that uses .fi for Finnish, .com for English and .se for Swedish. I'm okay with the idea behind it, but it does make things harder to implement. The site uses a custom-written, application-specific framework, which is located above the root directory and used by all three domains:
Code:
/home/account/site-framework/
/home/account/public_html/
/home/account/xxx.se/
/home/account/xxx.com/
So, all three domains just contain the main index.php file, and the site assets (e.g. CSS and photos) are loaded from public_html. Anyway, this leads to two options: separate config files to specify the different languages, or a built-in domain detector. I decided to go for the latter in my hobby CMS project, but this specific site uses the former (not my choice). In my CMS, I let the admin specify which domains belong to which language, just like with the language keys in the URI. The problem here is that we have to rely on HTTP_HOST, so I'm not sure if I'll stay with my choice.
Also, admins are given a form where they can fill in some text and then pick the language of that text from a select box. So, they can select English from the drop-down and type the text in English, then select Finnish from the drop-down and type it in Finnish. When they press Save, both versions are saved into the database, which stores the language in a column along with the text. I also provide a quick way to translate text using different translators (mainly Google).
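The "language in a column along with the text" storage could look roughly like the sketch below. It uses Python's sqlite3 with an in-memory database purely for illustration; the table name, column names and the fallback to a default language on load are assumptions of mine, not details from the CMS described above.

```python
import sqlite3
from typing import Dict, Optional


def create_schema(conn: sqlite3.Connection) -> None:
    # One row per (content item, language); hypothetical table layout.
    conn.execute(
        """
        CREATE TABLE content_text (
            content_id INTEGER NOT NULL,
            language   TEXT    NOT NULL,
            body       TEXT    NOT NULL,
            PRIMARY KEY (content_id, language)
        )
        """
    )


def save_texts(conn: sqlite3.Connection, content_id: int, texts: Dict[str, str]) -> None:
    """Save every language version from the form in one go."""
    conn.executemany(
        "INSERT OR REPLACE INTO content_text (content_id, language, body) "
        "VALUES (?, ?, ?)",
        [(content_id, language, body) for language, body in texts.items()],
    )
    conn.commit()


def load_text(
    conn: sqlite3.Connection, content_id: int, language: str, fallback_language: str
) -> Optional[str]:
    """Return the text in the requested language, else in the fallback language."""
    for candidate in (language, fallback_language):
        row = conn.execute(
            "SELECT body FROM content_text WHERE content_id = ? AND language = ?",
            (content_id, candidate),
        ).fetchone()
        if row is not None:
            return row[0]
    return None
```

The composite primary key means pressing Save again simply overwrites the existing version for that language instead of creating a second row.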
Re: Language/locale indicators
A while ago I made a multi-language site in which the language was NOT part of the domain or URI at all.
If no language cookie was set, the language was auto-detected from the visitor's IP (defaulting to English), and visitors could select another language on the site at any time (the selection was saved in a cookie).
This way, if a French visitor posts a link to the site somewhere and a German visitor clicks it, he will automatically get the same page in German, and a U.S. visitor (or someone from an unsupported/unrecognized country) will get the English version.
- kaisellgren
- DevNet Resident
- Posts: 1675
- Joined: Sat Jan 07, 2006 5:52 am
- Location: Lahti, Finland.
Re: Language/locale indicators
Apollo wrote: This way, if a French visitor posts a link to the site somewhere and a German visitor clicks it, he will automatically get the same page in German, and a U.S. visitor (or someone from an unsupported/unrecognized country) will get the English version.
I like the idea of automatically selecting the correct language, but it doesn't work so well when I visit the site, as my IP is sometimes located in Russia :/. You would need a really precise IP detector. But what if we combine ACCEPT_LANGUAGE and cookies? This way the chosen language is right, links can be pasted and the settings are kept. Are there drawbacks to this scenario?
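A rough sketch of that cookie-plus-header combination in Python: an explicit choice stored in a cookie wins, otherwise the first supported language in the header is used, otherwise the default. The cookie name "lang", the supported language list and the function names are hypothetical, and the header is read in listed order with q-values ignored to keep it short.

```python
from http.cookies import SimpleCookie
from typing import Optional

SUPPORTED = ("en", "fr", "de")  # hypothetical list of site languages
DEFAULT_LANGUAGE = "en"
COOKIE_NAME = "lang"            # hypothetical cookie name


def language_from_cookie(cookie_header: Optional[str]) -> Optional[str]:
    """Return the visitor's saved choice, if the cookie holds a supported value."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(COOKIE_NAME)
    if morsel is not None and morsel.value in SUPPORTED:
        return morsel.value
    return None


def first_supported(accept_language: Optional[str]) -> Optional[str]:
    """Pick the first supported primary language, in header order."""
    for part in (accept_language or "").split(","):
        primary = part.split(";", 1)[0].strip().lower().split("-", 1)[0]
        if primary in SUPPORTED:
            return primary
    return None


def choose_language(cookie_header: Optional[str], accept_language: Optional[str]) -> str:
    """Saved cookie choice first, then the browser header, then the default."""
    return (
        language_from_cookie(cookie_header)
        or first_supported(accept_language)
        or DEFAULT_LANGUAGE
    )
```

One thing this makes visible: the same pasted link can render in different languages for different visitors, so a crawler without cookies only ever sees whatever its header (or the default) selects.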
- iankent
- Forum Contributor
- Posts: 333
- Joined: Mon Nov 16, 2009 4:23 pm
- Location: Wales, United Kingdom
Re: Language/locale indicators
kaisellgren wrote: I like the idea of automatically selecting the correct language, but it's not so good I visit the site as my IP is sometimes located in Russia :/. You are going to need a really precise IP detector, but what if we combine ACCEPT_LANGUAGE and cookies? This way the chosen language is right, links can be pasted and the settings are kept. Are there drawbacks to this scenario?
I have to agree with you, Kai: IP is not a good way of detecting location (e.g. UK AOL customers appear to be in the US), and certainly not of detecting which language the person speaks. Headers will nearly always be right (as Kai says, they come from the user's OS language settings), but when following a link to a language-specific page from a search engine, I'd expect to get the language it's listed in.
For that specific scenario (i.e., the URL tells you one language while the header tells you another) you could display links to the other language pages.
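Something like the following Python sketch could build those links. It assumes the language key is the first path segment (as in the URI scheme discussed earlier in the thread); the language keys, labels and function name are invented for the example.

```python
from typing import List, Tuple

# Hypothetical language keys and the labels shown in the switcher.
LANGUAGES = {"en": "English", "fi": "Suomi", "sv": "Svenska"}


def alternate_links(path: str, url_language: str, preferred_language: str) -> List[Tuple[str, str]]:
    """Return (label, href) pairs for the other language versions of a page.

    Assumes the first path segment is the language key. If the browser's
    preferred language differs from the URL language, it is listed first.
    """
    segments = path.strip("/").split("/")
    rest = "/".join(segments[1:])  # everything after the language key
    others = [code for code in LANGUAGES if code != url_language]
    # Stable sort: False sorts before True, so the preferred language moves up.
    others.sort(key=lambda code: code != preferred_language)
    return [(LANGUAGES[code], f"/{code}/{rest}") for code in others]
```

The page itself is still served in the URL's language; the header only decides which alternative gets offered most prominently.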
Re: Language/locale indicators
The best thing for SEO is sub-domains for each version of the site. (source: Google on canonical URLs)
- alex.barylski
- DevNet Evangelist
- Posts: 6267
- Joined: Tue Dec 21, 2004 5:00 pm
- Location: Winnipeg
Re: Language/locale indicators
Quote: The best thing for SEO is sub-domains for each version of the site.
fr.domain.com would not be acceptable for 95% of the cases I work with. Sub-domains are almost always used for sub-systems such as blogs, forums, etc. Persisting the language in the sub-domain would introduce all sorts of issues, which is why most corporations use the TLD to set the language/locale.
What is interesting about this problem is that if you look at dell.ca, they default to English but offer the option of switching to French. This happens under the same domain, dell.ca, so obviously they implement some kind of multi-language functionality.
Cheers,
Alex
Re: Language/locale indicators
You could have
fr.blog.example.com
en.blog.example.com
The TLDs aren't available in every case, and Google treats them as equally distinct in each case (there's no difference between a sub-domain and a TLD in Google's eyes, AFAIK).
Take for example this page, which ranks well for its keywords:
alumni.nutrition.tufts.edu/?pid=66&c=118
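For either host-based variant in this thread (a language sub-domain like fr.blog.example.com, or one domain per language), the detection itself could be sketched in Python as below. The two mappings are hypothetical admin settings and the function name is made up; since the Host header comes from the client, unknown hosts return None so the caller can fall back to something else.

```python
from typing import Optional

# Hypothetical admin-defined mappings.
SUBDOMAIN_LANGUAGES = {"fr": "fr", "en": "en"}
DOMAIN_LANGUAGES = {"example.fi": "fi", "example.se": "sv", "example.com": "en"}


def language_from_host(host: str) -> Optional[str]:
    """Derive a language from the Host header value, or None if unknown.

    Checks a leading language sub-domain first, then the registered domain.
    The naive port split does not handle IPv6 literals.
    """
    hostname = host.split(":", 1)[0].strip().lower().rstrip(".")
    if hostname.startswith("www."):
        hostname = hostname[4:]
    first_label = hostname.split(".", 1)[0]
    if first_label in SUBDOMAIN_LANGUAGES:
        return SUBDOMAIN_LANGUAGES[first_label]
    for domain, language in DOMAIN_LANGUAGES.items():
        if hostname == domain or hostname.endswith("." + domain):
            return language
    return None
```

Checking the sub-domain before the domain means fr.blog.example.com resolves to French even though example.com is mapped to English.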
- kaisellgren
- DevNet Resident
- Posts: 1675
- Joined: Sat Jan 07, 2006 5:52 am
- Location: Lahti, Finland.
Re: Language/locale indicators
I crawled through a few websites, and here are the results:
Sub-domain language targeting:
http://fin.afterdawn.com/, http://sv.afterdawn.com/, http://www.afterdawn.com/
TLD language targeting:
http://www.asus.fi/, http://www.asus.se/, http://www.asus.com/
http://www.nvidia.fr/, http://www.nvidia.com/, http://www.nvidia.it/
http://www.google.fi/, http://www.google.fr/, http://www.google.com/
URI language targeting:
http://www.amd.com/fr/, http://www.amd.com/us/, http://www.amd.com/de/
http://www.microsoft.com/fi, http://www.microsoft.com, http://www.microsoft.com/fr
http://www.logitech.com/index.cfm/home/&cl=se,sv, http://www.logitech.com/index.cfm/home/&cl=fi,fi
http://www.ocztechnology.com/jp/, http://www.ocztechnology.com
http://westerndigital.com/de/, http://westerndigital.com/en/
http://www.seagate.com/www/es-es/, http://www.seagate.com/www/de-de/
Cookies only:
http://www.megaupload.com/
http://www.corsair.com/