Language/locale indicators

alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg

Language/locale indicators

Post by alex.barylski »

I'm considering adding a multi-language feature to my CMS. While I have seen many solutions rely on cookies, this is obviously not a good thing for SEO/SEF, etc. The URI seems a logical place; maybe even the extension could be used to indicate the desired language.

I'm curious: if you were building a CMS that supported multi-language features, how would you persist that requirement across page requests? Sub-domain?

I've always thought that having a site translated would warrant a different web site, and likely the purchase of a new domain, like domain.fr (French) or domain.cn (Chinese), etc. At least this appears to be how big corporations do it, and it makes sense IMHO, but it's a very common feature request for small/medium CMS builders to have this functionality built in.

What say you? New installation and database, etc? Or implement this functionality inline with other languages?
kaisellgren
DevNet Resident
Posts: 1675
Joined: Sat Jan 07, 2006 5:52 am
Location: Lahti, Finland.

Re: Language/locale indicators

Post by kaisellgren »

I've been planning to do the same. I have a few thoughts in mind:

First of all, let the admin specify a default language. Then, if the site is accessed directly (i.e. site.com), look at the ACCEPT_LANGUAGE header for the user's preferred language and, if that language is available, use it. Otherwise, use the default language specified by the admin. However, if there's a language given in the URI (e.g. site.com/en-US/; let the admin specify this URI key for each language), then use it regardless.

And my reasoning: ACCEPT_LANGUAGE is given to you by the browser, which bases it on your operating system settings. So the value stays accurate even if I'm on a holiday trip in Italy, which leads me to my second point: don't target languages based on IPs. It just won't work. However, if the user wants to view a page in English, or arrives from a Google search because they found the content they were looking for, trust the URI and use the language specified there.
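The precedence described above (URI key first, then the browser's header, then the admin's default) can be sketched roughly as follows. This is only an illustration in Python, not anyone's actual CMS code: the function names and the `uri_keys` mapping are hypothetical, and in PHP the header value would come from `$_SERVER['HTTP_ACCEPT_LANGUAGE']`.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language value, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip()
        if not tag or tag == "*":
            continue
        quality = 1.0  # a tag without an explicit q value counts as q=1
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def resolve_language(path, accept_language, uri_keys, default):
    """uri_keys maps admin-chosen URI keys (e.g. 'en-US') to language codes."""
    # 1. A language key in the URI wins regardless of anything else.
    first_segment = path.strip("/").split("/")[0]
    for key, language in uri_keys.items():
        if key.lower() == first_segment.lower():
            return language
    # 2. Otherwise walk the browser's preferences in order.
    available = {language.lower(): language for language in uri_keys.values()}
    for tag in parse_accept_language(accept_language or ""):
        tag = tag.lower()
        if tag in available:
            return available[tag]
        primary = tag.split("-")[0]  # 'sv-SE' falls back to 'sv'
        if primary in available:
            return available[primary]
    # 3. Nothing matched: use the admin's default language.
    return default
```

For example, with `uri_keys = {"en-US": "en", "fi": "fi", "sv": "sv"}`, a request for `/en-US/about` resolves to English even when the header asks for Finnish, while a request for `/` with `sv-SE,sv;q=0.9` resolves to Swedish.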

In my CMS, all links are built through an object, so it's easy for me to pass in some URI data if needed. As for these language domains, I'm currently working on a crap project that uses .fi for Finnish, .com for English and .se for Swedish. I'm okay with the idea behind it, but it does make things harder to implement. The site uses a custom-written, application-specific framework, which is located above the root directory and used by all three domains:
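To illustrate why building every link through one object makes this easy, here is a minimal sketch. The `LinkBuilder` class, its parameters and the example URLs are all hypothetical and only show the idea of prefixing the language key in one place.

```python
from urllib.parse import urlencode


class LinkBuilder:
    """Builds site URLs and prefixes the active language key, if any."""

    def __init__(self, base_url, language_key=None):
        self.base_url = base_url.rstrip("/")
        self.language_key = language_key

    def build(self, path, **query):
        # Normalise the path into segments so stray slashes do not matter.
        segments = [segment for segment in path.split("/") if segment]
        if self.language_key:
            segments.insert(0, self.language_key)
        url = self.base_url + "/" + "/".join(segments)
        if query:
            url += "?" + urlencode(query)
        return url
```

Because templates never concatenate URLs by hand, switching the whole site from URI keys to some other scheme would only touch this one class.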

/home/account/site-framework/
/home/account/public_html/ 
/home/account/xxx.se/
/home/account/xxx.com/
So, all three domains just contain the main index.php file, and the site assets (e.g. CSS and photos) are loaded from public_html. Anyway, this leads to two options: separate config files to specify the different languages, or a built-in domain detector. I decided to go for the latter in my hobby CMS project, but this specific site used the former (not my choice). In my CMS, I let the admin specify which domains belong to which language, just like with the language keys in the URI. The problem here is that we have to rely on HTTP_HOST, which is taken from the client's request, so I'm not sure if I will stay with my choice.

Also, admins are given a form where they can fill in some text and then pick the language of that text from a select box. So, they can select English from the drop-down and type text in English. After that, they select Finnish from the drop-down and type the Finnish text. When they press Save, both versions are saved into the database, which stores the language in a column along with the text. I also provide a quick way to translate text using different translators (mainly Google).
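A built-in domain detector along these lines could look like the sketch below. It assumes the admin's domain list is available as a plain mapping; `language_for_host` and `domain_languages` are hypothetical names. Since HTTP_HOST comes from the request, the value is only ever used as a lookup key against that list, and anything unknown falls back to the default.

```python
def language_for_host(http_host, domain_languages, default):
    """Map the request's Host value to a language via an admin-defined list."""
    host = (http_host or "").strip().lower()
    # Drop an optional port and trailing dot ("xxx.se:8080" -> "xxx.se").
    # IPv6 literals are not handled; they simply fall through to the default.
    host = host.partition(":")[0].rstrip(".")
    if host.startswith("www."):
        host = host[4:]
    # Unknown or forged hosts never select a language on their own.
    return domain_languages.get(host, default)
```

With `{"xxx.fi": "fi", "xxx.com": "en", "xxx.se": "sv"}` as the mapping, a Host value of `WWW.xxx.se:8080` yields Swedish, and an unrecognised host yields the default.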
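The storage side of that form might be sketched like this, using SQLite only to keep the example self-contained. The table and column names are made up, and falling back to the default language when a translation is missing is an assumption, not something described above.

```python
import sqlite3


def create_schema(conn):
    # One row per (content item, language); the language lives in its own column.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS content_text (
               content_id INTEGER NOT NULL,
               language   TEXT NOT NULL,
               body       TEXT NOT NULL,
               PRIMARY KEY (content_id, language)
           )"""
    )


def save_versions(conn, content_id, versions):
    """versions maps a language code to the text typed for that language."""
    with conn:  # commit all language versions together
        conn.executemany(
            "INSERT OR REPLACE INTO content_text (content_id, language, body) "
            "VALUES (?, ?, ?)",
            [(content_id, language, body) for language, body in versions.items()],
        )


def load_text(conn, content_id, language, default_language):
    # Assumption: show the default-language text if no translation exists.
    for candidate in (language, default_language):
        row = conn.execute(
            "SELECT body FROM content_text WHERE content_id = ? AND language = ?",
            (content_id, candidate),
        ).fetchone()
        if row:
            return row[0]
    return None
```

Pressing Save with English and Finnish filled in would then correspond to a single `save_versions(conn, 1, {"en": "...", "fi": "..."})` call.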
Apollo
Forum Regular
Posts: 794
Joined: Wed Apr 30, 2008 2:34 am

Re: Language/locale indicators

Post by Apollo »

A while ago I made a multi-language site in which the language was NOT part of the domain or URI at all.

If no language cookie was set, the language was auto-detected from the visitor's IP (defaulting to English), and visitors could select another language on the site at any time (the selection was saved in a cookie).
This way, if a French visitor posts a link to the site somewhere and a German visitor clicks it, the German visitor automatically gets the same page in German, and a U.S. visitor (or someone from an unsupported/unrecognized country) gets the English version.
kaisellgren
DevNet Resident
Posts: 1675
Joined: Sat Jan 07, 2006 5:52 am
Location: Lahti, Finland.

Re: Language/locale indicators

Post by kaisellgren »

Apollo wrote: This way, if a French visitor posts a link to the site somewhere and a German visitor clicks it, the German visitor automatically gets the same page in German, and a U.S. visitor (or someone from an unsupported/unrecognized country) gets the English version.
I like the idea of automatically selecting the correct language, but it's not so good when I visit the site, as my IP is sometimes located in Russia :/. You are going to need a really precise IP detector. But what if we combine ACCEPT_LANGUAGE and cookies? This way the chosen language is right, links can be pasted and the settings are kept. Are there drawbacks to this scenario?
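A rough sketch of that combination: an explicit choice stored in a cookie wins, otherwise the language already detected from ACCEPT_LANGUAGE is used, otherwise a default. The cookie name `lang` and both function names are assumptions; the cookie value is checked against the supported languages because it is client-controlled.

```python
from http.cookies import CookieError, SimpleCookie


def language_from_cookie(cookie_header, supported, cookie_name="lang"):
    """Return the visitor's saved language, or None if absent or unsupported."""
    try:
        cookies = SimpleCookie(cookie_header or "")
    except CookieError:
        return None
    morsel = cookies.get(cookie_name)
    if morsel is None:
        return None
    # Never trust the raw value: only accept languages the site really has.
    return morsel.value if morsel.value in supported else None


def choose_language(cookie_header, header_language, supported, default="en"):
    """header_language is whatever was detected from ACCEPT_LANGUAGE."""
    chosen = language_from_cookie(cookie_header, supported)
    if chosen:
        return chosen
    if header_language in supported:
        return header_language
    return default
```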
iankent
Forum Contributor
Posts: 333
Joined: Mon Nov 16, 2009 4:23 pm
Location: Wales, United Kingdom

Re: Language/locale indicators

Post by iankent »

kaisellgren wrote: I like the idea of automatically selecting the correct language, but it's not so good when I visit the site, as my IP is sometimes located in Russia :/. You are going to need a really precise IP detector. But what if we combine ACCEPT_LANGUAGE and cookies? This way the chosen language is right, links can be pasted and the settings are kept. Are there drawbacks to this scenario?
I have to agree with you, Kai: IP is not a good way of detecting location (e.g. UK AOL customers appear to be in the US), and certainly not which language the person may be speaking. Headers will nearly always be right (as Kai says, they come from the user's OS language settings), but if I follow a link to a language-specific page from a search engine, I'd expect to get the language it's listed in.

For that specific scenario (i.e., the URL tells you one language while the header tells you another) you could display links to the other language pages.
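One way to produce those links, assuming the language key is the first path segment as in the site.com/en-US/ example earlier in the thread. The function names are hypothetical, and `preferred_key` stands for whatever key the header detection came up with.

```python
def alternate_language_links(path, uri_keys):
    """Return (current_key, {other_key: path}) for a path like '/en-US/about'."""
    segments = [segment for segment in path.split("/") if segment]
    current = segments[0] if segments and segments[0] in uri_keys else None
    rest = segments[1:] if current else segments
    links = {}
    for key in uri_keys:
        if key != current:
            links[key] = "/" + "/".join([key] + rest)
    return current, links


def suggestion_for(path, uri_keys, preferred_key):
    """Link to the header-preferred version if the URL shows another language."""
    current, links = alternate_language_links(path, uri_keys)
    if current is not None and preferred_key != current:
        return links.get(preferred_key)
    return None
```

The page keeps serving the language from the URL; the returned link is only offered to the visitor, for example as a "View this page in Finnish" notice.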
josh
DevNet Master
Posts: 4872
Joined: Wed Feb 11, 2004 3:23 pm
Location: Palm beach, Florida

Re: Language/locale indicators

Post by josh »

The best thing for SEO is sub-domains for each version of the site. (Source: Google on canonical URLs.)
alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg

Re: Language/locale indicators

Post by alex.barylski »

josh wrote: The best thing for SEO is sub-domains for each version of the site.
fr.domain.com

Would not be acceptable for 95% of the cases I work with. Sub-domains are almost always used for sub-systems, such as blogs, forums, etc. Persisting the language in the sub-domain would introduce all sorts of issues, hence the reason most corporations use the TLD to set the languages/locales.

What is interesting about this problem is that if you look at dell.ca, they default to English but offer the option of switching to French. This happens under the same domain, dell.ca, so obviously they implement some kind of multi-language functionality.

Cheers,
Alex
josh
DevNet Master
Posts: 4872
Joined: Wed Feb 11, 2004 3:23 pm
Location: Palm beach, Florida

Re: Language/locale indicators

Post by josh »

You could have

fr.blog.example.com
en.blog.example.com

The TLDs aren't available in every case, and Google treats them as equally distinct in each case (there's no difference between a subdomain and a TLD in Google's eyes, AFAIK).
Take for example this page, which ranks well for its keywords:
alumni.nutrition.tufts.edu/?pid=66&c=118
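For hosts shaped like fr.blog.example.com, picking the language off the front could be sketched as below. The function name is made up, and the host is assumed to carry no port. The leftmost label only counts as a language when it is in the supported set, so ordinary sub-systems such as blog.example.com are left alone.

```python
def split_language_subdomain(host, supported):
    """Return (language, remaining_host); language is None if not present."""
    labels = host.strip().lower().rstrip(".").split(".")
    # Require something left over that still looks like a domain name.
    if len(labels) > 2 and labels[0] in supported:
        return labels[0], ".".join(labels[1:])
    return None, ".".join(labels)
```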