
Multiple Languages

Posted: Fri Jul 21, 2006 2:35 am
by shiznatix
If this has been discussed before, beat me with things.

What is the best way to have a website with multiple languages? For example, I want to be able to switch the site between English, German, Estonian and whatever else. What I have done in the past is make an array and put in each language's translation, like this:

Code: Select all

$words = array(
  'TITLE' => array(
    'en' => 'title',
    'ee' => 'esttiel',
    'de' => 'germantitle'
  ),

  'MAIN BODY' => array(
    'en' => 'this is a lot of text that in no way could be properly translated word by word',
    'ee' => 'repeat that but in estonian!',
    'de' => 'repeat that in german this time around!'
  ),
);

//usage
echo $words['MAIN BODY'][$_SESSION['lang']];
but for some reason this seems silly. the array gets to be thousands of lines long really fast, you have to load the entire thing into memory on every page load, and you can easily end up defining the same word twice if you forget about it down the road, which just takes up more time and space.

what is the best way to do this?

Posted: Fri Jul 21, 2006 2:38 am
by Benjamin
osCommerce has a language file for every page. All the words and phrases are defined.

e.g.:

Code: Select all

define('NAVBAR_TITLE', 'My Account');
define('HEADING_TITLE', 'My Account Information');

define('OVERVIEW_TITLE', 'Overview');
define('OVERVIEW_SHOW_ALL_ORDERS', '(show all orders)');
define('OVERVIEW_PREVIOUS_ORDERS', 'Previous Orders');

define('MY_ACCOUNT_TITLE', 'My Account');
define('MY_ACCOUNT_INFORMATION', 'View or change my account information.');
define('MY_ACCOUNT_ADDRESS_BOOK', 'View or change entries in my address book.');
define('MY_ACCOUNT_PASSWORD', 'Change my account password.');

define('MY_ORDERS_TITLE', 'My Orders');
define('MY_ORDERS_VIEW', 'View the orders I have made.');

define('EMAIL_NOTIFICATIONS_TITLE', 'E-Mail Notifications');
define('EMAIL_NOTIFICATIONS_NEWSLETTERS', 'Subscribe or unsubscribe from newsletters.');
define('EMAIL_NOTIFICATIONS_PRODUCTS', 'View or change my product notification list.');
I'm not sure that is the best way to do it though..

Posted: Fri Jul 21, 2006 2:49 am
by shiznatix
that would get messy and difficult to maintain. also if the page has dynamic content then it would be even crazier.

Posted: Fri Jul 21, 2006 2:54 am
by Benjamin
Yeah, in osCommerce all the standard text is in language files. Product descriptions, titles and dynamic text are stored in the database, with a different field for each language. I forgot about that because I always disable it.

Posted: Fri Jul 21, 2006 3:50 am
by GM
Would it be too much of a performance hit to store all your literals in a database table with language as a key? That way you can check (maybe in $_SESSION) for the language, and pull the relevant content from the database:

Example:

Code: Select all

ID_LITERAL  - LANG - DE_LITERAL
TITLE       - EN   - "Welcome to the site"
TITLE       - IT   - "Benvenuto al sito!"
TITLE       - DE   - "Willkommen ... "
MENU_ITEM_1 - EN   - "Profile"
MENU_ITEM_1 - IT   - "Profilo"
MENU_ITEM_1 - DE   - "german for profile"
etc.

This is how this kind of thing is stored in the Oracle database I manage at work, but that is used only in LAN situations, so I'm not sure how performance would be over t'Internet.

Posted: Fri Jul 21, 2006 4:49 am
by shiznatix
i thought about that but it would be difficult to first collect all the phrases that will be used on a page and build the query from them. this seems to be the biggest problem.

if i went 1 by 1 getting the word it would be horrid as i might run 100 queries on 1 page call. also, adding more languages would be a pain having to run all those insert queries.

but this might not be a bad idea if done correctly. maybe...

Posted: Fri Jul 21, 2006 5:30 am
by GM
shiznatix wrote: if i went 1 by 1 getting the word it would be horrid as i might run 100 queries on 1 page call. also, adding more languages would be a pain having to run all those insert queries.
That's exactly how mine does it (not written by me, I hasten to add). There is a function "getLiteral(ID_LITERAL, LANG)" that basically does a select from the database to extract one text item at a time. I can't say I've noticed any performance problems (selecting by primary key is the quickest kind of SELECT that there is), but as I said earlier, it operates exclusively in a LAN environment.

Adding languages is going to be a pain in whatever setup you use. You've got to store them somehow. I manage all my language data in an Excel spreadsheet, and create the INSERT queries dynamically there. Then it's just a case of copy/paste into the MySQL prompt.

One solution might be to know which Literals you are going to need on a page and load them all in one select statement - SELECT DE_LITERAL FROM LITERALS WHERE ID_LITERAL IN ('...', '...', '...', ...); - you could even store which literals were needed for each page in a separate database table.
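A minimal sketch of the "one select per page" idea above, assuming the LITERALS table from the earlier example. The helper name buildLiteralQuery() is made up for illustration, and it only builds the query text and parameter list; it does not talk to a database.

```php
<?php
// Hypothetical helper: builds one parameterized SELECT for all literals a page needs.
// Table and column names follow the example table earlier in the thread.
function buildLiteralQuery(array $ids, $lang)
{
    if (count($ids) === 0) {
        throw new InvalidArgumentException('At least one literal ID is required.');
    }
    // One "?" placeholder per literal ID, e.g. "?, ?, ?" for three IDs.
    $placeholders = implode(', ', array_fill(0, count($ids), '?'));
    $sql = "SELECT ID_LITERAL, DE_LITERAL FROM LITERALS"
         . " WHERE LANG = ? AND ID_LITERAL IN ($placeholders)";
    // Parameters in placeholder order: language first, then the IDs.
    $params = array_merge([$lang], array_values($ids));
    return [$sql, $params];
}

list($sql, $params) = buildLiteralQuery(['TITLE', 'MENU_ITEM_1'], 'DE');
echo $sql, "\n";
```

The resulting pair could then be handed to something like PDO::prepare() and PDOStatement::execute(), so the whole page costs one round trip instead of one query per literal.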

With suitable naming conventions for dynamic content (ie: description for the chosen product could be stored in the database as "DESC_[product_id]") you could allow for a certain amount of flexibility on your pages.

I'm not saying this is the best way to do things, I'm just saying that it's the way I've (been forced) to use, and it doesn't seem to be too bad.

Posted: Fri Jul 21, 2006 7:35 am
by fastfingertips
A better approach is to build a language management tool on top of a database, with a translation table and so on.

Posted: Fri Jul 21, 2006 8:19 pm
by Ambush Commander
I recommend looking at the MediaWiki codebase, and also consulting with the developers on how they plan on revamping their language file system (something about extract(unserialize()) ). They pay attention to performance and have plenty of real-world multi-language deployment experience.

Tim Starling proposed this as a template for a MessagesXx.php file:

Code: Select all

<?php
$fallback = 'en';
$rtl = false;
$timeBeforeDate = true;
$timeSeparator = ':';
$timeDateSeparator = ', ';
$digitTransformTable = null;
$separatorTransformTable = null;

$namespaceNames = array(
	NS_MEDIA            => 'Media',
	...
);

$quickbarSettings = array( ... );

$skinNames = array( ... );

$mathNames = array( ... );

$dateFormats = array(
	MW_DATE_DEFAULT => 'No preference',
	...
);

$bookstoreList = array( ... );

$weekdayNames = array( ... );

$monthNames = array( ... );
$monthNamesGen = array( ... );

$monthAbbreviations = array( ... );

$magicWords = array(
#   ID                                 CASE  SYNONYMS
	'redirect'               => array( 0,    '#REDIRECT'              ),
	...
);

$messages = array(
	...
);
?>
However, the process you described is essentially the correct one. You will end up loading the entire message cache into memory unless you can compartmentalize messages into modules, and make sure that each request neatly maps into one or two (or a few) modules. Usually, it's not a big problem though: I've taken a look at some very comprehensive localization files and they're only ~80 KB, which is not bad at all.

It is feasible to put messages in the database for maximum customization, but YOU MUST have aggressive caching mechanisms in place in order to effectively load the entire message cache into memory (the alternative is ripple loading, which will absolutely cripple an application). unserialize() is an extremely effective and fast method of doing this.
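A rough sketch of that caching idea, under some assumptions: loadMessages(), the cache file naming and the callback are invented for illustration, and the language code is assumed to be validated before it reaches the file name. The point is that the slow source (for example the database) is read once, and later requests do a single file read plus unserialize().

```php
<?php
// Hypothetical cache loader: returns the message array for one language,
// rebuilding the serialized cache file only when it is missing or unreadable.
// Assumes $lang was validated elsewhere, since it becomes part of a file name.
function loadMessages($lang, $cacheDir, callable $loadFromSource)
{
    $cacheFile = $cacheDir . '/messages_' . $lang . '.ser';
    if (is_file($cacheFile)) {
        $messages = unserialize(file_get_contents($cacheFile), ['allowed_classes' => false]);
        if (is_array($messages)) {
            return $messages; // cache hit: one file read, no per-message lookups
        }
    }
    // Cache miss: do the slow load once, then store the whole array.
    $messages = $loadFromSource($lang);
    file_put_contents($cacheFile, serialize($messages), LOCK_EX);
    return $messages;
}

// Demo: the closure stands in for a slow database read and counts its calls.
$slowLoads = 0;
$source = function ($lang) use (&$slowLoads) {
    $slowLoads++;
    return ['TITLE' => "title ($lang)", 'MAIN BODY' => "body ($lang)"];
};
$cacheDir = sys_get_temp_dir() . '/msgcache_' . uniqid();
mkdir($cacheDir);
$first  = loadMessages('en', $cacheDir, $source);
$second = loadMessages('en', $cacheDir, $source);
echo $second['TITLE'], "\n";
```

In this sketch the cache never expires, so whatever edits the translations would also have to delete the matching cache file.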

You may want to give in and define a global function like wfMsg() which localizes the message automatically; just pass a key. Instead of your convoluted statement, just do:

Code: Select all

echo wfMsg('MAIN BODY');
But as for me, well, I've never had to do something this flexible. I'm just reporting what seems to me to be a solid implementation of it.
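To make the idea of a global lookup function concrete, here is a small sketch. It is not MediaWiki's wfMsg(); the name msg(), the $GLOBALS layout and the bracketed marker for unknown keys are all assumptions made for the example.

```php
<?php
// Hypothetical stand-in for a wfMsg()-style helper; not MediaWiki's implementation.
// $GLOBALS['messages'] holds one array per language, $GLOBALS['lang'] the active language.
$GLOBALS['messages'] = [
    'en' => ['TITLE' => 'title', 'MAIN BODY' => 'main body text'],
    'de' => ['TITLE' => 'germantitle'],
];
$GLOBALS['lang'] = 'de';

function msg($key, $fallback = 'en')
{
    $messages = $GLOBALS['messages'];
    $lang = $GLOBALS['lang'];
    if (isset($messages[$lang][$key])) {
        return $messages[$lang][$key];
    }
    // Not translated yet: try the fallback language before giving up.
    if (isset($messages[$fallback][$key])) {
        return $messages[$fallback][$key];
    }
    // Unknown key: return a visible marker so missing translations are easy to spot.
    return '[' . $key . ']';
}

echo msg('TITLE'), "\n";     // found in the active language
echo msg('MAIN BODY'), "\n"; // missing in 'de', falls back to 'en'
```

Keeping the lookup behind one function also means the storage (arrays, files, database plus cache) can change later without touching every echo in the templates.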

Posted: Fri Jul 21, 2006 8:27 pm
by Christopher
I tend to solve this on the template level rather than the string level. So I have folders named after each supported language code in the templates directory and just change the path.
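One way this could look, as a sketch only: the templates/ layout, the templatePath() name and the list of supported codes are assumptions, not something from the post above.

```php
<?php
// Hypothetical helper: maps a language code to a template path such as
// templates/de/home.php, falling back to a default language for unknown codes.
function templatePath($name, $lang, array $supported = ['en', 'ee', 'de'], $default = 'en')
{
    // Only whitelisted codes ever become part of the filesystem path.
    if (!in_array($lang, $supported, true)) {
        $lang = $default;
    }
    // basename() strips any directory parts from the template name.
    return 'templates/' . $lang . '/' . basename($name) . '.php';
}

echo templatePath('home', 'de'), "\n";
echo templatePath('home', 'xx'), "\n";
```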

Posted: Fri Jul 21, 2006 10:26 pm
by timvw
http://be2.php.net/gettext is what i recommend...

Posted: Sat Jul 22, 2006 5:56 pm
by Hesus
I think the best way would be to use MMCache's shared memory functions to load an array of translated words into RAM. I think this will give the best RAM usage and performance.

Posted: Sat Jul 22, 2006 7:38 pm
by Jenk
table 'articles':

id (key), title, content, language

Code: Select all

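// $articleId and $lang must be escaped (e.g. with mysql_real_escape_string()) before interpolation
mysql_query("SELECT * FROM `articles` WHERE `id` = '{$articleId}' AND `language` = '{$lang}'");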

Posted: Wed Jul 26, 2006 2:02 am
by Popcorn
Hi,

Like others mentioned, I think that storing Product descriptions, titles, etc. in different langs in the db is a good idea. See this http://www.openmymind.net/localization/index2.html for a schema suggestion. It should be possible to make a pretty nice interface to edit all the different languages in the db.

As for the pages themselves, I think it's a toss up between separate language files for each page or also putting it in the db. If you put it in the db you can still get translators to work with text files, a little db exporting and scripting to get the text back in the db shouldn't be too hard.

I am going with the db choice myself. I kind of like it all in one place. Bert Hooyman's comment in http://www.jugglingdb.com/compendium/ge ... sites.html is pretty much the same as my idea - put a list of all pages in one table, text in different langs in another, then link the two thru an intermediate table. This way different pages can use the same text phrases and you can grab all text required for a page in one statement - something like Bert's:

Code: Select all

SELECT page_phrases.key, phrases.text FROM page_phrases, phrases WHERE
page_phrases.pageID = theCurrentPage AND page_phrases.phraseID = phrases.phraseID AND phrases.languageID = theCurrentLanguage;
I already have a list of pages in the db for authorization purposes so it's a little easier to implement. However, instead of implementing return codes from functions and includes, I have several that output loads of text themselves :oops: So I'd also have any function or include that needs it query the db for its own localized text.

I defer to Ambush Commander's and anyone else's comments on caching - I know little about implementing any.

Posted: Sun Jul 30, 2006 12:17 am
by bg
There is a very easy and simple way to do this. Have all literals stored as constants in its own file. One file for each language. The application language is determined by which file is included.
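A small sketch of how that could be wired up, assuming one file per language named after a two-letter code (en.php, de.php, ...). The languageFile() helper and the temporary demo directory are invented for the example; the constant name is borrowed from the osCommerce snippet earlier in the thread.

```php
<?php
// Hypothetical resolver: picks <dir>/<code>.php; each file define()s the same constants.
function languageFile($lang, $dir, $default = 'en')
{
    // Accept only two-letter codes that have a file; anything else gets the default.
    if (!preg_match('/^[a-z]{2}$/', $lang) || !is_file("$dir/$lang.php")) {
        $lang = $default;
    }
    return "$dir/$lang.php";
}

// Demo setup: write two tiny language files to a temporary directory.
$dir = sys_get_temp_dir() . '/lang_' . uniqid();
mkdir($dir);
file_put_contents("$dir/en.php", "<?php define('NAVBAR_TITLE', 'My Account');\n");
file_put_contents("$dir/de.php", "<?php define('NAVBAR_TITLE', 'Mein Konto');\n");

$requested = 'de'; // would normally come from the session or the request
require languageFile($requested, $dir);
echo NAVBAR_TITLE, "\n";
```

Validating the code before building the path matters here, because the value usually comes from user input and ends up in a require.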