MySQL vs PHP flat file for this specific situation

Posted: Mon Aug 07, 2006 2:18 pm
by nutkenz
I'm designing a multi-language website and handling the language switching like this:

Website:

Code:

$lang->translate("Your first name")

Function:

Code:

function translate($from)
{
	if (($this->lang != $this->defLang) && array_key_exists($from, $this->langTable))
	{
		return $this->langTable[$from];
	}
	else // no translation required or possible
	{
		return $from;
	}
}

Language file:

Code:

<?php

$langTable["Your first name"] = "Uw voornaam";
$langTable["Your last name"] = "Uw achternaam";

?>

I was wondering whether MySQL would be slower in either of two scenarios: fetching all the sentences for a specific page and language in one query, or fetching each individual translation as I need it. Does anyone have experience in this area?

Posted: Mon Aug 07, 2006 2:27 pm
by feyd
I store the language data in separate files to reduce the query and memory requirements on the database side. The individual strings can, however, be loaded back and forth between the files and the database to make them easier for various people to manage. The system itself later dumps all the strings back to the file for faster access.

Posted: Mon Aug 07, 2006 2:42 pm
by nutkenz
feyd wrote:I store the language data in separate files to reduce the query and memory requirements on the database side. The individual strings can, however, be loaded back and forth between the files and the database to make them easier for various people to manage. The system itself later dumps all the strings back to the file for faster access.
What do you mean by files exactly? Wouldn't they be tables if you're referring to a DB or am I missing the point?

In case I missed it: where are your translation strings stored, and how exactly are they fetched?

Posted: Mon Aug 07, 2006 3:25 pm
by feyd
Technically, my translations are kept in both a database and PHP files. The database is exported to the PHP files when changes are made.
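
A minimal sketch of such an export step, for illustration only. The helper name exportLangTable and the file layout are assumptions, not feyd's actual code; the file here returns the array instead of assigning $langTable line by line, and the rows are hard-coded where a real system would read them from the translations table.

```php
<?php
// Hypothetical helper: write a translation table to a PHP file that
// returns the array, so loading it is a plain include.
function exportLangTable(array $langTable, string $path): void
{
    $code = "<?php\nreturn " . var_export($langTable, true) . ";\n";

    // Write to a temporary file first, then rename it into place, so a
    // concurrent request never includes a half-written file.
    $tmp = $path . '.' . uniqid('', true) . '.tmp';
    if (file_put_contents($tmp, $code) === false) {
        throw new RuntimeException("Could not write $tmp");
    }
    if (!rename($tmp, $path)) {
        unlink($tmp);
        throw new RuntimeException("Could not move $tmp to $path");
    }
}

// Stand-in data; in practice these rows would come from the database.
$rows = [
    'Your first name' => 'Uw voornaam',
    'Your last name'  => 'Uw achternaam',
];

$file = sys_get_temp_dir() . '/nl_' . uniqid() . '.php';
exportLangTable($rows, $file);

// Loading the exported file back is a single include.
$langTable = include $file;
echo $langTable['Your first name'], "\n"; // prints "Uw voornaam"

unlink($file); // clean up the demo file
```

The export would run only when a translation is edited, so page requests never touch the database for language strings.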

Posted: Mon Aug 07, 2006 3:33 pm
by nutkenz
feyd wrote:Technically, my translations are kept in both a database and PHP files. The database is exported to the PHP files when changes are made.
Okay, that's also an interesting solution.

Do you think a flat PHP file is faster than doing one query when a page is called to get all language sentences for that selected language?

Does your script create a file similar to this one:

$langTable["Your first name"] = "Uw voornaam";
$langTable["Your last name"] = "Uw achternaam";

Or did you put more thought into it than I did and come up with another structure? :)

Posted: Mon Aug 07, 2006 3:40 pm
by feyd
nutkenz wrote:Do you think a flat PHP file is faster than doing one query when a page is called to get all language sentences for that selected language?
It can depend on a great many things, including the operating system, file system and database settings. Overall, I've found that a file will be faster provided it is in a format easily processed by PHP, i.e. a script. It can depend on implementation and memory availability too.
nutkenz wrote:Does your script create a file similar to this one:

$langTable["Your first name"] = "Uw voornaam";
$langTable["Your last name"] = "Uw achternaam";

Or did you put more thought into it than I did and come up with another structure? :)
Similar, yes.

Posted: Mon Aug 07, 2006 3:53 pm
by nutkenz
Ok, thank you very much. Time to get coding again :)

Posted: Mon Aug 07, 2006 4:12 pm
by nutkenz
On a related note: it looks like I'll need to use this $lang class in a large part of my existing functions, because some return HTML, which would make it difficult to find the strings to be translated, and some others echo directly to the browser (XAJAX).

How have you implemented this? I'm currently torn between these options:

- Making the class (referenced by $lang) global, not by including it in every function but by putting it in $GLOBALS
- Using a singleton (just read about this by accident - but it seems like this might be a little bit 'better' than simply globalizing it)
- Thinking longer and/or harder about another solution :)
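
For comparison, the singleton option could look roughly like the sketch below. The class name Lang and the init()/getInstance() pair are assumptions made for illustration; only translate() follows the function posted earlier in the thread.

```php
<?php
final class Lang
{
    /** @var Lang|null the single shared instance */
    private static $instance = null;

    private $lang;
    private $defLang;
    private $langTable;

    // Private constructor: the only way to get an object is through init().
    private function __construct(string $lang, string $defLang, array $langTable)
    {
        $this->lang = $lang;
        $this->defLang = $defLang;
        $this->langTable = $langTable;
    }

    // Call once during bootstrap, after the language data has been loaded.
    public static function init(string $lang, string $defLang, array $langTable): Lang
    {
        self::$instance = new self($lang, $defLang, $langTable);
        return self::$instance;
    }

    // Call from any function that needs translations.
    public static function getInstance(): Lang
    {
        if (self::$instance === null) {
            throw new LogicException('Lang::init() has not been called yet');
        }
        return self::$instance;
    }

    public function translate(string $from): string
    {
        if ($this->lang !== $this->defLang && array_key_exists($from, $this->langTable)) {
            return $this->langTable[$from];
        }
        return $from; // no translation required or possible
    }
}

Lang::init('nl', 'en', ['Your first name' => 'Uw voornaam']);

// A function that returns HTML no longer needs $lang passed in or declared global.
function firstNameLabel(): string
{
    return '<label>' . Lang::getInstance()->translate('Your first name') . '</label>';
}

echo firstNameLabel(), "\n"; // prints "<label>Uw voornaam</label>"
```

Compared with putting the object in $GLOBALS, this makes the dependency visible at the call site and fails loudly if the translator was never set up, but it is still shared global state.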

Posted: Mon Aug 07, 2006 4:27 pm
by Chris Corbyn
feyd wrote:It can depend on a great many things, including the operating system, file system and database settings. Overall, I've found that a file will be faster provided it is in a format easily processed by PHP, i.e. a script. It can depend on implementation and memory availability too.
Someone was explaining the B-Tree to me the other day and I've decided I'm going to *try* and write a fast generic flat-file library for PHP which supports SQL based querying. That'll be a slow process though. It'd be handy because you could pretty much move the code and the database (flat files) between servers without worrying about the PHP environment.

Posted: Mon Aug 07, 2006 7:14 pm
by Ollie Saunders
Aren't there any extensions for translation?

If you are going to write all your translations like this:

Code:

$langTable["Your first name"] = "Uw voornaam";
$langTable["Your last name"] = "Uw achternaam";

Aren't you going to end up with one massive array and a long file? How much are you hoping to translate?

Also, can I remind everyone that, yes, there is an overhead in moving data from a database to PHP, but if you want to store data in any reasonable or variable quantity, a database is usually better. Why?
  • A large PHP file has to be parsed every time it is executed.
  • PHP arrays have to be indexed every time they are initialized.
  • If even the slightest change is made to a PHP file, the entire file has to be read from the hard disk again the next time it is required.
  • You can't write to a PHP file while it is in use, which means that if you use PHP itself as a data storage system that you write to, you get the multi-user capability of a dead horse.
  • Databases keep a lot of the recently requested data in RAM, so they rarely read from the disk.
  • Databases maintain indexes on their data; they never have to be reindexed, because new entries are added in a way that maintains the existing index.
  • Databases are designed, by nature, to be fast at retrieving data; that is 50% of the reason we use them.

Posted: Tue Aug 08, 2006 4:32 am
by nutkenz
ole wrote:Aren't there any extensions for translation?
No, the extension is the file itself (nl.php or en.php), I only load the ones I need.
ole wrote:If you are going to write all your translations like this:

Code:

$langTable["Your first name"] = "Uw voornaam";
$langTable["Your last name"] = "Uw achternaam";

Aren't you going to end up with one massive array and a long file? How much are you hoping to translate?
The site won't become that massive, I don't think I'll end up having more than 500 phrases.
ole wrote:Also, can I remind everyone that, yes, there is an overhead in moving data from a database to PHP, but if you want to store data in any reasonable or variable quantity, a database is usually better. Why?
  • A large PHP file has to be parsed every time it is executed.
Isn't the file cached though? I can't imagine that having a huge effect on the load time as my previous websites include a bunch of other PHP scripts on every page.
ole wrote:PHP arrays have to be indexed every time they are initialized.
Would a singleton resolve this?
ole wrote:If even the slightest change is made to a PHP file, the entire file has to be read from the hard disk again the next time it is required.
My file wouldn't change that often though. Probably once every couple of months once the site is finished.
ole wrote:You can't write to a PHP file while it is in use, which means that if you use PHP itself as a data storage system that you write to, you get the multi-user capability of a dead horse.
No problem for me.
ole wrote:Databases keep a lot of the recently requested data in RAM, so they rarely read from the disk.
But the PHP file is also in RAM without the overhead of a query.
ole wrote:Databases maintain indexes on their data; they never have to be reindexed, because new entries are added in a way that maintains the existing index.
Yes, but I need about 20-30 translated strings per page, so it might not be a good idea to request each of them separately. 30 queries per page per visitor seems like it would cause more load than reading the translation file for all strings in that language and using the ones I need. I could be wrong though, that's why I'm posting here...
ole wrote:Databases are designed, by nature, to be fast at retrieving data; that is 50% of the reason we use them.
That's a good point, the question is which would be faster in this scenario though. :)

I'm still not sure which one I'll be using. I'm leaning more towards a database now because it'd be easier to add new phrases: if they don't exist yet when a page is loaded in a certain language, a new (empty) entry is created, so I wouldn't have to copy the English index to all of the language files separately.
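
One way the "create an empty entry for unknown phrases" idea could be sketched, independent of where the strings are stored. The class name RecordingLang and the method missingPhrases() are hypothetical; the sketch only collects the unknown phrases during the request and leaves the actual database insert to the caller.

```php
<?php
// Hypothetical variant of the translator that remembers unknown phrases.
class RecordingLang
{
    private $lang;
    private $defLang;
    private $langTable;
    private $missing = [];

    public function __construct(string $lang, string $defLang, array $langTable)
    {
        $this->lang = $lang;
        $this->defLang = $defLang;
        $this->langTable = $langTable;
    }

    public function translate(string $from): string
    {
        if ($this->lang === $this->defLang) {
            return $from; // default language: nothing to translate
        }
        if (!array_key_exists($from, $this->langTable)) {
            // Phrase has no entry yet: remember it once, fall back to the source text.
            $this->missing[$from] = true;
            return $from;
        }
        // An existing but still empty entry also falls back to the source text.
        $to = $this->langTable[$from];
        return $to !== '' ? $to : $from;
    }

    // Phrases seen during this request that have no entry at all.
    public function missingPhrases(): array
    {
        return array_keys($this->missing);
    }
}

$lang = new RecordingLang('nl', 'en', [
    'Your first name' => 'Uw voornaam',
    'Your last name'  => '', // entry exists but is not translated yet
]);

echo $lang->translate('Your first name'), "\n";
echo $lang->translate('Your last name'), "\n";
echo $lang->translate('Your e-mail address'), "\n";
echo $lang->translate('Your e-mail address'), "\n";

// At the end of the request, one empty row per missing phrase could be
// inserted in a single statement instead of one query per phrase.
print_r($lang->missingPhrases());
```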

Posted: Tue Aug 08, 2006 5:51 am
by Ollie Saunders
nutkenz wrote:The site won't become that massive, I don't think I'll end up having more than 500 phrases.
OK, then the performance of PHP arrays probably won't be too bad. Now you need to consider whether it is easier or harder to keep them in PHP. Remember that you can't add or change them without modifying a PHP file, whereas with a database you could create a front end. Unless you are planning to create a front end that modifies the PHP file, which could be interesting.
nutkenz wrote:Isn't the file cached though? I can't imagine that having a huge effect on the load time as my previous websites include a bunch of other PHP scripts on every page.
Apache might cache the physical data of the file, but PHP doesn't (not until PHP 6 anyway) keep the tokenized version of the file. This means that if you have 10,000 elements in a PHP array they have to be indexed every time, but in a database they do not. The more data you keep, the better a database will handle it compared with PHP.
nutkenz wrote:Would a singleton resolve this?
No, it's how the language works: when you create an array element or variable, PHP has to store information about where that variable or element can be found, i.e. indexing.
nutkenz wrote:My file wouldn't change that often though. Probably once every couple of months once the site is finished.
OK, but I'm thinking about the larger picture here: is it OK to use PHP as a method of data storage? For that you have to consider the fact that you can't write to a PHP file while it's in use, and that there is the performance overhead of parsing and indexing.
nutkenz wrote:But the PHP file is also in RAM without the overhead of a query.
The file may be kept in RAM, but PHP has no optimization for returning frequently used data faster, unlike a database.
nutkenz wrote:Yes, but I need about 20-30 translated strings per page, so it might not be a good idea to request each of them separately. 30 queries per page per visitor seems like it would cause more load than reading the translation file for all strings in that language and using the ones I need. I could be wrong though, that's why I'm posting here...
You have to try it.
nutkenz wrote:That's a good point, the question is which would be faster in this scenario though.
Like I said, try it. But even if a PHP array were faster I would still use a database, because your PHP array isn't going to be faster for long if more data is added or if you want to update the data.
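
"Trying it" could start from a small timing harness like the sketch below. The helper timeIt and the generated 500-phrase file are assumptions for illustration; only the include side is measured here, and the numbers will vary with the file system and with whether a bytecode cache is active.

```php
<?php
// Hypothetical helper: average wall-clock seconds per call of $fn.
function timeIt(callable $fn, int $runs = 1000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return (microtime(true) - $start) / $runs;
}

// Build a 500-phrase language file, roughly the size mentioned in the thread.
$table = [];
for ($i = 0; $i < 500; $i++) {
    $table["Phrase number $i"] = "Zin nummer $i";
}
$file = sys_get_temp_dir() . '/bench_nl_' . uniqid() . '.php';
file_put_contents($file, "<?php\nreturn " . var_export($table, true) . ";\n");

// Measure loading the whole table from the flat file.
$perInclude = timeIt(function () use ($file) {
    $langTable = include $file;
});
printf("include: %.6f s per load\n", $perInclude);

// To compare, pass timeIt() a second closure that runs the one-query-per-page
// SELECT against the real database and builds the same array.

unlink($file); // clean up the demo file
```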

To summarize: it's data, so you should probably use a database.
Question: is there any other reason you are avoiding a database? Is performance on a small scale really that important to you? Surely performance only becomes an issue as things get larger.

Posted: Tue Aug 08, 2006 6:45 am
by s.dot
Doing it flat file wouldn't be faster, unless you're doing a simple 'include'. However, I think saving on SQL queries and server resources would be more important than a few milliseconds.

Posted: Tue Aug 08, 2006 7:50 am
by nutkenz
Okay, great. A database it is then :) Thanks for your opinions.

Posted: Tue Aug 08, 2006 8:42 am
by feyd
ole wrote:Apache might cache the physical data of the file, but PHP doesn't (not until PHP 6 anyway) keep the tokenized version of the file. This means that if you have 10,000 elements in a PHP array they have to be indexed every time, but in a database they do not. The more data you keep, the better a database will handle it compared with PHP.
*cough* APC among others.
ole wrote:OK, but I'm thinking about the larger picture here: is it OK to use PHP as a method of data storage? For that you have to consider the fact that you can't write to a PHP file while it's in use, and that there is the performance overhead of parsing and indexing.
It's not that bad of a hit normally. The structure of these files will be very simple, i.e. easy for PHP to parse.
ole wrote:The file may be kept in RAM, but PHP has no optimization for returning frequently used data faster, unlike a database.
False. There's shared memory and all the accelerators and bytecode caching systems out there.
ole wrote:is there any other reason you are avoiding a database? Is performance on a small scale really that important to you? Surely performance only becomes an issue as things get larger.
I would agree that the performance difference between the two is fairly negligible at this scale (provided nutkenz has the right database settings). Performance depends highly on the individual implementation chosen, both for the flat files and for the database. For instance, breaking the translation data up into multiple files can hurt or help, depending on various things. Equally, the table structure used can hurt or help. In the best of all worlds, at this scale, they will have similar enough performance that either is fine.