Huge database, server side cache, store > 180 000 files?

PHP programming forum. Ask questions or help people concerning PHP code. Don't understand a function? Need help implementing a class? Don't understand a class? Here is where to ask. Remember to do your homework!

Moderator: General Moderators

Post Reply
Symthic
Forum Newbie
Posts: 2
Joined: Fri Apr 20, 2012 6:23 pm

Huge database, server side cache, store > 180 000 files?

Post by Symthic »

Heya.
I have a question about a few things.
I have a static MySQL database with around 60 rows and 340 columns -> 20,400 values.
I need to get ALL of these values, make a few advanced calculations, and print them to a table.
This of course causes performance issues, since I'm getting over 300 page views per minute (5/s) during rush hours.

What's the best way around this?
I did implement a server-side cache. It stores the pages as plain HTML. If no file is found for the current settings (sorting and a few others), it writes a new file; if one is found, it includes that file.
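For reference, a minimal sketch of that kind of file cache (all names here are hypothetical; `render_table()` stands in for the real page-building code):

```php
<?php
// Minimal file-based page cache sketch. One cache file per settings
// combination, keyed by a hash of the settings array.
function cached_page(array $settings, callable $render_table): string
{
    $cacheDir = sys_get_temp_dir() . '/page_cache';
    if (!is_dir($cacheDir)) {
        mkdir($cacheDir, 0777, true);
    }
    $file = $cacheDir . '/' . md5(serialize($settings)) . '.html';

    if (is_file($file)) {
        return file_get_contents($file);   // cache hit: no PHP work
    }
    $html = $render_table($settings);      // cache miss: build the page
    file_put_contents($file, $html);
    return $html;
}
```

Usage would be something like `echo cached_page(['sort' => 'name'], $render);` where `$render` is whatever currently builds the table.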

This reduced the PHP work time by 95-99%.

BUT there's a problem with this method: I have around 180,000 different setting combinations for the table.
This means that, over time, the cache would contain 180,000 files, around 18 gigabytes of HTML... which doesn't sound too good.

Could someone tell me the best way to do this?
User avatar
califdon
Jack of Zircons
Posts: 4484
Joined: Thu Nov 09, 2006 8:30 pm
Location: California, USA

Re: Huge database, server side cache, store > 180 000 files?

Post by califdon »

I would be surprised if a database that small would be causing any problem, even with that much traffic. It's more likely the query you are using and/or the kind of calculations you are referring to. Show us the query and at least some idea of what kinds of calculations you are performing.
Symthic
Forum Newbie
Posts: 2
Joined: Fri Apr 20, 2012 6:23 pm

Re: Huge database, server side cache, store > 180 000 files?

Post by Symthic »

Yes, the problem is indeed in processing the values.

For almost every number, there's an if/switch to select the font color, size, column width, etc.
And on top of that, the calculations: i.e. for every line (60 total) you have to output, 40 times, different variations of the following, depending on the user's settings: x / y + 60 / z * ( ceil ( 100 / v ) - 1 )
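As a sketch, assuming x, y, z, and v are the per-cell inputs taken from the row and the user's settings, one such calculation would look like:

```php
<?php
// Hypothetical sketch of one per-cell calculation from above:
// x / y + 60 / z * (ceil(100 / v) - 1)
function cell_value(float $x, float $y, float $z, float $v): float
{
    return $x / $y + 60 / $z * (ceil(100 / $v) - 1);
}
```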

Making all the calculations takes around 0.2 seconds on the Athlon 255 in my home server. When I publish the site, it will be on an average VPS host, where it will take even longer.
User avatar
califdon
Jack of Zircons
Posts: 4484
Joined: Thu Nov 09, 2006 8:30 pm
Location: California, USA

Re: Huge database, server side cache, store > 180 000 files?

Post by califdon »

I have never faced such an issue, so I am only guessing here, but I think that creating that many HTML files to avoid the calculations would only make the problem worse. If it were my project, I would re-assess the value of having that much variability in a web page. Some things are just not worth doing.
User avatar
Weirdan
Moderator
Posts: 5978
Joined: Mon Nov 03, 2003 6:13 pm
Location: Odessa, Ukraine

Re: Huge database, server side cache, store > 180 000 files?

Post by Weirdan »

340 columns? What kind of data do you store there?
x_mutatis_mutandis_x
Forum Contributor
Posts: 160
Joined: Tue Apr 17, 2012 12:57 pm

Re: Huge database, server side cache, store > 180 000 files?

Post by x_mutatis_mutandis_x »

340 columns for a table is way too many. It is recommended that a table have no more than 20 columns. Re-evaluate the way you store your data and consider normalizing your table. Do that first, before you look into alternative workarounds.

Querying over rows using joins is cheaper than querying over so many columns, because you can index your row data. When you query over the columns, the database first has to fetch the column metadata and then the results.
User avatar
califdon
Jack of Zircons
Posts: 4484
Joined: Thu Nov 09, 2006 8:30 pm
Location: California, USA

Re: Huge database, server side cache, store > 180 000 files?

Post by califdon »

x_mutatis_mutandis_x wrote:340 columns for a table is way too many. It is recommended that a table have no more than 20 columns.
I disagree. There is no magic number. The one and only rule is to normalize your data and whatever number of columns results from that is the right number.

I agree with the rest of your comments.

It also seems to me that 340 columns is very likely to be excessive, but not just because of the number. There are many situations in which a properly normalized set of data will require more than 20 (or any other arbitrary number) columns.
x_mutatis_mutandis_x
Forum Contributor
Posts: 160
Joined: Tue Apr 17, 2012 12:57 pm

Re: Huge database, server side cache, store > 180 000 files?

Post by x_mutatis_mutandis_x »

califdon wrote:I disagree. There is no magic number.
Which is why I said "recommended", not "required" :) . It's more of a good practice to adopt when constructing database objects.
califdon wrote:There are many situations in which a properly normalized set of data will require more than 20 (or any other arbitrary number) columns.
If the nature of the data demands more columns, split them into another table logically, putting the frequently queried columns in one and the less frequently queried ones in the other, with a one-to-one mapping. It is relatively easier to maintain a table with fewer columns.
User avatar
califdon
Jack of Zircons
Posts: 4484
Joined: Thu Nov 09, 2006 8:30 pm
Location: California, USA

Re: Huge database, server side cache, store > 180 000 files?

Post by califdon »

x_mutatis_mutandis_x wrote:
califdon wrote:There are many situations in which a properly normalized set of data will require more than 20 (or any other arbitrary number) columns.
If the nature of the data demands more columns, split them into another table logically, putting the frequently queried columns in one and the less frequently queried ones in the other, with a one-to-one mapping. It is relatively easier to maintain a table with fewer columns.
I would be interested in knowing what that advice is based upon. My experience is pretty deep, including teaching and commercial development, and I have not found that to be true. At any rate, it illustrates that there is still much of relational database schema design that is more art than science.
x_mutatis_mutandis_x
Forum Contributor
Posts: 160
Joined: Tue Apr 17, 2012 12:57 pm

Re: Huge database, server side cache, store > 180 000 files?

Post by x_mutatis_mutandis_x »

califdon wrote:
x_mutatis_mutandis_x wrote:
califdon wrote:There are many situations in which a properly normalized set of data will require more than 20 (or any other arbitrary number) columns.
If the nature of the data demands more columns, split them into another table logically, putting the frequently queried columns in one and the less frequently queried ones in the other, with a one-to-one mapping. It is relatively easier to maintain a table with fewer columns.
I would be interested in knowing what that advice is based upon. My experience is pretty deep, including teaching and commercial development, and I have not found that to be true. At any rate, it illustrates that there is still much of relational database schema design that is more art than science.
It wasn't my intention to challenge your experience. It simply provides a cleaner design and easier maintainability. For example: if you have a different archival strategy for one set of columns in a table than for the others, it makes your life easier to have them in a different table; or when most of the columns contain a null/default value most of the time.

The place you are most likely to see a table with a huge number of columns is when there are columns like 'data1', 'data2', ... in which case it's better to transpose them into rows like key, data_type, data instead of key, data1, data2, ... so that when you have a new data3 you do not have to create another column.
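A sketch of that transposition (table and column names are made up for illustration, and SQLite is used here just so the example is self-contained):

```php
<?php
// Hypothetical sketch of the row-per-value ("key, data_type, data")
// layout described above, using an in-memory SQLite table for brevity.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Instead of: stats(key, data1, data2, ...), store one row per value:
$db->exec('CREATE TABLE stats (key INTEGER, data_type TEXT, data REAL)');

$ins = $db->prepare('INSERT INTO stats (key, data_type, data) VALUES (?, ?, ?)');
$ins->execute([1, 'damage', 25]);
$ins->execute([1, 'rpm',    600]);
$ins->execute([1, 'recoil', 0.4]);   // a new stat needs no ALTER TABLE

$row = $db->query("SELECT data FROM stats WHERE key = 1 AND data_type = 'rpm'")
          ->fetchColumn();
```

The trade-off is that fetching one logical "record" now takes either several rows or a pivot in the application, which is why this layout suits sparse or open-ended attributes more than fixed ones.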
User avatar
califdon
Jack of Zircons
Posts: 4484
Joined: Thu Nov 09, 2006 8:30 pm
Location: California, USA

Re: Huge database, server side cache, store > 180 000 files?

Post by califdon »

x_mutatis_mutandis_x wrote:It wasn't my intention to challenge your experience.
No offense taken.
x_mutatis_mutandis_x wrote:It simply provides a cleaner design and easier maintainability. For example: if you have a different archival strategy for one set of columns in a table than for the others, it makes your life easier to have them in a different table; or when most of the columns contain a null/default value most of the time.
Those are legitimate exceptions, but I always emphasize beginning with general rules, departing from them when special requirements suggest better arrangements. The general rule that I perceive is that a table should have as many columns as it requires, no more, no less, and that there is no recommended arbitrary "maximum" number. In the absence of specific problems that may apply in a particular case, the number of columns should not be a consideration. I still don't see a general advantage for maintainability of additional tables, while admitting that specific projects might have conditions that could provide such an advantage.
x_mutatis_mutandis_x wrote:The place you are most likely to see a table with a huge number of columns is when there are columns like 'data1', 'data2', ... in which case it's better to transpose them into rows like key, data_type, data instead of key, data1, data2, ... so that when you have a new data3 you do not have to create another column.
Again, in a specific situation, that might be true, but in general, I believe "clean design and easier maintainability" are always best served by adhering to the relationship of a table for every definable entity, which is the essence of the relational model as conceived by Dr. E. F. Codd.

In any case, we are rather hijacking this thread, which I don't want to do any further. I'm enjoying our exchange of ideas and am certainly willing to continue it in another thread, but I think I should make this my last post in this thread.

Cordially,

Don
User avatar
pickle
Briney Mod
Posts: 6445
Joined: Mon Jan 19, 2004 6:11 pm
Location: 53.01N x 112.48W
Contact:

Re: Huge database, server side cache, store > 180 000 files?

Post by pickle »

Let's assume the database structure is fine.

If the numbers change less often than they are viewed, it would make sense to perform as many of those calculations in advance as possible, so they don't need to be repeated.

Alternatively, you could just provide the raw data to the browser and let the client do the work. I'm thinking perhaps XML & XSLT (not sure if this kind of logic is possible in XSLT), or JavaScript.

Your cache is a good idea, but you should institute some sort of expiry - e.g. delete all cache files older than a day (or an hour, or whatever is best). That should alleviate any concerns. Maybe also find a way to cache only the most requested combinations - e.g. if one particular combination gets requested every 3 minutes, cache it; one that gets requested once a week - maybe not.
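A sketch of that kind of expiry check (hypothetical names), treating a cache file as valid only while it is younger than a maximum age and evicting it otherwise:

```php
<?php
// Hypothetical sketch: return the cached page only if the file is
// younger than $maxAge seconds; otherwise delete it so the caller
// regenerates (and re-caches) the page.
function fresh_cache(string $file, int $maxAge): ?string
{
    if (is_file($file) && time() - filemtime($file) < $maxAge) {
        return file_get_contents($file);   // still fresh
    }
    if (is_file($file)) {
        unlink($file);                     // expired: evict
    }
    return null;                           // caller must regenerate
}
```

With expiry on read like this, only combinations actually requested within the window occupy disk, so the rarely-used ones never pile up.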

I will also echo the other replies in saying that the calculations you're doing for each request can probably be optimized. Are you doing the math in your query or in PHP? Generally, moving that sort of logic into SQL is quicker (though not always).
Real programmers don't comment their code. If it was hard to write, it should be hard to understand.
x_mutatis_mutandis_x
Forum Contributor
Posts: 160
Joined: Tue Apr 17, 2012 12:57 pm

Re: Huge database, server side cache, store > 180 000 files?

Post by x_mutatis_mutandis_x »


Yeah, we got off topic, and I was feeling guilty as well :P. Good discussion, and I did enjoy the exchange of ideas as well :)
Post Reply