Page 1 of 1
Huge database, server side cache, store > 180 000 files?
Posted: Fri Apr 20, 2012 6:31 pm
by Symthic
Heya.
I have a question about a few things.
I have a static MySQL database with around 60 rows and 340 columns -> 20,400 values.
I need to fetch ALL these values, make a few advanced calculations and print them to a table.
This of course causes performance issues, since I'm getting over 300 page views per minute (5/s) at peak hours.
What's the best way around this?
I did implement a server-side cache. It stores the pages as plain HTML. If no file exists for the current settings (sorting and a few others), it writes a new one; if one is found, it includes that file.
This reduced the PHP work time by 95-99%.
BUT there's a problem with this method: I have around 180,000 different setting combinations for the table.
That means that, over time, the cache would grow to 180,000 files, around 18 gigabytes of HTML files... doesn't sound too good.
Could someone tell me the best way to do this?
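For reference, the cache described above can be sketched roughly like this. This is a minimal sketch, not the poster's actual code; the cache directory, the hashing scheme and the `render_table()` stub are all hypothetical.

```php
<?php
// Minimal sketch of a file-based page cache, assuming one cached HTML
// file per settings combination. All names here are hypothetical.

// Stand-in for the real, expensive rendering (queries + calculations).
function render_table(array $settings): string
{
    return '<table><!-- rendered for '
        . htmlspecialchars(json_encode($settings)) . ' --></table>';
}

function cached_page(array $settings, string $cacheDir): string
{
    // One file per settings combination, keyed by a hash of the settings.
    $file = $cacheDir . '/' . md5(serialize($settings)) . '.html';

    if (is_file($file)) {
        return file_get_contents($file); // cache hit: skip all PHP work
    }

    $html = render_table($settings);     // cache miss: do the work once
    file_put_contents($file, $html, LOCK_EX);
    return $html;
}
```

The hash keeps file names uniform regardless of how many settings there are, which is also what makes the 180,000-combination growth problem visible: every distinct combination becomes one more file.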
Re: Huge database, server side cache, store > 180 000 files?
Posted: Fri Apr 20, 2012 9:30 pm
by califdon
I would be surprised if a database that small caused any problems, even with that much traffic. More likely it's the query you're using and/or the kind of calculations you're referring to. Show us the query and give us at least some idea of what calculations you're performing.
Re: Huge database, server side cache, store > 180 000 files?
Posted: Sat Apr 21, 2012 6:34 am
by Symthic
Yes, the problem is indeed in processing the values.
For almost every number there's an if/switch to select font color, size, column width, etc.
And on top of that, the calculations: for every line (60 total) you have to output, 40 times, different variations of the following, depending on the user's settings: x / y + 60 / z * (ceil(100 / v) - 1)
All the calculations take around 0.2 seconds on the Athlon 255 in my home server. When I publish the site it will be on an average VPS host, where it would take even longer.
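One cheap optimization worth noting for a formula like that: any sub-expression that depends only on per-request settings can be computed once instead of 60 x 40 times. Which of x, y, z, v vary per row versus per request is an assumption here, so treat this as an illustration, not the poster's actual code.

```php
<?php
// Per-cell formula from the post: x / y + 60 / z * (ceil(100 / v) - 1)
function compute_cell(float $x, float $y, float $z, float $v): float
{
    return $x / $y + 60 / $z * (ceil(100 / $v) - 1);
}

// Hypothetical variant: if y, z and v come from the user's settings
// (constant for the whole request) and only x varies per row, the
// second term can be hoisted out of the loop and computed once.
function compute_table(array $xs, float $y, float $z, float $v): array
{
    $constTerm = 60 / $z * (ceil(100 / $v) - 1); // per-request constant
    $out = [];
    foreach ($xs as $i => $x) {
        $out[$i] = $x / $y + $constTerm;
    }
    return $out;
}
```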
Re: Huge database, server side cache, store > 180 000 files?
Posted: Sat Apr 21, 2012 11:19 am
by califdon
I have never faced such an issue, so I am only guessing here, but I think that creating that many HTML files to avoid the calculations would only make the problem worse. If it were my project, I would re-assess the value of having that much variability in a web page. Some things are just not worth doing.
Re: Huge database, server side cache, store > 180 000 files?
Posted: Mon Apr 23, 2012 2:11 am
by Weirdan
340 columns? What kind of data do you store there?
Re: Huge database, server side cache, store > 180 000 files?
Posted: Mon Apr 23, 2012 12:40 pm
by x_mutatis_mutandis_x
340 columns for a table is way too many. It is generally recommended that a table have no more than 20 columns. Re-evaluate the way you store your data and consider normalizing your table. Do that first, before you look into alternative workarounds.
Querying over rows using joins is cheaper than querying over that many columns, because you can index your row data. When you query over the columns, the database first has to fetch the columns' metadata and then the results.
Re: Huge database, server side cache, store > 180 000 files?
Posted: Mon Apr 23, 2012 1:13 pm
by califdon
x_mutatis_mutandis_x wrote:340 columns for a table is way too many. It is generally recommended that a table have no more than 20 columns.
I disagree. There is no magic number. The one and only rule is to normalize your data and whatever number of columns results from that is the right number.
I agree with the rest of your comments.
It also seems to me that 340 columns is very likely to be excessive, but not just because of the number. There are many situations in which a properly normalized set of data will require more than 20 (or any other arbitrary number) columns.
Re: Huge database, server side cache, store > 180 000 files?
Posted: Mon Apr 23, 2012 1:30 pm
by x_mutatis_mutandis_x
califdon wrote:I disagree. There is no magic number.
Which is why I said "recommended", not "required". It's more of a good practice to adopt when constructing database objects.
califdon wrote:There are many situations in which a properly normalized set of data will require more than 20 (or any other arbitrary number) columns.
If the nature of the data demands more columns, split them logically into another table: frequently queried columns in one, less frequently queried columns in the other, with a one-to-one mapping. A table with fewer columns is relatively easier to maintain.
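A sketch of that one-to-one split, with made-up table and column names: the frequently queried columns stay in the main table, the rest move to a side table sharing the primary key.

```sql
-- Frequently queried columns.
CREATE TABLE weapon (
    weapon_id INT PRIMARY KEY,
    name      VARCHAR(64) NOT NULL,
    damage    FLOAT
);

-- Rarely queried columns, one-to-one with weapon.
CREATE TABLE weapon_detail (
    weapon_id INT PRIMARY KEY,
    notes     TEXT,
    FOREIGN KEY (weapon_id) REFERENCES weapon (weapon_id)
);
```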
Re: Huge database, server side cache, store > 180 000 files?
Posted: Mon Apr 23, 2012 1:49 pm
by califdon
x_mutatis_mutandis_x wrote:califdon wrote:There are many situations in which a properly normalized set of data will require more than 20 (or any other arbitrary number) columns.
If the nature of the data demands more columns, split them logically into another table: frequently queried columns in one, less frequently queried columns in the other, with a one-to-one mapping. A table with fewer columns is relatively easier to maintain.
I would be interested in knowing what that advice is based upon. My experience is pretty deep, including teaching and commercial development, and I have not found that to be true. At any rate, it illustrates that there is still much of relational database schema design that is more art than science.
Re: Huge database, server side cache, store > 180 000 files?
Posted: Mon Apr 23, 2012 2:41 pm
by x_mutatis_mutandis_x
califdon wrote:x_mutatis_mutandis_x wrote:califdon wrote:There are many situations in which a properly normalized set of data will require more than 20 (or any other arbitrary number) columns.
If the nature of the data demands more columns, split them logically into another table: frequently queried columns in one, less frequently queried columns in the other, with a one-to-one mapping. A table with fewer columns is relatively easier to maintain.
I would be interested in knowing what that advice is based upon. My experience is pretty deep, including teaching and commercial development, and I have not found that to be true. At any rate, it illustrates that there is still much of relational database schema design that is more art than science.
It wasn't my intention to challenge your experience. Splitting just provides a cleaner design and easier maintainability. For example: if you have a different archival strategy for one set of columns in a table than for the rest, having them in a separate table makes your life easier; likewise when most of the columns contain a null/default value most of the time.
The most likely place you'd see a table with a huge number of columns is when there are columns like 'data1', 'data2', ... In that case it's better to transpose them into rows, like key, data_type, data instead of key, data1, data2, ..., so that when a new data3 comes along you don't have to create another column.
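A minimal sketch of that transposition (all table and column names are made up for illustration):

```sql
-- Wide layout: adding a new "dataN" requires a schema change.
CREATE TABLE stats_wide (
    weapon_id INT PRIMARY KEY,
    data1     FLOAT,
    data2     FLOAT
);

-- Transposed key/value layout: a new data type is just another row.
CREATE TABLE stats (
    weapon_id INT NOT NULL,
    data_type VARCHAR(32) NOT NULL,
    data      FLOAT,
    PRIMARY KEY (weapon_id, data_type)
);
```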
Re: Huge database, server side cache, store > 180 000 files?
Posted: Mon Apr 23, 2012 4:00 pm
by califdon
x_mutatis_mutandis_x wrote:It wasn't my intention to challenge your experience.
No offense taken.
x_mutatis_mutandis_x wrote:It only provides a cleaner design, and easier maintainability. For example: If you have a different archival strategy for a set of columns in a table, than others it makes your life easy by having them in different table; Or when most of the columns contain a null/default value most of the time.
Those are legitimate exceptions, but I always emphasize beginning with general rules, departing from them when special requirements suggest better arrangements. The general rule that I perceive is that a table should have as many columns as it requires, no more, no less, and that there is no recommended arbitrary "maximum" number. In the absence of specific problems that may apply in a particular case, the number of columns should not be a consideration. I still don't see a general advantage for maintainability of additional tables, while admitting that specific projects might have conditions that could provide such an advantage.
x_mutatis_mutandis_x wrote:Most likely you would see a table with a huge number of columns is when there are columns like 'data1', 'data2',.. in which case it's better to transpose them as rows like: key, data_type, data, instead of: key, data1, data2,.. so that when you have a new data3 you do not have to create another column.
Again, in a specific situation, that might be true, but in general, I believe "clean design and easier maintainability" are always best served by adhering to the relationship of a table for every definable entity, which is the essence of the relational model as conceived by Dr. E. F. Codd.
In any case, we are rather hijacking this thread, which I don't want to do any further. I'm enjoying our exchange of ideas and am certainly willing to continue it in another thread, but I think I should make this my last post in this thread.
Cordially,
Don
Re: Huge database, server side cache, store > 180 000 files?
Posted: Mon Apr 23, 2012 4:54 pm
by pickle
Let's assume the database structure is fine.
If the numbers change less often than they are viewed, it would make sense to perform as many of those calculations in advance as possible, so they don't need to be repeated.
Alternatively, you could just send the raw data to the browser and let the client do the work. I'm thinking perhaps XML & XSLT (not sure this kind of logic is possible in XSLT), or JavaScript.
Your cache is a good idea, but you should institute some sort of expiry, e.g. delete all cache files older than a day (or an hour, or whatever is best). That should alleviate any concerns. Maybe find some way to cache only the most requested combinations: if one particular combination gets requested every 3 minutes, cache it; one that gets requested once a week, maybe not.
I will also echo other replies in saying that the calculations you're doing for each request can probably be optimized. Are you doing the math in your query or in PHP? Generally, moving that sort of logic into SQL is quicker (though not always).
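The expiry idea above can be sketched like this; the function name, the `.html` naming scheme and the TTL are assumptions, and in practice you'd run it from cron rather than on every request.

```php
<?php
// Hypothetical sketch of cache expiry: delete cache files older than a
// TTL, so the cache only holds recently requested combinations instead
// of growing toward all 180,000 of them.
function expire_cache(string $cacheDir, int $ttlSeconds): int
{
    $deleted = 0;
    foreach (glob($cacheDir . '/*.html') as $file) {
        if (time() - filemtime($file) > $ttlSeconds) {
            unlink($file);
            $deleted++;
        }
    }
    return $deleted; // number of files removed
}

// e.g. from a cron job once an hour:
// expire_cache('/path/to/cache', 86400); // keep files for at most a day
```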
Re: Huge database, server side cache, store > 180 000 files?
Posted: Tue Apr 24, 2012 9:11 am
by x_mutatis_mutandis_x
califdon wrote:In any case, we are rather hijacking this thread, which I don't want to do any further. I'm enjoying our exchange of ideas and am certainly willing to continue it in another thread, but I think I should make this my last post in this thread.
Yeah, we got off topic, and I was feeling guilty as well. Good discussion; I did enjoy the exchange of ideas too.
