Page 1 of 1
Lots of tables or lots of rows?
Posted: Wed Oct 08, 2008 3:58 am
by Kadanis
Just wondering what the consensus is on database design when managing large amounts of data.
The scenario is that clients create marketing campaigns (mircorsites and emails) which are then tracked. All the clicks on the microsites/messages are stored. This can lead to anything from a few data entries per campaign to hundreds of thousands depending on the success.
Is it better to store all the data from every campaign in one huge table with the campaign_id column, or to create an individual table for each campaign?
The system runs on MySQL 5, and currently uses the multiple table concept. However, starting in Jan 09 we have a major re-write/version update coming up and this gives us the opportunity to change things.
The system runs well as it is. As the management of the tables is quite tedious I was considering the change, however I'm worried that having everything in one table and using WHERE statements is going to make the data retrieval really sluggish.
Any ideas/comments appreciated
Cheers
Re: Lots of tables or lots of rows?
Posted: Wed Oct 08, 2008 1:33 pm
by califdon
I always give the same answer to this question. The database schema should always be designed according to well established rules based on the definitions of the entities and their relationships. Period. Well, maybe semicolon. In the case of gigantic data stores (many millions of rows) and high availability requirements (many simultaneous users), it may make sense to split up otherwise homogeneous data in a specific manner to accommodate the principal querying mode. In doing so, the designer should be dead sure that the benefits will outweigh the potential limitations or inefficiencies of splitting up related data.
In the case you cited, it should depend on what the defined scope of your data model is. Are you modeling discrete campaigns that will never be compared against each other or combined? Or are you modeling the universe of campaigns and their interactions? What kind of queries will be run against these data? In my opinion, there is never a logical answer to a question like, Which is better, lots of tables or lots of rows. That sort of question will inevitably lead to useless answers.
Re: Lots of tables or lots of rows?
Posted: Thu Oct 09, 2008 3:57 am
by Kadanis
Hmm, thank-you for the post califdon. You've certainly given me something to think about.
I think we'll take this opportunity to really decide on how this data will be used, and then we can decide on what's best
Cheers

Re: Lots of tables or lots of rows?
Posted: Thu Oct 09, 2008 4:05 am
by onion2k
We do something very similar* in our email marketing application. We eventually decided to go so far as actually having different databases for client data in order to make it easy to backup, replicate, remove if the client left, etc. It also means that we can make things more secure by having MySQL users that don't have any access to data belonging to other databases (although I think you can have table specific permissions for users in MySQL as of version 5... we wrote this thing back on MySQL 3.23).
It still didn't work too well mind, because in one of our client's database there's two tables that need to be joined on a text key where one table has 35,000 rows and the other has 4.8 million rows ... it takes a loooooooong time. I want to denormalise it a bit so it runs faster but I've never found the time.
* Eg exactly the same.
Re: Lots of tables or lots of rows?
Posted: Thu Oct 09, 2008 4:15 am
by Kadanis
My initial ideas are that we move from having a table per campaign to maybe 1 per client. As sometimes clients may want to cross reference campaigns, but in the main I think that the table per campaign may be the best solution for us.
I think my main beef with it is all the tables that sit there with 20 records in them because a campaign didn't do so well, seems pointless splitting them out for that, but then a biggy comes in with the 1million records and would hammer the system if it was in a single table.
Design meeting ahoy !
Re: Lots of tables or lots of rows?
Posted: Thu Oct 09, 2008 4:33 am
by VladSun
What about using vertical partitioning?
http://dev.mysql.com/tech-resources/art ... oning.html
It will keep all of your data in a single table, while the per-campaign-queries will work only on a part of it.