Need some MySQL advice?!
Moderator: General Moderators
- Pyrite
- Forum Regular
- Posts: 769
- Joined: Tue Sep 23, 2003 11:07 pm
- Location: The Republic of Texas
- Contact:
Need some MySQL advice?!
I have a survey script that puts the results into a MySQL table. There are 107 questions on the survey with different datatypes, so therefore I have a table with 107 columns for the results to go into each time somebody takes the survey.
Now, another man I work with thinks I should just have like 3 columns, one for the question # and one for the answer. So instead of having one record for one person with 107 columns, have 107 records with only 3 columns. He somehow thinks that there won't be any speed difference doing it this way, but I disagree and think it would be much slower to select 107 records, when you just select one with 107 fields.
This is what I told him in an email.
"Because there are over 100 questions in that survey, do you think it would
be faster that way? Because everytime someone takes the survey, like 107
records would be created. And I think MySQL will insert columns faster than it will records (I'll have to check into that). But also, for our survey
stats/reports that I make .. selecting over 100 records just for one survey
will take ages. Imagine if 100 students take this survey, and Jody wants to
see a report of the averages for all, that means my query has to select over 10, 000 records."
And this is what he had to say about it:
"I doubt if there's a significant difference in the data retrieval time --
since the data is normally cached (unless the system runs out of cache
space) all it has to do is the very fast logical/calculational operation to
find where the data is located and go to the data locations in RAM/cache and
get it.
The advantage of the vertical structure rather than the horizontal is
conceptual clarity and flexibilty. That is, if next year there 150
questions, the tables don't need to be modified, or if there are only 75,
again there's no need to modify tables and no waste space if they're not
modified. If additional questions are added, the new set can be handled
without table modification. Generally, the "wide/flat" structure is
inferior to the "tall" structure."
So is he right about all that or am I? Just need some input from some other MySQL geeks on this debate. Of course, if I go his route, I would have to make the 3 columns of type BLOB or something to be able to handle any of the data it stores, as with my way, each field is its own datatype to match the survey question. And I think I read in the MySQL manual that you using binary datatypes for fields is much slower?
Now, another man I work with thinks I should just have like 3 columns, one for the question # and one for the answer. So instead of having one record for one person with 107 columns, have 107 records with only 3 columns. He somehow thinks that there won't be any speed difference doing it this way, but I disagree and think it would be much slower to select 107 records, when you just select one with 107 fields.
This is what I told him in an email.
"Because there are over 100 questions in that survey, do you think it would
be faster that way? Because everytime someone takes the survey, like 107
records would be created. And I think MySQL will insert columns faster than it will records (I'll have to check into that). But also, for our survey
stats/reports that I make .. selecting over 100 records just for one survey
will take ages. Imagine if 100 students take this survey, and Jody wants to
see a report of the averages for all, that means my query has to select over 10, 000 records."
And this is what he had to say about it:
"I doubt if there's a significant difference in the data retrieval time --
since the data is normally cached (unless the system runs out of cache
space) all it has to do is the very fast logical/calculational operation to
find where the data is located and go to the data locations in RAM/cache and
get it.
The advantage of the vertical structure rather than the horizontal is
conceptual clarity and flexibilty. That is, if next year there 150
questions, the tables don't need to be modified, or if there are only 75,
again there's no need to modify tables and no waste space if they're not
modified. If additional questions are added, the new set can be handled
without table modification. Generally, the "wide/flat" structure is
inferior to the "tall" structure."
So is he right about all that or am I? Just need some input from some other MySQL geeks on this debate. Of course, if I go his route, I would have to make the 3 columns of type BLOB or something to be able to handle any of the data it stores, as with my way, each field is its own datatype to match the survey question. And I think I read in the MySQL manual that you using binary datatypes for fields is much slower?
-
microthick
- Forum Regular
- Posts: 543
- Joined: Wed Sep 24, 2003 2:15 pm
- Location: Vancouver, BC
-
microthick
- Forum Regular
- Posts: 543
- Joined: Wed Sep 24, 2003 2:15 pm
- Location: Vancouver, BC
Well, ok, if it's just for this one survey, then doing it your way is fine. But this does mean hard-coding your questions into your survey, right?
But don't you think that at some point in the future you may be able to reuse this survey code you're writing now? Maybe for this client, maybe for another, either way, it'll save you time. And you can still bill for the time you WOULD have spent.
But don't you think that at some point in the future you may be able to reuse this survey code you're writing now? Maybe for this client, maybe for another, either way, it'll save you time. And you can still bill for the time you WOULD have spent.
why not just try both methods, see which is faster, and stick with it. i think that's the only way you are ever gonna know and settle the argument.
Personally, i think your method would be best in this pyrite, as it just seems like it would be faster to search one records, instead of making 107 different connections, disconnections, queries, and frees...
just my 2 cents
Personally, i think your method would be best in this pyrite, as it just seems like it would be faster to search one records, instead of making 107 different connections, disconnections, queries, and frees...
just my 2 cents
- Pyrite
- Forum Regular
- Posts: 769
- Joined: Tue Sep 23, 2003 11:07 pm
- Location: The Republic of Texas
- Contact:
Suppose I go his route. Doesn't MySQL have a way to INSERT multiple records at once?? I'm thinking when someone takes the survey, I'd have to call INSERT with a loop or something to INSERT 107 records, but I think MySQL has a solution for that?
Figured, someone here would know about SELECT'ing loads and stuff.
Figured, someone here would know about SELECT'ing loads and stuff.
Of course it has:Pyrite wrote:I'm thinking when someone takes the survey, I'd have to call INSERT with a loop or something to INSERT 107 records, but I think MySQL has a solution for that?
Code: Select all
mysql> use test;
Database changed
mysql> create table tst(a int,b int);
Query OK, 0 rows affected (0.22 sec)
mysql> insert into tst values(1,2),(3,4);
Query OK, 2 rows affected (0.23 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> select * from tst;
+------+------+
| a | b |
+------+------+
| 1 | 2 |
| 3 | 4 |
+------+------+
2 rows in set (0.23 sec)lol... that's why i said i wouldn't WANT to go the other guy's route, is because that's exactly what you would be doing based on the description..Infolock, why would you issue 107 queries to get 107 rows? lol
oops... i see now, i just did it again. I meant making more queries than you needed to get those rows when it could just be done with one query...
*smack head* no wonders i failed my cisco final today