Page 1 of 1

which is most efficient query?

Posted: Fri Aug 20, 2004 6:01 am
by phait
Hi,

Current situation:
A little theory before we acually do any implementation. Myself and another developer are unsure as to which is the best method in terms of speed to use for storing a number of ids against one record.

We have a table that records stoppages on waterways. Any one stoppage can exist across more than one waterway. We would also need at some point in the future to be able to query a waterway and find all the stoppages on that waterway.

Suggestions:
I have suggested that we store the waterway ids in a separate intermediate table which will record each stoppage against a waterway id. Two columns would exists: waterway_id and stoppage_id. So, if a stoppage is applied to three waterways then three records are created in the intermendiate table for the one stoppage [one to many]. Indexes can then be created on both fields in the intermediate table to aid searching.

The other developer has suggested storing the waterway ids as a csv string in a varchar column and then iterating over that column to find the relationships of waterways to a stoppage. An Index would be applied over the waterways column to help speed the searching.


My concern is that I would rather SQL do the work than php. For me, I think the query will be slowed by having to search through the csv values for each column rather than just searching over a column that will only contain one value, this may however be incorrect and so I would appreciate any feedback on which is more appropriate or a.n.other solution.


TIA

Posted: Fri Aug 20, 2004 8:04 am
by hawleyjr
Your solution is the better way to handle this. You need to think reusability. What if another non-php application needs to get to your data...? Do you expect them to know that they have to loop through data to get a result? Option one is much more of a standard. Google [google]normalization[/google].

Posted: Fri Aug 20, 2004 10:57 am
by lostboy
gotta agree with hawley. its easier to query for a particular point as well as a group...

Posted: Sun Aug 22, 2004 12:05 pm
by McGruff
I'd always go with the properly normalised db first. Consider denormalising later if you really do have performance issues.

http://www.oreilly.com/catalog/javadtab ... r/ch02.pdf

Posted: Sun Aug 22, 2004 12:09 pm
by hawleyjr
McGruff wrote:Consider denormalising later if you really do have performance issues.
Do you really think there are performance issues using normalization? It may take more effort from the database to query but I've never experienced any lag time using multiple many to many relational tables.

Posted: Sun Aug 22, 2004 12:16 pm
by McGruff
If I remember rightly, the phpBB table structure is denormalised to make certain queries more efficient by reducing the number of JOINS. Don't quote me though.

I agree you shouldn't be scared to use JOINS. Only denormalise if there really is a good reason to do so. A properly normalised database is much more flexible during development or if you make changes to the app later on.