Multiple tables or single?

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

Post Reply
alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg

Multiple tables or single?

Post by alex.barylski »

I'm working on a project right now that has a weird schema (at least to me it seems amatuerish).

Basically It's a business directory with each business listed in several tables

Code: Select all

Provinces, Cities, Major Category, Minor Category, Sub-Category
Essentially 5 levels. Would a table such as:

Code: Select all

pkid, ppid, name
Be an efficient approach considering there are millions of records?

Where an example might be:

Code: Select all

 
1, 0, Manitoba
2, 1, Winnipeg
3, 2, Business & Professional
4, 3, Advertising
5, 4, Internet
 
A breadcrumb for such a structure might look like:

Code: Select all

Manitoba > Winnipeg > Business & Professional > Advertising > Internet
I believe the way it is setup now, is something like

Code: Select all

Major:
pkid, name
 
Minor:
pkid, major_id, name
 
Sub:
pkid, major_id, minor_id, name
This seems awfully complex and not very sensible but I'm wondering if maybe it's more efficient or something??? When dealing with millions of records?

Seems it would be fast enough to use a single table and depending on the ID of the record selected working backwards to determine the parents would be fast as well?

My only concern right now, is that businesses can be listed under multiple categories and that is perhaps why they chose this multiple table schema.

Comments?
User avatar
califdon
Jack of Zircons
Posts: 4484
Joined: Thu Nov 09, 2006 8:30 pm
Location: California, USA

Re: Multiple tables or single?

Post by califdon »

To me it is always proper to use normalized tables, at least as your beginning design. If, as you develop a project, you identify specific problems that may be solved by denormalizing, you might analyze the trade-offs. But there are several costs associated with denormalized data, such as storage space and simplicity of query logic. With today's technology, millions of records should not be an issue. I would have to see evidence of degraded performance before I would consider denormalizing.
Post Reply