Database Design [Normalisation]

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

Post Reply
User avatar
jayshields
DevNet Resident
Posts: 1912
Joined: Mon Aug 22, 2005 12:11 pm
Location: Leeds/Manchester, England

Database Design [Normalisation]

Post by jayshields »

Hi guys,

At the moment I regularly update a website which buys and sells cars. It is set up all wrong (I did it years ago), and now I've decided I need to redesign it.

First of all, I need to redesign the database, but I just don't know where to start; designing databases isn't my strong point.

I have about 200 vehicles to put in a database.

Should I make a seperate table for each different make of vehicle? Or just dump every vehicle in one table?

I know very little about normalisation (nor do I even know how to spell it, lmao (is it a z?)), but as far as I know a table should never contain the same value in a field twice, so thinking on that logic, I should have a seperate table for each make, but then it leads me to think I will need a seperate table for each model of vehicle, because each make will probably have more than one of atleast one model. Also, there are some prices likely to be the same.

The entries I will need are: make, model, image filename, year, registration, extra details, price; but I am stumped on which I would make the primary key, so an ID number would be logical, but if I have seperate tables for makes, and have an auto incremented primary key ID field, the numbers will clash across the tables.

I hope you understand what I am trying to say and can point me in the right direction.

Thanks alot.
User avatar
pickle
Briney Mod
Posts: 6445
Joined: Mon Jan 19, 2004 6:11 pm
Location: 53.01N x 112.48W
Contact:

Post by pickle »

Normalization (spelled with a 'z' in Canada and UK, with an 's' in US as far as I know) is about storing as little information as possible. So, if you 've got a car with two prices (for some reason), then rather than having two entries with all fields being duplicated, normalization would say to put the price into a new table, and tie them together with the id number.

So, if you regularly have cars with two different values for a particular field, you should move that field into its own table. Other than that, you don't need to normalize anything. It makes sense to me to put one entry per car in your table, with an auto incrementing id number for each entry.
Real programmers don't comment their code. If it was hard to write, it should be hard to understand.
User avatar
jayshields
DevNet Resident
Posts: 1912
Joined: Mon Aug 22, 2005 12:11 pm
Location: Leeds/Manchester, England

Post by jayshields »

first off - thanks for your input.

but i didnt really get your point. :oops:

in my post i didnt mean that a car would have 2 seperate prices, i was just underlining the fact that i dont really understand normalisation by saying that atleast 2 cars in the database will probably have the same price, not 2 prices each.

the extra details field would be the only field that would need to take more than one entry, but personally i cant see the point in making an extra table for it, especially since the amount of entries to be made will be different each time, so i wouldnt know how many fields to put in the extra details table if i made one.

my idea for that field would be to store the values like this: value1<br>value2<br>value3<br> and so on, because that would be the way it will be outputted.

you didnt answer my original question either, do you reckon i should make a seperate table for each make, or just use one table?

sorry for sounding like a bit of a bast*rd; i just cant think of any other way to word it! :)
cmitcham
Forum Newbie
Posts: 4
Joined: Tue Apr 06, 2004 8:46 pm

Post by cmitcham »

sounds like one table with the fields you mention to me. you definately want the unique auto-incremented id field you mentioned.

one example of a reason for a seperate model table would be if the image was going to be generic, ie all the silverado pickups for sell showed the same image. then a seperate table would let you store that generic image one time only. but i'm guessing that your image is of the specific vehicle being sold, so it, like all the fields you mention, do fit in a single table.

facts like joe's car and betty's van selling for the same amount do not justify a table holding common prices to be joined into.

if (when) you expand and do want additional tables, every table should have a unique auto-incrementing id, and they can even all be named "id". sql and the computer will not be confused, though you and i might. so you call them user_id in the users table, model_id in the model table, and junk_id in the junk table.

if your extra_details starts looking like alot of repetition, that would be a place to look for adding a table.

calvin.
User avatar
jayshields
DevNet Resident
Posts: 1912
Joined: Mon Aug 22, 2005 12:11 pm
Location: Leeds/Manchester, England

Post by jayshields »

thanks for your input. i think i am going to go for just one big table now judging on the points you made. thanks again.
User avatar
jwalsh
Forum Contributor
Posts: 202
Joined: Sat Jan 03, 2004 4:55 pm
Location: Cleveland, OH

Post by jwalsh »

I disagree. Look into 2nd or 3rd normal form.

I would in the least map your makes into it's own table, and your models into its own table, as well as the owner.

Code: Select all

OWNER TABLE
-----------------
UserID AutoNumber Pri Key
Name
OtherUserDetails

CAR TABLE
-----------------
CarId AutoNumber Pri Key
UserID (Maps to Owner Table to show who ownes the car) (Foreign Key)
ModelID (Maps to Model Table to show car model) (Foreign Key)
Condition (Maps to Condition Table) (Foreign Key)
Year
Color
Price
OtherCarDetails

MAKE TABLE
---------------
MakeID AutoNumber Pri Key
Name
OtherMakeDetails

MODEL TABLE
-----------------
ModelID AutoNumberPriKey
MakeID (Maps to Make Table to show manufacturer) (Foreign Key)
OtherMakeDetails

CONDITION TABLE
----------------------
ConditionID AutoNumberPriKey
Name
User avatar
pickle
Briney Mod
Posts: 6445
Joined: Mon Jan 19, 2004 6:11 pm
Location: 53.01N x 112.48W
Contact:

Post by pickle »

Like I said, normalization's goal is to store as little data as possible. So, if you've got cars of 4 different makes, you could put the make in it's own table, then refer to it in the main table:

Code: Select all

Main table:
Entry 1:
   Model: T
   Make_id: 1
   Owner: Henry

Entry 2:
   Model: Mustang
   Make_id: 1
   Owner: Joe

Entry 3:
   Model: 911
   Make_id: 2
   Owner: Bismark


Make table:
Entry 1:
   id: 1
   name: Ford

Entry 2:
   id: 2
   name: Porsche
How much you should normalize is really up to you. I don't think it would be improper to have one table with all the data in it. From the way you've described your situation, this is how I'd probably do it. If you ever want to be able to store different values for the same property of a single car, though, that should be broken out into its own table.
Real programmers don't comment their code. If it was hard to write, it should be hard to understand.
User avatar
jayshields
DevNet Resident
Posts: 1912
Joined: Mon Aug 22, 2005 12:11 pm
Location: Leeds/Manchester, England

Post by jayshields »

the thing is. theres no model details, no make details, no owner.

the only fields are the ones ive stated: make, model, image filename, year, registration, extra details, price.

theres about 20 different makes, and about 200 cars. the same model for a particular make might appear twice or three times.

im pretty sure i should just keep it as one big table or seperate tables for the makes. the only real thing holding me back from just going with the make tables is that the table structures for each make will be identical, with the table name being the only difference - so in my opinion, i can see no real point in it.

SELECT * FROM cars WHERE make = 'Alfa Romeo';

is almost as easy as:

SELECT * FROM alfa_romeo;

and saves the hassle of having to create 20 or so tables with exactly the same structure.
User avatar
andre_c
Forum Contributor
Posts: 412
Joined: Sun Feb 29, 2004 6:49 pm
Location: Salt Lake City, Utah

Post by andre_c »

just to clarify...

everyone agrees that you shouldn't make a separate table for each make
nobody in their right mind would think otherwise

now something to consider is having a table of makes and a table of cars, as was suggested already
with each row on the cars table having a MakeID field pointing to the correct record on the makes table

that's the very basic idea behind normalization
timvw
DevNet Master
Posts: 4897
Joined: Mon Jan 19, 2004 11:11 pm
Location: Leuven, Belgium

Post by timvw »

deltawing
Forum Commoner
Posts: 46
Joined: Tue Jun 14, 2005 2:55 pm

Post by deltawing »

If your happy just to have that information, then that's fine. I don't see any reason for more than one table. But if you think there's a chance you might want to expand the amount of information in the future, then you should maybe see about using multiple tables.

For example, what if, in the future, you want to include what country a car is from. You could just add a field to your one table to specify the country. Or, if you had a table of makes, referenced by id in the table of models, then you could just add a country field to the "make" table. Ok, so that's just a trivial example, and there wouldn't actually be any problem with just having a country field in the main table. But what about if the data gets more complex?

You should, just to make sure your adopting the right plan, ask yourself questions. For example, "Can a model change its name?" "Could one model have more than one make?" Now the answer to the second question might possibly be yes, because its possible that two companies will make a car cooperatively. I can't think of an example right now, but I'm sure you get the point. If you decide, that it is necessary to decide the answer is yes, then that means you have a many-many relationship between makes and cars, which is where it all gets a bit complecated, and you would need intermediate tables.

I don't actually think it is necessary to answer that question yes (as I said, I can't think of an example), but I'm just illustrating a point of the sort of questions you should be asking yourself. But on the whole, if you think it is very unlikely that your going to expand the amount of information, and there aren't any of those confusing relationships, just go with the one table. Its not perfect normalisation, but perfect normalisation isn't always the most practical choice.
User avatar
jayshields
DevNet Resident
Posts: 1912
Joined: Mon Aug 22, 2005 12:11 pm
Location: Leeds/Manchester, England

Post by jayshields »

nice point deltawing.

but in reference to your point "Will a model ever change its name?", I dont quite understand your point. I mean, if I add a car to the database, and then I realise I made a mistake when typing in the model name, I would just edit the field, if I had a seperate table for the model it would be just as easy. obviously you dont mean it like that but ho hum.

i see what you mean by the table for the makes and then using the make id in the main table but i cant see any real point, i would probably use a HTML form for adding data to the database, and I can see myself making a drop down box with all the makes in it. so if i need to use a new make i will just add it to the drop down box, but the chances of adding a new make are very slim.

thanks for all the input guys.
Post Reply