Page 1 of 1
How much information to store in tables
Posted: Fri Aug 19, 2005 8:19 am
by taldos
Regarding attributes in rows, what is good practive when it comes to how many attributes should be placed within a table.
for example, I currently place firstName Lastname and Username in one table called members.
Is it generally preferable to divide elements which are often going to be querried into their own tables for faster searches?
Posted: Fri Aug 19, 2005 8:23 am
by feyd
I will, quite often do something where fields queried often would be a seperate table from those directly related, but not queries so often. Although I'm not absolutely sure it's the "best" route, but it definitely is a route.

Posted: Fri Aug 19, 2005 9:29 am
by timvw
I don't have a strict count for them. I usually leave them the way i found the attributes while i was drawing an ERD..
Thus, if i find 20 attributes that belong to an entity, there will be 20 columns in that table. Although, i've found that after normalization there usually aren't more than +/- 5 attributes that are not in a repeating group.
If you end up performing a lot of joins to get that one column value from another table you might consider adding it ni the original table. Thus, introducing a little of controled redundancy. But, if you are using a decent DBMS you can create a view so you don't have to care about where the data really is coming from..

Posted: Fri Aug 19, 2005 9:58 am
by nielsene
I like my tables to be "narrow" that normally means fewer than 10 columns.
I will commonly split of "satellite" tables (creating a 1:1 mapping) when there is a logical collection of columns that belong together, but aren't commonly needed with the rest of the table. I think this is the same thing as feyd is saying.
My canonical example regards a login table:
Many people have it something like
Code: Select all
CREATE TABLE users (
userid SERIAL PRIMARY KEY, -- SERIAL is PostGreSQL "auto increment" both a type and a generator
username VARCHAR(15) NOT NULL UNIQUE,
hashed VARCHAR(47) NOT NULL,
email VARRCHAR(40) NOT NULL,
firstname VARCHAR(30) NOT NULL,
lastname VARCHAR(40) NOT NULL,
status_name VARCHAR(10) NOT NULL REFERENCES user_statuses ON UPDATE CASCADE ON DELETE RESTRICT,
sig TEXT NOT NULL DEFAULT '',
location TEXT NOT NULL DEFAULT '',
... // more profile type columns
);
I would normally split this up to:
Code: Select all
CREATE TABLE people (
peopleid SERIAL PRIMARY KEY,
firstname VARCHAR(30) NOT NULL,
lastname VARCHAR(40) NOT NULL
);
CREATE TABLE users (
userid SERIAL PRIMARY KEY,
personid INT NOT NULL REFERENCES people ON UPDATE CASCADE ON DELETE RESTRICT
username VARCHAR(15) NOT NULL UNIQUE,
hashed VARCHAR(47) NOT NULL,
status_name VARCHAR(10) NOT NULL REFERENCCES user_status ON UPDATE CASCADE ON DELETE RESTRICT
);
CREATE TABLE user_emails (
userid INT NOT NULL REFERENCES users ON UPDATE CASCADE ON DELETE CASCADE,
email_address VARCHAR(40) NOT NULL,
mail_to BOOLEAN NOT NULL DEFAULT TRUE,
PRIMARY KEY(userid, email_address)
);
CREATE TABLE user_profile (
userid INT NOT NULL REFERENCES users ON UPDATE CASCADE ON DELETE CASCADE
-- all the profile type columns
);
In practice I actually use the username as the primary key for all the user tables as I dislike surrogate keys and user_name fits the bill nicely. I push first/last name out to "people" as I often need to have records for non-users. Email isn't left in the users field as people often have alternate emails, some might be "no-mail" if its just for accepting mail requests from that address, etc. Or sometimes instead of "mail_to" I'll use 'main_addr' and its the only one that recieves mail "official" mail from the site.
The users table is basically hit on every single webpage request for authentication. Therefore its a big win if the DBMS decides to cache it. Keeping the table narrow helps to encourage that and helps to ensure that keeping it cached doesn't push too many other important tabless out of the cache.
normalized
Posted: Fri Aug 19, 2005 10:10 am
by AnarKy
So long as the tables are normalized,
it should not matter. 3rd normal form is good enough
Re: normalized
Posted: Fri Aug 19, 2005 10:16 am
by nielsene
AnarKy wrote:So long as the tables are normalized,
it should not matter. 3rd normal form is good enough
I disagree. First I tend to go for full 5NF, migrating to 6NF as I get enough tools to generate all the constraints/views automatically.
Second, first you normalize, then you tune. When tuning I do NOT de-normalize. However I will split a table (which doesn't break normalization).
All databases have a concept or "Work Memory" or "Sort Memory" etc. If the tables involved in a join fit into the "Work/Sort" memory (across all simultaneous queries) then you "win" the DB doesn't have to swap to disc. That advocates keeping the row "width" low. Yes don't want the preformence overhead of having to do three extra joins every query (even if hiding the decomposition behind a view), but if you don't commonly need those extra columns, separate them out.
Posted: Fri Aug 19, 2005 12:21 pm
by McGruff
You might be interested in
Java Database Best Practices.
Posted: Fri Aug 19, 2005 12:36 pm
by nielsene
I'd be very, very wary of any "Java Database Best Practices" the Java community as a whole doesn't tend to "understand" Relational Databases. (Mainly because of the different models -- Object versus Relational.) The Java community tends to blame the Relational Model for everything.
On a quick glance, that chapter makes at least three glaring mistakes from a Relational model view point: Primary Key!="special index", advoacates using NULLs, and it is strongly against 1:1.
Re: normalized
Posted: Sat Aug 20, 2005 2:14 am
by AnarKy
nielsene wrote:AnarKy wrote:So long as the tables are normalized,
it should not matter. 3rd normal form is good enough
I disagree. First I tend to go for full 5NF, migrating to 6NF as I get enough tools to generate all the constraints/views automatically.
I meant for small apps and inexperienced developers...
going above 3NF is usually too much for some people to handle.
Posted: Sat Aug 20, 2005 2:52 am
by AGISB
If you go for 5NF wouldn't this rule out 1to1 relationships? I thought this actually is ruled out by 3Nf already.
As I am only using MySQL I am mostly using no 1 to 1 relationships. MySQL is pretty fast when it comes to use a lot of tables and data. I just try to use static tables as much as possible as this is a lot faster than dynamic and hd space is cheap.
Posted: Sat Aug 20, 2005 10:07 am
by nielsene
AGISB wrote:If you go for 5NF wouldn't this rule out 1to1 relationships? I thought this actually is ruled out by 3Nf already.
No, nothing in normalzation talks about when to
combine relations. Thus splitting an otherwise already normalized relation won't yield a situation where you are directed to re-combine it. You might have to further normalize the splits if you have non-key dependencies in the satellite after splitting -- but that's only possible if you didn't start from a 4NF starting point.
Posted: Sun Aug 21, 2005 7:02 am
by AGISB
But a 1 to 1 relationship gives you redudant data and even if it is the primary key. And redudant data is talked about in Normalisation. At least how I would see it.
Posted: Sun Aug 21, 2005 9:12 am
by nielsene
its not redundant data is the satellite table is maintained vai a foreing key update/delete constraint.