Page 2 of 2

Posted: Tue Aug 27, 2002 9:33 am
by mikeq
I knew this would get a good discussion going :wink:

As I said in my first post it is more Art than Science when it comes to DB design.
Splitting a table to multiple 1:1 tables doesn't to my knowledge break it either.
Agreed but why bother
I'll concede the point that some "simple" queires become more complicated with multiple tables, but I suspect that the majority of his queries only touch one of the subtables, the core table; so I think its a weak arguement.
Bit of an assumption...


Some RDBMS allow you to ...
Made my own assumption here and assumed he was using MySQL which doesn't give you all these nice features, but having said that DB design should be system independent.
Good relational db design is much more than just applying the rules of normalization. Often its knowing when to break them, too.
If developing a transactional database I have to say it is just about normalisation. If developing a database (Datawarehouse, Datamart) for reporting then it is the opposite end of the spectrum, de-normalised tables, pre-aggregated data, loads of duplicate data everywhere. If trying to develop 1 database to perform both of these functions, oh well...

So in a nutshell, splitting the table is not going to give JPlush76 any great optimisation benefits and it will make his application more complex to code and maintain, so why bother. This scenario is probably not the best to learn good DB design.

Mike

Posted: Tue Aug 27, 2002 10:26 am
by nielsene
I'll bite :) [very long post follows]

I don't think its too much of an assumption to assume that queries involving address/profile are less frequent than the login user table. The o.p's comments on the adress fields early in this thread made it apparent that its not used often and isn't even required data. On most sites how often do you view your own or another member's profile compared to how often you view "regular" content. Even most forums that inject avatars,sigs etc do so at post time not at view time; those that's i've inspected that inject at view time have moved those fields to the signon/user table.

Re: abilities of RDBMS, I'm clueless when it comes to MySQL, but most provide one or more of these tools (OK I just checked mysql's manual and apparently it doesn't offer any of these) If they can't, you can always do it in application logic. If I were a heavy user of MySQL, I'ld probably write some code to autogenerate my "helper cascade queries" as part of my database library. Yes a design should be relatively RDBMS independant, but neither MySQL nor PostGreSQL nor MS SQL nor even Oracle(to my knowledge) is even fully SQL 92 compliant so you're always going to have to use some work arounds. If MySQL doesn't support some feature, you should at least understand why some other system included it. If its functionality sounds useful, figure out how to mimic it.

Re: Normalization. I don't think I've ever seen a large transactional database that was entirely in 3NF (or BCNF, 4NF, or 5NF). One guilty feld is almost always storing zip, city, state in the address table along with streets. zip fully determines city & state. so it should be its own table. This is one place however where efficiency tells you that as you almost never use the street address without the city/state/zip that they should jsut stay together. Database design include normalization to help manage complexity, but must be balanced by efficiency. Normalization tells you nothing about what indices to create for example.

Yes I'll agree that it doesn't give any great benefit, but it may give a small one. It does make the application more complex, but more complex does not always equal more difficult to maintain. And in my mind the best scenario to learn db design is the one you're currently pscyhed for and actively working on. I believe he has a working prototype or similar running on this database, so he can fiddle with different database setups and change a few queries and see how he likes it. For a different project he'd need to design and write all the surrounding code. Sometimes adding a little bit of complexity at one level, simplifies the design at others tremendously.

We've only seen a snippet of his database schema. Regardless of how he decides to go with this table, this debate should help with other parts of his schema.

OK I just finished reading some more about MySQL, so I have a somewhat greater understanding of the framework most PHP users work within. I'm not sure how I would vary my advice now, but I think I agree more with my "debate opponent" than before, if MySQL is the DB used. However, many of the features I mentioned are planned for relatively soon releases (measuring soon by the projected version number, didn't dig to see if they have any target date). I think I can understand equating DB design with Normalization in the realm of MySQL. MySQL is a wonderful tool, but if you want to understand more about DB design you need to know what other systems offer that MySQL doesn't (and what MySQL offers that others don't), then if a missing feature solves a specific problem maybe it provides you with insight on how to implement a similar feature in application logic.

Posted: Tue Aug 27, 2002 11:18 am
by JPlush76
wow, thanks for all the great replies!

I'd definately like to have a solid database fundamental design. One table is DEFANITELY easier to work with but if breaking it up with 3 tables helps me learn better design, I might stick with that.

but lets say I do the 3 tables and when the user signs up the insert has to hit 2 tables... I couldn't make that instert statement in one line could I? Because of having the user_id created, how would the satellite tables know how to reference?

Posted: Tue Aug 27, 2002 11:24 am
by nielsene
Yes that's where it get more complicated. The typical process is

1. Insert into the core table
2. Retrieve the oid of the last inserted row
3. Query the table for the userid using the oid
4. Insert into the satellite tables

I looks like mysql_insert_id() actually combines step 2 and 3 so it would be a very useful function.

Posted: Tue Aug 27, 2002 11:41 am
by JPlush76
ah so I can do the insert into the first table, then do a

Code: Select all

select * from st_user where user_id = mysql_insert_id()
?

and then do a insert in the same way?

Posted: Tue Aug 27, 2002 11:49 am
by nielsene
I think it would look more like

Code: Select all

$query = "INSERT INTO st_users (....) VALUES (....);";
$result = mysql_query($query);
$insertedID = mysql_insert_id($result);
$query = "INSERT INTO st_user_addresses (userid,...) VALUES ($insertedID,...);";
$result = mysql_query($query);

Posted: Tue Aug 27, 2002 11:52 am
by JPlush76
ahhhh gotcha

read up on the manual that is persists on the clients machine until the DB is closed. that looks like a great solution to updating multiple tables. saves alot of logic, nice one

I'll test it out today

Posted: Tue Aug 27, 2002 3:12 pm
by mikeq
And make sure you handle it not inserting into any of your dependent tables, if this is the case you will need to roll back (delete already commited transactions) and display an error to the user to keep all of your dependent tables in sync.

This is where your application logic is critical.

Posted: Tue Aug 27, 2002 3:22 pm
by mikeq
Re: Normalization. I don't think I've ever seen a large transactional database that was entirely in 3NF (or BCNF, 4NF, or 5NF). One guilty feld is almost always storing zip, city, state in the address table along with streets. zip fully determines city & state. so it should be its own table. This is one place however where efficiency tells you that as you almost never use the street address without the city/state/zip that they should jsut stay together.
Can't argue with that, damn :D

The main database I use is Oracle and it takes some thinking to get around the limitations of MySQL (sub-queries, stored procedures, transactions (I know you can use InnoDB? tables), UNION to name a few).

I do agree with a lot of what you say nielsene, good banter :lol:

Posted: Wed Aug 28, 2002 6:37 am
by Johnm
mikeq wrote: The main database I use is Oracle .
Mike, if you don't mind I will keep you in mind as my CPS 541 RDBMS class progresses and I run into problems. At work I use strictly Informix and there is a substantial difference between Informix 7.3 and the Oracle 9i database we will use in class. Most of what I have learned about databases so far has been in a "baptismal by fire" environment here at work.

Direwolf

Posted: Wed Aug 28, 2002 10:29 am
by mikeq
No problem Johnm, if you need any help just let me know.

Mike

Posted: Wed Aug 28, 2002 12:40 pm
by 9902468
I'm lazy reader, and don't know if nielsene already told this but what db are you using? (mainly that is), and how can I retrieve the last insert id from a postgres db? If I begin transaction, do insert, do select currval(sequence name) and commit, shouldn't I have now the right id? Or is there a function to do this too? (Tried to look but didn't find.)
pg_last_oid() is used to retrieve the oid assigned to an inserted tuple (record) if the result resource is used from the last command sent via pg_query() and was an SQL INSERT. Returns a positive integer if there was a valid oid. It returns FALSE if an error occurs or the last command sent via pg_query() was not an INSERT or INSERT is failed.

OID field became an optional field from PostgreSQL 7.2. When OID field is not defined in a table, programmer must use pg_result_status() to check if record is is inserted successfully or not.
I don't have oid's to work with (So I was told atleast) and pg_result_status only gives back the type of transaction and wether it was succesful. So, what is the right way to retrieve that id number?

Thanks

-9902468

Posted: Wed Aug 28, 2002 12:50 pm
by nielsene
I'm using PostGreSQL, but all my tables have an OID (I'm still on a 7.1.x version).

However I had implemented a fake version before I learned about pg_last_oid. Basically as I built my insert string, I would also build the query string so I'd have an "INSERT INTO foo (name,name2) VALUES (value,value2)" and a "SELECT keyfield FROM foo where name=value and name2=value2;" Its not beautiful or anything, but it worked and it should work for you if you're stuck without OIDs....

Posted: Thu Aug 29, 2002 1:05 am
by 9902468
Ok I'll just have to try out something quick and dirty... ;(, but thanks anyway.

-9902468