Forum Searching - Relating Posts

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

User avatar
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Re: Forum Searching - Relating Posts

Post by superdezign »

My head is wrapped up in table rows as objects rather than as data, so I always design my databases in the interest of becoming objects after I fetch the information. So, for me, it's easier to have a post as one type of object and a thread as another. To me, that way is easier, but that's based on the way that I program.

If your way is easier, then nevermind my post. Developers should develop whatever way helps them work the fastest. :D
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Forum Searching - Relating Posts

Post by Eran »

Except the latency of having to join the table to itself, are there actually any other disadvantages to my structure? I guess you could argue that it's disorganised..
It's highly unmaintainable, as you've encountered yourself when you tried to add some aggregate data. Also, all those subqueries you are running are potentially costly. That query would have looked much better with a normalized schema.
User avatar
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Re: Forum Searching - Relating Posts

Post by superdezign »

pytrin wrote:
Except the latency of having to join the table to itself, are there actually any other disadvantages to my structure? I guess you could argue that it's disorganised..
It's highly unmaintainable, as you've encountered yourself when you tried to add some aggregate data. Also, all those subqueries you are running are potentially costly. That query would have looked much better with a normalized schema.
Oh, and this. :3
User avatar
jackpf
DevNet Resident
Posts: 2119
Joined: Sun Feb 15, 2009 7:22 pm
Location: Ipswich, UK

Re: Forum Searching - Relating Posts

Post by jackpf »

pytrin wrote:
Except the latency of having to join the table to itself, are there actually any other disadvantages to my structure? I guess you could argue that it's disorganised..
It's highly unmaintainable, as you've encountered yourself when you tried to add some aggregate data. Also, all those subqueries you are running are potentially costly. That query would have looked much better with a normalized schema.
Wouldn't I require all those subqueries (and more) with a normalised structure though?
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Forum Searching - Relating Posts

Post by Eran »

Some of the subqueries and IF conditions (which by the way, I almost never find the need to use, which makes me wonder since you have several of those in one query) could be exchanged for simple aggregate functions such as SUM() and COUNT(). I thought about trying to rewrite that query to show that.. but it would take too much effort to understand exactly what you were trying to achieve. We could try reconstructing it together if you explained what's going on in there.
User avatar
jackpf
DevNet Resident
Posts: 2119
Joined: Sun Feb 15, 2009 7:22 pm
Location: Ipswich, UK

Re: Forum Searching - Relating Posts

Post by jackpf »

Ok, well I have one table, called `Forum`.

[sql]CREATE TABLE IF NOT EXISTS `Forum` (  `ID` int(11) NOT NULL AUTO_INCREMENT, #self explanatory  `Type` varchar(30) NOT NULL, #post type - either post, thread, or forum  `Forum` int(11) NOT NULL, #forum id  `Thread` int(11) NOT NULL, #thread id  `Subject` varchar(75) NOT NULL, #self explanatory  `Post` longtext NOT NULL, #self explanatory  `Author` varbinary(15) NOT NULL, #self explanatory  `Date` int(11) NOT NULL, #self explanatory  `Edit` varchar(30) NOT NULL, #last edit timestamp  `Stats` int(11) NOT NULL, #view counter  `Status` varchar(255) NOT NULL, #permissions and so on  `Mod` varchar(255) NOT NULL, #moderators  `Options` int(11) NOT NULL, #an options bitfield, for parsing  PRIMARY KEY  (`ID`),  KEY `Type` (`Type`),  KEY `Forum` (`Forum`),  KEY `Thread` (`Thread`)) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=2856 ;[/sql]

And here's my attempt at commenting the query:

[sql]SELECT`F`.*, #selecting all thread fieldsCOUNT(`P`.`ID`) AS `PostCount`, #the number of posts the thread hasIF(`F`.`Type`='forum', #if it's a forum, don't select tracking stats with a thread id, and sum them up, otherwise select the thread's(SELECT SUM(`D`.`PostCount`) FROM `Forum_Data` `D` WHERE `D`.`Alias`='jackpf' AND `F`.`ID`=`D`.`Forum`), (SELECT `D`.`PostCount` FROM `Forum_Data` `D` WHERE `D`.`Alias`='jackpf' AND `F`.`Forum`=`D`.`Forum` AND `F`.`ID`=`D`.`Thread`)) AS `Forum_Data_PostCount`,IF(`F`.`Type`='forum', #if it's a forum, sum up its threads' stats. otherwise, just its statsSUM(`P`.`Stats`), `F`.`Stats`) AS `Stats`,MAX(`P`.`ID`) AS `LastPost` FROM `Forum` `F` #the thread's last post's id INNER JOIN `Forum` `P` ON IF(`F`.`Type`='forum', `F`.`ID`=`P`.`Forum` /*OR `F`.`ID`!=`P`.`Forum`*/, `F`.`ID`=`P`.`Thread`/*)*/ AND `P`.`Type` IN('thread', 'post')) #this just joins the forum's/thread's posts on, for stats and stuff WHERE `F`.`Forum`='2009' OR SUBSTRING_INDEX(`F`.`Status`, ':', -1)=2 #the first condition selects only threads in this particular forum, the second selects all "announcements" GROUP BY `F`.`ID` #... ORDER BY IF(`F`.`Type`='forum', 0, 1), IF(SUBSTRING_INDEX(`F`.`Status`, ':', -1)>=1, 1, 2), `LastPost` DESC; #orders child forums, then stickies, then all normal threads[/sql]

I don't really see anything too wrong with my current system. But feel free to analyse it or whatever if you want :)

Cheers,
Jack.
User avatar
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Re: Forum Searching - Relating Posts

Post by superdezign »

I'm lost... Isn't a forum just a collection of threads, and a thread just a collection of posts?

Your forum has a subject, post, thread, and author. I don't get it. :|
User avatar
jackpf
DevNet Resident
Posts: 2119
Joined: Sun Feb 15, 2009 7:22 pm
Location: Ipswich, UK

Re: Forum Searching - Relating Posts

Post by jackpf »

Wait...well, the whole thing is a parent board/forum. The parent board has several forums, but because I use the same table for everything, not all fields are required for all types. For example, the `Author` and `Thread` fields would be left blank for a forum.

But yeah, this is how it works out:
tmp.jpg
tmp.jpg (36.04 KiB) Viewed 689 times
Isn't a forum just a collection of threads, and a thread just a collection of posts?
Yeah, but a forum can also be a collection of other forums.
User avatar
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto
Contact:

Re: Forum Searching - Relating Posts

Post by John Cartwright »

Heres the general direction a relational database will take you in (code below). You should be treating forums, threads, and posts as separate entities as superdezign hinted at.

Code: Select all

CREATE TABLE IF NOT EXISTS `forums` (
  `id` bigint(12) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  `created` datetime NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
 
CREATE TABLE IF NOT EXISTS `threads` (
  `id` bigint(12) NOT NULL AUTO_INCREMENT,
  `forum_id` bigint(12) NOT NULL,
  `name` varchar(255) NOT NULL,
  `created` datetime NOT NULL,
  PRIMARY KEY (`id`),
  KEY `forum_id` (`forum_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
 
CREATE TABLE IF NOT EXISTS `posts` (
  `id` bigint(12) NOT NULL AUTO_INCREMENT,
  `thread_id` bigint(12) NOT NULL,
  `user_id` bigint(12) NOT NULL,
  `text` text NOT NULL,
  `created` datetime NOT NULL,
  PRIMARY KEY (`id`),
  KEY `thread_id` (`thread_id`,`user_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
 
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Forum Searching - Relating Posts

Post by Eran »

I don't really see anything too wrong with my current system. But feel free to analyse it or whatever if you want
Well... I meant you explained what exactly you were trying to achieve with that query, not display it again. What are all the bits and pieces of information you're trying to pull with this. Then we could try building something using a normalized structure (such as the one john suggested).
not all fields are required for all types
That's usually a very strong indicator you are using the same table for unrelated data structures.
User avatar
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Re: Forum Searching - Relating Posts

Post by superdezign »

Code: Select all

CREATE TABLE `forums` (
  `id` UNSIGNED int NOT NULL AUTO_INCREMENT,
  `parent_id` UNSIGNED int DEFAULT 0,      # 0 = is parent; 1+ = has parent
  `title` varchar(64) NOT NULL,
  `description` text,
  `date_created` datetime NOT NULL,
PRIMARY KEY(`id`));
 
CREATE TABLE `threads`(
  `id` UNSIGNED int NOT NULL AUTO_INCREMENT,
  `forum_id` UNSIGNED int NOT NULL,
  `date_created` datetime NOT NULL,
PRIMARY KEY (`id`));
 
CREATE TABLE `posts` (
  `id` UNSIGNED int NOT NULL AUTO_INCREMENT,
  `thread_id` UNSIGNED int NOT NULL,
  `author_id` UNSIGNED int NOT NULL,
  `content` text,
  `date_posted` datetime NOT NULL,
PRIMARY KEY (`id`));
Extending on ~John's tables, using my own preference of typing and naming. :P
Also, I made it possible to have parent and child forums. Top parents have a parent_id of 0. Children have a parent_id that is equivalent to the id of their parent forum.

Get all top forums:

Code: Select all

SELECT * FROM forums WHERE parent_id = 0 ORDER BY title ASC
Get children of a forum:

Code: Select all

SELECT * FROM forums WHERE parent_id = FORUM_ID ORDER BY title ASC
Get forum data:

Code: Select all

SELECT * FROM threads LEFT JOIN forums WHERE forums.id = threads.forum_id ORDER BY date_created DESC
Get thread:

Code: Select all

SELECT * FROM posts LEFT JOIN threads WHERE threads.id = posts.thread_id ORDER BY posts.date_posted ASC
Get OP:

Code: Select all

SELECT * FROM posts LEFT JOIN threads WHERE threads.id = posts.thread_id ORDER BY posts.date_posted ASC LIMIT 1
Most of the data handling would be done in PHP, but your database would be normalized. It'd also make it easier for us to help you with your queries. :P
User avatar
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto
Contact:

Re: Forum Searching - Relating Posts

Post by John Cartwright »

Your missing a several keys from the database structure you posted 8O
User avatar
jackpf
DevNet Resident
Posts: 2119
Joined: Sun Feb 15, 2009 7:22 pm
Location: Ipswich, UK

Re: Forum Searching - Relating Posts

Post by jackpf »

Hmm....

Yeah...but, hmm....idk.

I would actually require more joins than i currently have. Eg, I do not need a join for fetching a thread, because the OP is the thread itself...

I'm still not convinced :P
User avatar
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto
Contact:

Re: Forum Searching - Relating Posts

Post by John Cartwright »

And here I am thinking you have convinced yourself after writing queries such as

[sql]SELECT `F`.*,P.*, COUNT(`P`.`ID`) AS `PostCount`, IF(`F`.`Type`='forum', (SELECT SUM(`D`.`PostCount`) FROM `Forum_Data` `D` WHERE `D`.`Alias`='jackpf' AND `F`.`ID`=`D`.`Forum`), (SELECT `D`.`PostCount` FROM `Forum_Data` `D` WHERE `D`.`Alias`='jackpf' AND `F`.`Forum`=`D`.`Forum` AND `F`.`ID`=`D`.`Thread`)) AS `Forum_Data_PostCount`, IF(`F`.`Type`='forum', SUM(`P`.`Stats`), `F`.`Stats`) AS `Stats`, MAX(`P`.`ID`) AS `LastPost` FROM `Forum` `F`#LEFT OUTER JOIN `Forum` `P` ON IF(`F`.`Type`='forum', `F`.`ID`=`P`.`Forum`, `F`.`ID`=`P`.`Thread`) AND `P`.`Type` IN('thread', 'post')INNER JOIN `Forum` `P` ON IF(`F`.`Type`='forum', `F`.`ID`=`P`.`Forum` /*OR `F`.`ID`!=`P`.`Forum`*/, `F`.`ID`=`P`.`Thread`/*)*/ AND `P`.`Type` IN('thread', 'post')) WHERE (`F`.`Forum`='2009' OR SUBSTRING_INDEX(`F`.`Status`, ':', -1)=2)AND (`F`.`Subject` LIKE '%search_term%' OR `F`.`Post` LIKE '%search_term%' OR `P`.`Subject` LIKE '%search_term%' OR `P`.`Post` LIKE '%search_term%')AND ((`F`.`Type`='forum' AND `F`.`ID`!=`F`.`Forum`) OR `F`.`Type`='thread')GROUP BY `F`.`ID` ORDER BY IF(`F`.`Type`='forum', 0, 1), IF(SUBSTRING_INDEX(`F`.`Status`, ':', -1)>=1, 1, 2), `LastPost` DESC;[/sql]

I can't wait to see this in action with a relatively large database :P (I won't even touch on it's maintainability)

If your serious and you do not see the benefits of a normalized database, I would highly recommend you grab a book on relational databases.
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Forum Searching - Relating Posts

Post by Eran »

I would actually require more joins than i currently have. Eg, I do not need a join for fetching a thread, because the OP is the thread itself...
You would however eliminate most dependent subqueries and other clutter. Maintainability + performance against simplified schema. No dilemma for me.
Post Reply