then, when someone inputs a tag, if the tag already exists, increment the tag weight by 1. That way you can make a neato looking tag cloud... and you don't get duplicate tag entries.
edit: That is... now you won't have an insanely long table, and you still get to keep track of which tags are more popular than others.
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
then, when someone inputs a tag, if the tag already exists, increment the tag weight by 1. That way you can make a neato looking tag cloud... and you don't get duplicate tag entries.
edit: That is... now you won't have an insanely long table, and you still get to keep track of which tags are more popular than others.
You would still have duplicates though? How would you associate the tag to an item_id if you incremented the value to a different item_id instead of adding a record... and for the tag cloud, couldn't you just group by a count of tags?
GeXus wrote:You would still have duplicates though? How would you associate the tag to an item_id if you incremented the value to a different item_id instead of adding a record... and for the tag cloud, couldn't you just group by a count of tags?
You would _not_ still have duplicates, and yes.
when inserting tag: check if tag already exists. If yes, then increment that tag weight by 1, if not then insert new tag into database.
that may cause one extra query for insertion, but you don't have to count through a table that could potentially have hundreds, thousands, or BILLIONS of records.
SELECT count(*) AS `count`, `tag` FROM `tags` GROUP BY `tag` ORDER BY `count` DESC
Something to that effect. Probably not the correct query, but definitely doable.
Last edited by s.dot on Fri Dec 21, 2007 4:53 pm, edited 1 time in total.
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
chalks wrote:that may cause one extra query for insertion, but you don't have to count through a table that could potentially have hundreds, thousands, or BILLIONS of records.
Ok, maybe not billions.
YEAH! If my mother told me once, she told me a billion times: "Don't exaggerate!"
Last edited by califdon on Fri Dec 21, 2007 4:53 pm, edited 2 times in total.
Basically two things are happening at very close (if not the same) times. Which one finishes first (and consequently last) would determine the output of the script, function, etc. It's not a desirable situation to be in as programmers (at least in php) like things to happen in a predictable linear fashion.
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
My site has tagging (although I call them labels). I did it with 3 tables... Article -> article_label_xref -> Label. It is a many to many relationship after all. A label is added to the Label table as a unique string with an autoincremented id, then any article it's used with gets a record in the cross reference table (using article_id, label_id). Counting them so you can build a cloud is pretty easy...
SELECT `ooer_label`.`label`,
COUNT(`ooer_article_label_xref`.`label_id`) AS total
FROM `ooer_label`
INNER JOIN `ooer_article_label_xref` ON `ooer_label`.`label_id` = `ooer_article_label_xref`.`label_id`
LEFT JOIN `ooer_article` ON `ooer_article.`article_id` = `ooer_article_label_xref`.`article_id`
WHERE 1
AND `ooer_article`.`deleted` = 0
GROUP BY `ooer_label`.`label`
ORDER BY total DESC
LIMIT 20
The LEFT JOIN and WHERE stuff isn't really necessary, it just makes sure tags that have been used on deleted articles aren't counted. If you were to remove the tags from the cross reference when something's deleted you could get rid of those bits.
A "has and belongs to many" relationship typically has a join table. Don't worry that it's super long, DB's are cool with that. Besides, the join table is only storing 2 integer columns, which is trivial.
I'd let MySQL do the heavy lifting for the tag cloud (with caching, of course)
I was going to do a tag system in the same table as what's being tagged. But how would I select all rows where the keywords/tag field has the searched-for-tag(s).
Table would look something like:
foo
bar
tags
tags would have a string like: "this,and,that,foo,bar".
But then how would I be able to select foo and bar where tags has $_GET['tags']?
Like your other thread, normalization is the way to go. Your just going to make things harder for you in the end and more than likely require more effort going down this path.