So, now with a good night's sleep and a fresh cup of coffee, let's try this agin.

I'll abandon my attempts to keep it terse.
I have been building a photo system for my family's photos. We're nearing the 20,000 photo mark right now. What I've chosen to build is a system where the user adds filters over time, thus shrinking the set of photos that they would be currently browsing. Something along the lines of, "Show me all the photos from Georgia with VW Beetles and my brother John."
Instead of some tree-style categories that I predetermine, I'm attempting to apply tagging to the system. The tags will be just another one of the filters the user can apply to the collection.
Code: Select all
CREATE TABLE media (
id int UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
filename varchar(128) NOT NULL,
mime_type varchar(128) NOT NULL,
media_type varchar(24) NOT NULL,
title varchar(128),
description text,
md5 varchar(32) NOT NULL,
user_id int UNSIGNED NOT NULL,
roll_id int UNSIGNED NOT NULL,
date datetime,
city varchar(128),
state varchar(128),
country varchar(128),
notes varchar(255),
FOREIGN KEY (user_id) REFERENCES users(id),
FOREIGN KEY (roll_id) REFERENCES rolls(id)
) engine=InnoDB;
Code: Select all
CREATE TABLE tags (
id int UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
name varchar(128) NOT NULL UNIQUE
) engine=InnoDB;
Code: Select all
CREATE TABLE media_tags (
media_id int UNSIGNED NOT NULL,
tag_id int UNSIGNED NOT NULL,
FOREIGN KEY (media_id) REFERENCES media(id),
FOREIGN KEY (tag_id) REFERENCES tags(id)
) engine=InnoDB;
The problem I'm having is with queries to get the intersection of a user-filtered set of media and the set of tags. This case comes up once the user has filtered the media by choosing a few tags. If the user chooses a few tags, then the media should be filtered to only the media where each photo has every tag the user has chosen. This is, of course, in addition to any other filters the user could have chosen, like say, roll_id or date.
I was reasonable successful getting all (and only) the media that matches each of the user's given tags. So, for example, if the user chooses tag 15, we show only the media that has that tag. If the user then chooses tag 6, we show only the media that has both tag 15 and tag 6. The best query I could come up with for this is (and this is the raw sql).
Code: Select all
SELECT m.id AS id FROM media m
INNER JOIN media_tags mt ON m.id=mt.media_id AND mt.tag_id IN (15,6)
GROUP BY m.id HAVING count(*)=2 ORDER BY id
This query is great, since if the user chooses other filters, I can add them in to a where clause, like so:
Code: Select all
SELECT m.id AS id FROM media m
INNER JOIN media_tags mt ON m.id=mt.media_id AND mt.tag_id IN (15,6)
WHERE roll_id=5 AND year(date)>1999
GROUP BY m.id HAVING count(*)=2 ORDER BY id
This means keeping track of the user's choices in an array, and imploding the id's of the tags. I also need to add the count of the array in the query. For user choices, this query is reasonable, since the number of tags they choose will most likely be small.
So, now, once the user is viewing a set of photos that has been filtered to these two tagsI decided (and it may be misguided) that I wanted to show the user a box of "Related" tags. I decided that a tag would be "Related" if every last photo in the current set had that tag. In our case, if the user filtered down to tags 15 and 6, I would expect the related tags box to display 15 and 6 but also maybe one or two other tags that the set happened to share.
So my challenge has been to write a query to give me only the set of tags that appear in each and every photo of a set....a set that has been defined by a couple tags that the user has already chosen. My attempts at this query have not been too successful. The only accurate one that I've got takes 20 seconds or so to run.
Code: Select all
SELECT DISTINCT t.id FROM tags t
INNER JOIN media_tags mt ON t.id=mt.tag_id AND mt.media_id IN
(SELECT m.id AS id FROM media m
INNER JOIN media_tags mt ON m.id=mt.media_id AND mt.tag_id IN (15,6)
GROUP BY m.id HAVING count(*)=2 ORDER BY id)
And just as a sidenote, elsewhere I am displaying the list of all the tags that appear at least once in a given collection of photos. That's a whole lot easier query. But in this case, I'm trying to get just the tags that each and every photo in a defined set has.