algorithm for related articles

kkonline
Forum Contributor
Posts: 251
Joined: Thu Aug 16, 2007 12:54 am

algorithm for related articles

Post by kkonline »

Hi everyone,
I am writing an article manager. At the end of each article page I want to display links to related articles, chosen on the basis of shared tags.

For example:
If article A has the tags love, life and success, article B has the tags love and life, and article C has the tags life and dreams,

then at the end of article A I want to show links to Article B and Article C.

However, I am getting links to Article A, Article B and Article C for the tag "love", and then again links to Article A, Article B and Article C for the tag "life".

I would like suggestions from you all on how to resolve this issue.
The code is too big to go through in detail here, but if you want I'll explain the code I have written.
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Re: algorithm for related articles

Post by Chris Corbyn »

Put the articles into an array while you search and index them by primary key. That way, if the same article matches on a different tag, it simply overwrites its own earlier entry instead of being added again. Either that, or avoid looping over the tags and intersect them directly. Or both :)

Code: Select all

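$related = array();
foreach ($articlePool as $article) {
  // Skip the article being viewed so it never links to itself
  if ($article->getId() == $thisArticle->getId()) {
    continue;
  }
  $sharedTags = array_intersect($thisArticle->getTags(), $article->getTags());
  if (!empty($sharedTags)) {
    // Keyed by ID, so an article that matches again just overwrites itself
    $related[$article->getId()] = $article;
  }
}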
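
For anyone who wants to run the idea end to end, here is a minimal, self-contained sketch. The `Article` class, the `findRelated()` function and the sample data are assumptions made up for illustration (the thread never shows kkonline's actual classes). It skips the article being viewed, keys the matches by ID so nothing is listed twice, and additionally orders them by how many tags they share.

```php
<?php
// Hypothetical stand-in for the article objects used in the thread.
final class Article
{
    private int $id;
    private string $title;
    /** @var string[] */
    private array $tags;

    public function __construct(int $id, string $title, array $tags)
    {
        $this->id = $id;
        $this->title = $title;
        $this->tags = $tags;
    }

    public function getId(): int { return $this->id; }
    public function getTitle(): string { return $this->title; }
    public function getTags(): array { return $this->tags; }
}

/**
 * Returns the articles in $pool that share at least one tag with $current,
 * keyed by article ID and ordered by number of shared tags (highest first).
 */
function findRelated(Article $current, array $pool): array
{
    $byId = [];
    $scores = [];
    foreach ($pool as $article) {
        if ($article->getId() === $current->getId()) {
            continue; // never list the article on its own page
        }
        $shared = array_intersect($current->getTags(), $article->getTags());
        if (count($shared) > 0) {
            // Keying by ID means a repeated article overwrites itself.
            $byId[$article->getId()] = $article;
            $scores[$article->getId()] = count($shared);
        }
    }
    arsort($scores); // sort by score, descending, keeping the ID keys

    $related = [];
    foreach ($scores as $id => $score) {
        $related[$id] = $byId[$id];
    }
    return $related;
}

// Sample data taken from the first post, plus an unrelated article D.
$a = new Article(1, 'Article A', ['love', 'life', 'success']);
$b = new Article(2, 'Article B', ['love', 'life']);
$c = new Article(3, 'Article C', ['life', 'dreams']);
$d = new Article(4, 'Article D', ['travel']);

// $b appears twice in the pool on purpose to show the de-duplication.
$related = findRelated($a, [$a, $b, $c, $d, $b]);
foreach ($related as $article) {
    echo $article->getTitle(), "\n";
}
```

With this data the loop prints Article B and then Article C: B shares two tags with A, C shares one, and neither A itself nor the unrelated D is listed.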
kkonline
Forum Contributor
Posts: 251
Joined: Thu Aug 16, 2007 12:54 am

Re: algorithm for related articles

Post by kkonline »

I use 3 tables: article, tag, and article_tag_xref. Obviously articles go in the article table with a unique id for each one. I then have a tag table in which I store each unique tag with an id. The article_tag_xref table is a simple many-to-many cross-reference table storing an article id and a tag id. For example:

Article table:

Article id   Title
1            Article Number One
2            Article Number Two

Tag table:

Tag id   Article id   wordid
1        34           3
2        43           3
3        21           2
4        23           1

Word id   wordlist
1         love
2         life
3         success

Article_tag_xref table

This is how I link articles and tags:
first I look up the wordid, then the matching articleid in the tag table, and then map that to an article.


So how do I apply what you suggested to the above structure?
scriptah
Forum Commoner
Posts: 27
Joined: Sat Mar 15, 2008 8:58 pm
Location: Long Island, NY

Re: algorithm for related articles

Post by scriptah »

You could use a query to do so, if I understand the problem correctly.

Code: Select all

 
mysql> select * from article;
+----+-------+
| id | name  |
+----+-------+
|  1 | One   | 
|  2 | Two   | 
|  3 | Three | 
|  4 | Four  | 
+----+-------+
4 rows in set (0.00 sec)
 
mysql> select * from tags;
+----+-------+
| id | name  |
+----+-------+
|  1 | Love  | 
|  2 | Peace | 
|  3 | War   | 
|  4 | Hate  | 
+----+-------+
4 rows in set (0.01 sec)
 
mysql> select * from article_tags;
+------------+--------+
| article_id | tag_id |
+------------+--------+
|          1 |      1 | 
|          1 |      2 | 
|          1 |      3 | 
|          2 |      1 | 
|          3 |      1 | 
|          4 |      4 | 
+------------+--------+
6 rows in set (0.00 sec)
 
 
Let's take the article with the id "1" as an example.
It has the tags "1", "2" and "3".

You can see that articles "2" and "3" share the tag "1" with article "1".
So the output should be "Two" and "Three", which are the names of articles "2" and "3".

Code: Select all

 
mysql> SELECT article.name FROM article, article_tags WHERE article_tags.tag_id IN( 
SELECT article_tags.tag_id FROM article_tags WHERE article_id = '1' ) 
AND article.id = article_tags.article_id AND article.id <> '1';
+-------+
| name  |
+-------+
| Two   | 
| Three | 
+-------+
2 rows in set (0.00 sec)
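
One thing to watch with the query above: it returns one row per shared tag, so an article that shares two tags with article "1" would be listed twice, which is the duplication described in the first post. Below is a hedged sketch of one way around that from PHP, grouping by article and counting the shared tags. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database as a stand-in for the MySQL tables; the function name `relatedArticleNames()` and the extra row (2, 2) are made up for the example.

```php
<?php
// Assumption: pdo_sqlite is installed; SQLite stands in for MySQL here.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Same three tables as in the post above.
$db->exec('CREATE TABLE article (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec('CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec('CREATE TABLE article_tags (article_id INTEGER, tag_id INTEGER)');
$db->exec("INSERT INTO article VALUES (1,'One'),(2,'Two'),(3,'Three'),(4,'Four')");
$db->exec("INSERT INTO tags VALUES (1,'Love'),(2,'Peace'),(3,'War'),(4,'Hate')");
// Same rows as above, plus (2, 2) so that article "Two" shares
// two tags with article "One" and would otherwise show up twice.
$db->exec('INSERT INTO article_tags VALUES (1,1),(1,2),(1,3),(2,1),(2,2),(3,1),(4,4)');

/**
 * Names of articles sharing at least one tag with $articleId,
 * each listed once, most shared tags first.
 */
function relatedArticleNames(PDO $db, int $articleId): array
{
    $sql = 'SELECT a.name, COUNT(*) AS shared
            FROM article_tags AS mine
            JOIN article_tags AS theirs ON theirs.tag_id = mine.tag_id
            JOIN article AS a ON a.id = theirs.article_id
            WHERE mine.article_id = :id
              AND theirs.article_id <> :exclude
            GROUP BY a.id, a.name
            ORDER BY shared DESC, a.id';
    $stmt = $db->prepare($sql);
    // Two placeholders on purpose: reusing one named placeholder
    // is not portable across PDO drivers.
    $stmt->execute([':id' => $articleId, ':exclude' => $articleId]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN, 0); // first column: name
}

print_r(relatedArticleNames($db, 1));
```

For article "1" this gives "Two" (two shared tags) followed by "Three" (one shared tag), each exactly once. If you don't need the ranking, adding DISTINCT to the original SELECT should have the same de-duplicating effect.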
 