search algorithm



rhecker
Forum Contributor
Posts: 178
Joined: Fri Jul 11, 2008 5:49 pm

search algorithm

Post by rhecker »

I need to add a feature to search the MySQL database behind a website. We do not want to use Google Search or anything similar.

The problem is dealing with all the variations of a search term. For instance, keeping 'whatever' from appearing in a search for 'hate', while still allowing 'hates' and instances where punctuation follows the term.

I'm thinking that someone must have written a class to deal with these variations. I have looked at a bunch of classes, tutorials and scripts on the web, but so far I have not found what I'm looking for. Any ideas?
pickle
Briney Mod
Posts: 6445
Joined: Mon Jan 19, 2004 6:11 pm
Location: 53.01N x 112.48W

Re: search algorithm

Post by pickle »

If you set up a MySQL FULLTEXT search and sort by relevance, matches to "hate" and "hates" will have a much higher relevance than "whatever".
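
A minimal sketch of the relevance-ordered FULLTEXT approach, assuming a hypothetical `articles` table with a FULLTEXT index on `title` and `body` (the table, column, and function names are made up for illustration). One caveat worth checking against your MySQL version: FULLTEXT matches whole words rather than substrings, so a search for "hate" should not return "whatever" at all, and matching "hates" generally needs a boolean-mode wildcard such as hate*.

```php
<?php
// Assumption: table `articles` with FULLTEXT(title, body). Hypothetical names.

/**
 * Build a natural-language FULLTEXT query ordered by relevance.
 * The search text is bound twice (:q1 for the score column, :q2 for WHERE)
 * because PDO does not always allow reusing one named placeholder.
 */
function buildRelevanceQuery(float $minScore = 0.0): string
{
    $sql = 'SELECT id, title, '
         . 'MATCH(title, body) AGAINST(:q1) AS score '
         . 'FROM articles '
         . 'WHERE MATCH(title, body) AGAINST(:q2)';
    if ($minScore > 0.0) {
        // Optional cut-off that drops weak matches.
        $sql .= ' HAVING score >= :min';
    }
    return $sql . ' ORDER BY score DESC';
}

/** Run the query; $pdo must be an already connected MySQL handle. */
function searchArticles(PDO $pdo, string $text, float $minScore = 0.0): array
{
    $stmt = $pdo->prepare(buildRelevanceQuery($minScore));
    $params = [':q1' => $text, ':q2' => $text];
    if ($minScore > 0.0) {
        $params[':min'] = $minScore;
    }
    $stmt->execute($params);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

Calling `searchArticles($pdo, 'hate')` would return the best matches first. A useful value for the cut-off, if any, depends on the data and has to be found by experiment.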

You could also set up a Sphinx search server to handle all this for you. I've never used it, but it looks interesting. http://sphinxsearch.com/
Real programmers don't comment their code. If it was hard to write, it should be hard to understand.
rhecker
Forum Contributor
Posts: 178
Joined: Fri Jul 11, 2008 5:49 pm

Re: search algorithm

Post by rhecker »

Yes, thanks. I've already started working with FULLTEXT and it is better in a number of ways. I looked a little at Lucene and Sphinx, but they seem like too much for what I am after, and the learning curve would be much steeper.

So I am experimenting with FULLTEXT and so far so good, although it seems to have some limitations. It's not clear to me if I can somehow use LIKE_ so that "loves" will come up in a search for "love", that sort of thing.
califdon
Jack of Zircons
Posts: 4484
Joined: Thu Nov 09, 2006 8:30 pm
Location: California, USA

Re: search algorithm

Post by califdon »

There are all sorts of things you can do with LIKE, etc., but remember that natural language is complex. Do you want to return "gloves" when the search term is "love"? How about "beloved"? And then there's "loving", etc. As pickle said, if you order by relevance, then maybe filter for some minimum value of relevance, that may be the best you can do.
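
To make the "gloves" problem concrete, here is a small sketch that compares a plain substring test (roughly what LIKE '%love%' does) with a word-prefix test. Both function names are invented for this example, and the regular expression is only one possible definition of "starts a word".

```php
<?php
/** Substring test, roughly equivalent to LIKE '%term%'. */
function matchesSubstring(string $term, string $text): bool
{
    return stripos($text, $term) !== false;
}

/** Word-prefix test: the term must begin a word; more letters may follow. */
function matchesWordPrefix(string $term, string $text): bool
{
    $pattern = '/\b' . preg_quote($term, '/') . '\w*/i';
    return preg_match($pattern, $text) === 1;
}

// Print how each candidate word fares against the search term "love".
foreach (['love', 'loves', 'loving', 'gloves', 'beloved'] as $word) {
    printf(
        "%-8s substring=%s prefix=%s\n",
        $word,
        var_export(matchesSubstring('love', $word), true),
        var_export(matchesWordPrefix('love', $word), true)
    );
}
```

The substring test accepts "gloves" and "beloved", while the prefix test rejects them but still accepts "loves". Neither one catches "loving", which would need real stemming.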
rhecker
Forum Contributor
Posts: 178
Joined: Fri Jul 11, 2008 5:49 pm

Re: search algorithm

Post by rhecker »

The problem with using FULLTEXT, unless I am missing something (I hope I am), is that it is impossible to modify how the search terms will be processed.

For example:
search: peace and love
FULLTEXT will search for "peace" and "love" separately and ignore the "and" (which is fine).

But what if I want to search for the combination of PEACE and LOVE? I don't see a way to specify that.
rhecker
Forum Contributor
Posts: 178
Joined: Fri Jul 11, 2008 5:49 pm

Re: search algorithm

Post by rhecker »

If the search term has multiple words, then it seems like preg_replace can be used to send the right definition to the MySQL full-text query.

So if the search term is "happy day", I would need it to become +happy +day

but
$term=preg_replace(" ", " +", $var);
does not produce this result.
Can someone tell me what would?
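
One possible answer, as a sketch rather than the only way to do it: preg_replace expects a delimited pattern such as '/ /', and even with that fixed, replacing spaces alone would leave the first word without a "+". Splitting the input into words and prefixing each one avoids both problems. The function name `toBooleanTerms` is made up for this example, and the list of stripped operator characters is an assumption about what should not pass through from user input.

```php
<?php
/**
 * Turn free text into a boolean-mode expression in which every word
 * is required, e.g. "happy day" becomes "+happy +day".
 */
function toBooleanTerms(string $input): string
{
    // Replace characters that act as boolean-mode operators with spaces,
    // so user input cannot change the meaning of the query.
    $clean = preg_replace('/[+\-<>()~*"@]+/', ' ', $input);

    // Split on any run of whitespace and ignore empty pieces.
    $words = preg_split('/\s+/', trim($clean), -1, PREG_SPLIT_NO_EMPTY);

    // Prefix every word with "+" (the word must be present).
    $required = array_map(function ($word) {
        return '+' . $word;
    }, $words);

    return implode(' ', $required);
}

echo toBooleanTerms('happy day'), "\n"; // +happy +day
```

The result is meant to be bound as the parameter of AGAINST(... IN BOOLEAN MODE), not concatenated into the SQL string.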
pickle
Briney Mod
Posts: 6445
Joined: Mon Jan 19, 2004 6:11 pm
Location: 53.01N x 112.48W

Re: search algorithm

Post by pickle »

You can do that if you search in Boolean mode. Of course you will have to parse the search terms a bit, and you do lose automatic sorting by relevance, but that's not a big deal.

http://dev.mysql.com/doc/refman/5.1/en/ ... olean.html
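
A rough sketch of a boolean-mode search that still sorts by relevance, by selecting the MATCH score explicitly and ordering on it. It assumes a hypothetical `articles` table with a FULLTEXT index on `title` and `body`; all names are illustrative, and the scores computed in boolean mode may be coarser than in natural-language mode.

```php
<?php
// Assumption: table `articles` with FULLTEXT(title, body). Hypothetical names.

/** SQL for a boolean-mode search that orders rows by relevance explicitly. */
function buildBooleanQuery(): string
{
    return 'SELECT id, title, '
         . 'MATCH(title, body) AGAINST(:q1 IN BOOLEAN MODE) AS score '
         . 'FROM articles '
         . 'WHERE MATCH(title, body) AGAINST(:q2 IN BOOLEAN MODE) '
         . 'ORDER BY score DESC';
}

/** Require a word and allow any ending, so "hate" also finds "hates". */
function withTrailingWildcard(string $word): string
{
    return '+' . $word . '*';
}

/** Run the search; $pdo must be an already connected MySQL handle. */
function searchBoolean(PDO $pdo, string $expression): array
{
    $stmt = $pdo->prepare(buildBooleanQuery());
    // The expression is bound twice because it appears twice in the SQL.
    $stmt->execute([':q1' => $expression, ':q2' => $expression]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

For example, `searchBoolean($pdo, withTrailingWildcard('hate'))` would look for words beginning with "hate", which covers "hates" but not "whatever".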