Page 1 of 1

search engine with mysql

Posted: Tue Dec 14, 2004 3:33 pm
by josh
How can I search through a mysql database and check more then one field for the searched text, but order it by which field it was found in, and how many times...
For instance if I have 3 fields:
title, filename, and html
I want terms with the query in the title and filename and html to ouput first, then the terms with title and filename, then filename and html, then filename, then html.. but I dont want the same record outputted twice.
Also I am searching multiple tables useing a union.. I dont think this will matter but the results will also be page-inated

here is my current code (tested, and is currently in use on the website)
right now it only searches the titles... where the title includes all the terms entered, how can I also make it so it will search where like '%termone%' and like '%termtwo%' and then do it with or's that way a field containing all terms will be outputted first then a field with only some of the terms.

Is this even possible with mysql?

Code: Select all

<?php
function makelike() {
		$q=secure(urldecode($_REQUEST['query']));
		$q=explode(" ", $q);
		$like=NULL;
		$count=0;
		foreach($q as $term) {
			$count++;
			$like .="LIKE '%" . secure($term) . "%'";
			if ($count!=count($q)) {
				$like .= " and `title` ";
			}
		}
		return($like);
	}
	function make ($action, $system) {
		if ($action=='count') {
			$query="SELECT count(*) FROM $system WHERE `title` " . makelike() . " ; ";
		} else {
			$query="($action `title`, `filename`, `sys` FROM `" . $system . "` WHERE `title` " . makelike() . " ORDER BY `title`)";
		}
		return ($query);
	}
	function build($action) {
		$query=(make($action, 'ps2') . " UNION " . make($action, 'xbox') . " UNION " . make($action, 'pc'));
		return($query);
	}
	
	$query=make('count', 'ps2');
	$result = dbq($query);
	$found = $result[0];
	
	$query=make('count', 'xbox');
	$result = dbq($query);
	$found = $found + $result[0];
	
	$query=make('count', 'pc');
	$result = dbq($query);
	$found = $found + $result[0];
	
	
	$start = secure((int)$_REQUEST['start']);
	$perpage = 20;//change this to whatever number you like.
	
	$query=build('select') . "LIMIT $start, " . secure($perpage) . " ; ";
	$result=mysql_query($query) or die(mysql_error());
	echo '<table width=100% border=1 bordercolor=black>';
	$query='''' . htmlentities($_REQUEST['query']) . '''';
	echo '<tr><td colspan=2 align=center><b>Onlycheaters search results for ' . $query . '</b><BR>(Found <i>' . $found . '</i> results)</td></tr>';
	while (list($title, $filename, $sys) = mysql_fetch_array($result)) {
		echo ("<tr><td><a href='/$sys/$filename'>$title</a></td><td>$sys</td>");
	}
	echo '</table>';
	echo nextpage($perpage,$start,$_SERVER['PHP_SELF'], "", $found) . "</center>";
?>

I don't know even how to go about this... If I can get mysql to never return an item it has already returned by like putting distinct() around the while thing that would be a start, but I would have to get it to order the results in the correct way, giveing certaint fields more priority over others and also the issue with like '' OR like '' as to like '' AND like '' (give items with both terms a higher priority)

Posted: Tue Dec 14, 2004 6:02 pm
by pickle
Check here:
http://dev.mysql.com/doc/mysql/en/Fulltext_Search.html

Full text searching will allow you to just search a table (with some setup), and have the results returned in order of relevancy. Check the documentation for more in-depth details. I've personally used it before and it makes my life and coding MUCH easier.

You could set up multiple full text indices per table - 1 for the title, 1 for the filename, 1 for the contents, and check each result set accordingly.

Posted: Tue Dec 14, 2004 10:02 pm
by josh
Thank you works great, doesn't fullfill 100% of what I had hoped for but it works great.
One question, in the mysql manual it says you can search multiple indexes

Code: Select all

MATCH (title,html) AGAINST ('text')
This returns an error saying my index isnt a fulltext, which both are, and I tested it by searching for title and that worked then i searched for html and that also worked, MATCH (title,title) also worked.. any ideas on that?

MATCH(title) will do but I was hopeing to search the `html` field too. Also when I get it to search both fields is there a way to tell SQL to give the `title` field more importance then the html field?

Thank you again for all your help.

Posted: Wed Dec 15, 2004 9:50 am
by pickle
I think the manual also says that the index described in the MATCH section, needs to match the index described in the AGAINST section. If you want to search the HTML field too, either make a more comprehensive index to include the HTML field, or make another index entirely and do another search on that index.

Posted: Wed Dec 15, 2004 4:57 pm
by josh
Ok, If it requires doing 2 seperate searches then it is not effiecient enough for my needs. Searching the title would do fine.
How would I put a link:
did you mean xxxxxxxxx
when xxxxxx is the correct spelling of yyyyyyyyy
Would I just match each of their words against my entire database and order it by the score mysql gives it and if the score is less than an exact match it suggests the highest match?

Posted: Wed Dec 15, 2004 5:58 pm
by pickle
It wouldn't REQUIRE an extra search, it's just one possible approach. I would suggest, as you've probably realized and done, just modifying your index to include that field.

I'm not sure what you mean by puting a link? Are you just trying to link to the page? If so, then ya - just list out the URL's in the order they're returned from MySQL.

Posted: Thu Dec 16, 2004 8:48 am
by josh
No, I already know how to make a link with the search results, what I am trying to do is,
Go to http://www.google.com/search?hl=en&q=this+is+mispelt
It says Did you mean: this is misspelled
How would I do that, match each word entered against all fields in the database and then if any score is less then an exact match, pull out the highest match for that word and suggest it as the correct spelling?

Posted: Thu Dec 16, 2004 9:49 am
by pickle
Hmm, I'm not sure if this functionality does that. I think the items are returned in the order of relevance - which is a value you can retrieve. I don't know if you can tell if the spelling is 100% accurate based on relevancy. It's not hard to just do a spell check before submitting the query, if that's what your're looking for.

Are you trying to do something like Google? If so you're going to need much more computational power. I think how Google does it is they check their database for similarly spelled search items that have a higher return rate. You'll have to cache all that data if that's what you're thinking.

Posted: Thu Dec 16, 2004 6:35 pm
by josh
No, by no means am I going to try to even come close to searching over 8 billion documents like google is. I just have 13,000 records in a database and I have a file on my website that retreives them and puts the field `html` into a template, so I kind of have a 13,000 page website except not physically 13,000 pages, and I thought it would be nice to give my users a search engine so they won't have to spend 12 minutes looking for the "page" (record in the database) that they want to view.
I also log the searches that way I can put at the bottom of my results page:

people that searched for ____ also searches for ______

Now it will have incorrectly spelt terms in there.

How do I do a spellcheck, some of the things they would be searching for aren't in a dictionary. They might be searching for acronyms and things like that.

Posted: Thu Dec 16, 2004 6:56 pm
by magicrobotmonkey
you could build your own dictionary, then do the following
serch for the input
if no rows
check input against dict and return most similar word
else
return rows