Any RegExp gurus able to help a guy out?

Posted: Fri Oct 11, 2002 7:01 am
by neh
Hi

I am implementing a boolean search capability for a search engine I have written, and I figure on using preg_match_all to pull out the various arguments.

So far I have the boolean AND terms working with:

/\+{1}\s{0,}\w+\s*/

and boolean NOT terms working with:

/\-{1}\s{0,}\w+\s*/

but I am at a loss (after many hours scrabbling about trying to learn enough Perl regular expression syntax to do it :( ) as to how to then match the boolean OR terms (basically just search strings NOT preceded by a + or - and 0 or more spaces).

I've gone through several tries but none have been successful.

If you know your regexps and it's a no-brainer, I'd really appreciate a tip.


many thanks
Ant :)

Posted: Fri Oct 11, 2002 10:05 am
by ReDucTor
Do you mean something like this?

if (substr($str, 0, 2) == "+ ") {
	// Positive
} elseif (substr($str, 0, 2) == "- ") {
	// Negative
}

Posted: Fri Oct 11, 2002 10:18 am
by neh
No, I mean something like this:

Code:

function regBoolExp($raw, $comp)
		{
			//parse the search strings to create a boolean set of regular expressions for the mysql provider
			//this is pretty nasty as it happens in the class .. I will have to put it in the provider
			
			if (strlen($raw) == 0)
			{
				return " ";
			}
			
			$words = explode(" ", trim($raw));
			$index = 0;
			$reg = "";
			
			while ($index < sizeof($words)):
				$temp = $words[$index];
				if (trim($temp) == "")
				{
					$index++;
					continue;	
				}
				
				if ($temp == "+")
				{
					$temp = $words[++$index];
					$reg .= (strlen($reg) != 0 ? " AND " : " ")."($comp REGEXP '\[\[:<:\]\]$temp\[\[:>:\]\]')";
					
				}
				elseif ($temp == "-")
				{
					$temp = $words[++$index];
					$reg .= (strlen($reg) != 0 ? " AND " : " ")."($comp NOT REGEXP '\[\[:<:\]\]$temp\[\[:>:\]\]')";
					
				}
				elseif (substr($temp, 0, 1) == "+")
				{
					$temp = substr($temp, 1, strlen($temp) - 1);
					$reg .= (strlen($reg) != 0 ? " AND " : " ")."($comp REGEXP '\[\[:<:\]\]$temp\[\[:>:\]\]')";
				}
				elseif (substr($temp, 0, 1) == "-")
				{
					$temp = substr($temp, 1, strlen($temp) - 1);
					$reg .= (strlen($reg) != 0 ? " AND " : " ")."($comp NOT REGEXP '\[\[:<:\]\]$temp\[\[:>:\]\]')";
				}
				else
				{ 
					$reg .= (strlen($reg) != 0 ? " OR " : " ")."($comp REGEXP '\[\[:<:\]\]$temp\[\[:>:\]\]')";
				}
				
				$index++;
			endwhile;
			
			$reg = "AND(".$reg.")";
			return $reg;
    }

As you can see, I have sorted it out now anyway. It appears regexps aren't capable of what I wanted, so I've had to do a clumsy manual parse (taking the bugs in PHP4's explode into account .. #sighs#) and then string all the single regexps together afterwards into one big boolean expression.


;)

but cheers for asking :):)

Posted: Fri Oct 11, 2002 12:05 pm
by volka
Try this:

Code:

<html> 
<body><pre>
<?php 
function parse($query)
{
	$pattern = '/([+-]?)\s*(\w+)\s*/';
	preg_match_all($pattern, $query, $matches);
	return $matches;
}

print_r(parse('simplesearch'));
print_r(parse('+positivepattern'));
print_r(parse('-negativepattern'));
print_r(parse('and +now all -together'));
?></pre></body></html>
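
To sort those matches into AND / NOT / OR lists, something along these lines might work. It's only a sketch: the function name classifyTerms and the array keys are made up for the example, and the trailing \s* is left off the pattern since it isn't needed for the grouping.

```php
<?php
// Sketch: split a query into AND / NOT / OR term lists using the
// sign-capturing pattern. classifyTerms and the array keys are
// hypothetical names chosen for this example.
function classifyTerms($query)
{
	$terms = array('and' => array(), 'not' => array(), 'or' => array());

	// PREG_SET_ORDER returns one array per match:
	// [0] = full match, [1] = optional sign, [2] = the word
	preg_match_all('/([+-]?)\s*(\w+)/', $query, $matches, PREG_SET_ORDER);

	foreach ($matches as $m) {
		if ($m[1] === '+') {
			$terms['and'][] = $m[2];
		} elseif ($m[1] === '-') {
			$terms['not'][] = $m[2];
		} else {
			$terms['or'][] = $m[2]; // no sign captured: plain OR term
		}
	}
	return $terms;
}

print_r(classifyTerms('and +now all - together'));
```

For 'and +now all - together' this puts now under and, together under not, and the two unsigned words under or. One caveat: a hyphenated word such as foo-bar would be split into foo plus a negated bar, so it may need extra handling.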