Inserting ? into regex pattern

Posted: Sun Oct 14, 2007 6:28 am
by Stryks
I've just been having a look at some code in another post.
Kieran Huggins wrote:

Code:

$original = "It's a nice day in a nice way.";
$values = array('glorious','wonderful','marvelous','good','grand'); 

function r($needle,$replacement_needles,$haystack){
	$haystack = preg_replace('/\b'.$needle.'\b/i',$replacement_needles[array_rand($replacement_needles)],$haystack,1);
	if (strpos($haystack,$needle)) $haystack = r($needle,$replacement_needles,$haystack);
	return $haystack;
}

echo r('nice',$values,$original);
It seems to work like a champ, but when I replaced 'nice' with '?' in $original and in the call to r(), it threw an error. I figure that's because ? has its own meaning in regex. So I thought I'd call it with preg_quote() ...

Code:

echo r(preg_quote('?'),$values,$original);
Still no go. Pretty basic, I know, but it's better to ask than to go on not knowing.

How can I process ? for insertion into a regex pattern?

Thanks

Posted: Sun Oct 14, 2007 6:54 am
by feyd
Unless there's good reason not to, I'd likely shift the preg_quote() call into the function where $needle is used in the pattern (but nowhere else.) I'd also add the delimiter parameter.

Posted: Sun Oct 14, 2007 7:04 am
by Stryks
feyd - Yeah, I had thought the same about the function, I guess I just wanted to know if it would escape it properly. It doesn't seem to.

As for the delimiter character ... I'm really not sure what to put in there. I tried it with '/' (the manual said it was the most common???) but still it just returns the original string.

The final pattern comes out as
/\b\?\b/i

Posted: Sun Oct 14, 2007 2:27 pm
by superdezign
Well, '/' is the delimiter. That pattern looks valid to me.

Like feyd said, put preg_quote() into the original function. As for your question mark, you should just escape it manually. preg_quote() doesn't destroy your regex, it only makes it so that it's compatible with any delimiter without having to know what it is beforehand. I think it escapes a few other things, but I've forgotten what they are.
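
For reference, here is a quick sketch of what preg_quote() does, as far as the PHP manual describes it: it backslash-escapes the characters that are special in a regex (the question mark included), and the optional second argument adds your delimiter to that list. The sample strings are made up for illustration.

```php
<?php
// preg_quote() escapes regex metacharacters such as . \ + * ? [ ^ ] $ ( ) { } = ! < > | : -
var_dump(preg_quote('?'));              // string(2) "\?"

// '/' is not special by itself, so it is only escaped when passed as the delimiter.
var_dump(preg_quote('a/b?'));           // string(5) "a/b\?"
var_dump(preg_quote('a/b?', '/'));      // string(6) "a\/b\?"

// Quoting twice escapes the backslash too, so quote exactly once,
// right where the needle goes into the pattern.
var_dump(preg_quote(preg_quote('?')));  // string(4) "\\\?"
```

So the escaping itself is fine for '?'. If the pattern still doesn't match, the cause is somewhere else in the pattern.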

Posted: Sun Oct 14, 2007 5:32 pm
by Stryks
Thanks for the reply. I'm still not really getting it though.

Code:

<?php

$original = "It's a ? day in a ? way.";
$values = array('glorious','wonderful','marvelous','good','grand');

function r($needle,$replacement_needles,$haystack){
	$haystack = preg_replace('/\b'. preg_quote($needle, '/') .'\b/i', $replacement_needles[array_rand($replacement_needles)], $haystack, 1);
	if (strpos($haystack, $needle)) $haystack = r($needle, $replacement_needles, $haystack);
	return $haystack;
}


echo r('?', $values, $original);

?>
It's an endless loop.

I can escape the ? manually, but it just returns ...
It's a ? day in a ? way.
:?

Posted: Sun Oct 14, 2007 6:24 pm
by Kieran Huggins
lol - I just knew that would come back and bite me in the smurf!

It has to do with how regex defines "word boundary", or more to the point: how it defines a "word". A single question mark is, well, questionable 8)
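
To make that concrete, here is a small sketch using the strings from this thread. \b only matches between a word character (letter, digit or underscore) and a non-word character. In " ? " the question mark and both spaces are non-word characters, so neither \b can match. preg_replace() then changes nothing while strpos() keeps finding the '?', which would explain the endless loop.

```php
<?php
$haystack = "It's a ? day in a ? way.";

// \b needs a word character on exactly one side; ' ', '?' and ' ' are all non-word.
var_dump(preg_match('/\b\?\b/', $haystack));             // int(0)

// 'nice' is made of word characters, so the boundaries around it exist.
var_dump(preg_match('/\bnice\b/', "It's a nice day."));  // int(1)

// A '?' squeezed between word characters does have a boundary on each side.
var_dump(preg_match('/\b\?\b/', 'a?b'));                 // int(1)

// Nothing gets replaced, yet the needle is still found at a truthy offset,
// so the recursive call in r() repeats forever.
var_dump(strpos($haystack, '?'));                        // int(7)
```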

Working version:

Code:

$original = "It's a ? day in a ? way.";
$values = array('glorious','wonderful','marvelous','good','grand'); 

function r($needle,$replacement_needles,$haystack){
	$haystack = preg_replace('#([\s\b])('.preg_quote($needle,'#').')([\s\b])#i','$1'.$replacement_needles[array_rand($replacement_needles)].'$3',$haystack,1);
	if (stripos($haystack,$needle)) $haystack = r($needle,$replacement_needles,$haystack);
	return $haystack;
}

echo r('?',$values,$original);
If there are any other characters you want to count as a boundary, just add them to the [\s\b] character classes.
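
One caveat, offered as a hedged sketch rather than a correction of the code above: inside a character class PCRE reads \b as a backspace character, not a word boundary, so [\s\b] effectively means "whitespace". A needle at the very start or end of the string therefore won't match. The variant below is one possible alternative, assuming "boundary" should mean "not glued to a word character" and assuming PHP 5.3 or later for the anonymous function. The name replace_each() is made up for this example.

```php
<?php
// Hypothetical helper: replace every standalone occurrence of $needle with a
// randomly chosen entry from $replacements.
function replace_each($needle, array $replacements, $haystack) {
    // (?<!\w) and (?!\w) assert "no word character next to the needle", which
    // also holds at the start and end of the string.
    $pattern = '#(?<!\w)' . preg_quote($needle, '#') . '(?!\w)#i';

    // The callback runs once per match, so each occurrence gets its own pick
    // and the replacement text is never parsed for $1-style references.
    return preg_replace_callback($pattern, function ($match) use ($replacements) {
        return $replacements[array_rand($replacements)];
    }, $haystack);
}

$values = array('glorious', 'wonderful', 'marvelous', 'good', 'grand');
echo replace_each('?', $values, "? day in a ? way, is it ?"), "\n";
echo replace_each('nice', $values, "It's a nice day in a nice way."), "\n";
```

Since there is no recursion, a needle that can't be matched simply leaves the string untouched instead of looping. Note that a '?' attached to a word, as in "what?", is left alone under this definition of a boundary.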