How to make preg_replace() more greedy?

Posted: Fri Oct 02, 2009 12:58 pm
by jeff00seattle
Hi

I am using the PHP Regex function preg_replace(), and my problem is that it is not greedy enough.

Code:

 
$extensions_array = array( 'htm', 'html', 'php', 'asp', 'aspx' );
/* begin Regex Code */
$extensions_str = trim(implode("|", $extensions_array)); // "htm|html|php|..."
$extensions_pattern = "/\.(".$extensions_str.")/i"; // \. matches a literal dot
$value = "/steve.html";
$result = preg_replace($extensions_pattern, "", $value);
/* end Regex Code */
 
The result is "/stevel", with an extra "l", instead of the desired "/steve". This is because the alternation tries its alternatives from left to right and stops at the first one that matches, here "htm", so "html" is never tried.

To make this work, I had to sort the strings in $extensions_array by length, longest first:

Code:

$extensions_array = array( 'htm', 'html', 'php', 'asp', 'aspx' );
/* Reverse order by length using functor */
uasort($extensions_array, "string_reverse_sort_by_length_functor"); // new order array( 'html', 'aspx', 'htm', 'php', 'asp' );
/*... same Regex Code  as above ... */
 
/* Reverse order functor */
function string_reverse_sort_by_length_functor($val_1, $val_2)
{
  $retVal = 0;
  $firstVal = strlen($val_1);
  $secondVal = strlen($val_2);
  if($firstVal < $secondVal) 
  { $retVal = 1; } 
  else if($firstVal > $secondVal) 
  { $retVal = -1; }
  return $retVal;
}
The result is now correct: $result == "/steve".
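
For comparison, the longest-first sort can be written more compactly. This is only a sketch, assuming PHP 5.3 or later for the anonymous function; it uses usort() rather than uasort() because the array keys are not needed once the values are imploded. The order among extensions of equal length is not something to rely on.

```php
<?php
$extensions_array = array('htm', 'html', 'php', 'asp', 'aspx');

// Longest first: a positive return value sorts $a after $b,
// so shorter strings move toward the end.
usort($extensions_array, function ($a, $b) {
    return strlen($b) - strlen($a);
});

// The four-letter extensions now precede the three-letter ones.
echo implode('|', $extensions_array), PHP_EOL;
```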

Could I have done this differently? Is there another modifier that I could have added to $extensions_pattern, in addition to the case-insensitive i, to make preg_replace() more greedy?

Thanks

Jeff in Seattle

Re: How to make preg_replace() more greedy?

Posted: Fri Oct 02, 2009 1:23 pm
by mybikeisgreen
Put html before htm. It will match that first.
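
A small illustration of why the order matters, using the value from the first post. The two patterns are cut down to just the two extensions in question and are assumptions for the sake of the example, not the original pattern.

```php
<?php
$value = "/steve.html";

// 'htm' is listed first: it matches, so 'html' is never tried.
$short_first = preg_replace('/\.(htm|html)/i', '', $value);

// 'html' is listed first: the longer alternative matches instead.
$long_first = preg_replace('/\.(html|htm)/i', '', $value);

echo $short_first, PHP_EOL; // /stevel
echo $long_first, PHP_EOL;  // /steve
```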

Re: How to make preg_replace() more greedy?

Posted: Fri Oct 02, 2009 2:44 pm
by jeff00seattle
I did that, as shown in the second code example, and it works.

I get the extensions array from an external source, so the entries arrive in no particular order by length. To fix that, I use uasort() to sort the extensions by length, longest first.

What I was wondering is whether there is a greedy setting I could use within preg_replace() so that I can avoid having to call uasort().

Thanks for the reply.

Re: How to make preg_replace() more greedy?

Posted: Fri Oct 02, 2009 2:57 pm
by jeff00seattle
I just discovered that what I really need is an exact match, because my Regex pattern still lets false positive matches through.

For example, suppose my Regex pattern is "/\.(htm|html|php|ftp)/i" and I pass a value with some (goofy) extension through preg_replace().

If the value is /steve.phpp, the pattern matches ".php" and the result is /stevep, with an undesired extra p.
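
One possible way to get an exact match is to anchor the pattern at the end of the string with $. The sketch below assumes the extension is always the last thing in the value; strip_extension() is a hypothetical helper name, not something from the posts above. With the anchor in place, a short alternative such as "htm" can no longer succeed on "html", so the regex engine backtracks to the longer one and the order of the extensions stops mattering.

```php
<?php
// Hypothetical helper: removes a trailing extension only when it is
// exactly one of the listed extensions.
function strip_extension($value, array $extensions)
{
    // Escape each extension so any regex metacharacters are taken literally.
    $quoted = array_map(function ($ext) {
        return preg_quote($ext, '/');
    }, $extensions);

    // \. is a literal dot; $ anchors the match at the end of the string.
    $pattern = '/\.(' . implode('|', $quoted) . ')$/i';

    return preg_replace($pattern, '', $value);
}

// Deliberately left unsorted.
$extensions = array('htm', 'html', 'php', 'asp', 'aspx');

echo strip_extension('/steve.html', $extensions), PHP_EOL; // /steve
echo strip_extension('/steve.phpp', $extensions), PHP_EOL; // /steve.phpp
echo strip_extension('/steve.ASPX', $extensions), PHP_EOL; // /steve
```

If the extension can be followed by something else, such as a query string, the $ anchor would need to be replaced by whatever is allowed to follow it.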