Need to remove specific whole words only

Posted: Sun May 06, 2007 8:29 pm
by SearchEngineNightmare
feyd | Please use the forum's code tags and [syntax="..."] tags where appropriate when posting code. Your post has been edited to reflect how we'd like it posted. Please read: [url=http://forums.devnetwork.net/viewtopic.php?t=21171]Posting Code in the Forums[/url] to learn how to do it too.


I have been working on this for 3 days and 3 nights.

I need to remove specific whole words from a string. The string will already have been converted to all lowercase by a previous function. The words that I need to remove are: the, and, are.

For instance, I need to remove "the" but not modify words that contain "the", such as "theater" or "there".

I have tried using arrays and str_replace, but it removes characters from the longer words.

So I am now attempting regular expressions, but they seem to have no effect at all. How would I do this with regular expressions? I don't think the pattern is correct.

Code: Select all

<?php
$string = "and hand band the there theater are area care";
echo $string;
echo "<br>";

// try using arrays and str_replace
$eraseData = array(
        'and' => "",
        'the' => "",
        'are' => ""
        );
$newStr = str_replace(array_keys($eraseData), $eraseData, $string);

echo "after replace";
echo "<br>";
echo $newStr;
echo "<br>";

// try using regular expressions
$pattern = '#^and$#';
$replacement = '';
echo preg_replace($pattern, $replacement, $string);
echo "<br>";

$pattern = '#^the$#';
$replacement = '';
echo preg_replace($pattern, $replacement, $string);
echo "<br>";

$pattern = '#^are$#';
$replacement = '';
echo preg_replace($pattern, $replacement, $string);
echo "<br>";
?>

Posted: Sun May 06, 2007 8:43 pm
by John Cartwright

Code: Select all

preg_replace('#(?<=\s)(and|the)#i', '', $search);
Use a positive lookbehind; this snippet should get you started ;)
I'm no regex guru, so maybe there is a better way.
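
To make the lookaround idea concrete, here is a minimal sketch. It assumes the words in the string are separated by whitespace. The function name removeWordsWithLookaround is hypothetical, and the (?<!\S) and (?!\S) assertions are a substitution for the plain whitespace check, chosen so that words at the very start or end of the string are also caught and so that both sides of the word are tested.

```php
<?php
// Hypothetical helper, not from the thread: removes the listed whole words
// using lookaround assertions instead of a plain substring replace.
function removeWordsWithLookaround(string $text, array $words): string
{
    // Escape each word so it is treated literally inside the pattern.
    $alternatives = implode('|', array_map(function ($word) {
        return preg_quote($word, '#');
    }, $words));

    // (?<!\S) : not preceded by a non-space character (start of string or whitespace)
    // (?!\S)  : not followed by a non-space character (end of string or whitespace)
    $pattern = '#(?<!\S)(?:' . $alternatives . ')(?!\S)#i';

    $stripped = preg_replace($pattern, '', $text);

    // Collapse the gaps left behind by the removed words.
    return trim(preg_replace('#\s+#', ' ', $stripped));
}

echo removeWordsWithLookaround(
    'and hand band the there theater are area care',
    ['and', 'the', 'are']
);
// prints: hand band there theater area care
```

Because the check is for surrounding whitespace, a word followed by punctuation (for example "the,") would be left alone by this sketch.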

Posted: Mon May 07, 2007 3:34 am
by stereofrog
There's a "word boundary" assertion (\b) in PCRE:

Code: Select all

'/\bthe\b/'
matches 'the' but not 'there'.
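
Applied to the original question, the word boundary approach might look like the sketch below. The function name removeWholeWords is hypothetical, and the extra whitespace cleanup at the end is an assumption about the desired output rather than something the thread asked for.

```php
<?php
// Hypothetical helper, not from the thread: strips whole words using \b.
function removeWholeWords(string $text, array $words): string
{
    // Escape each word, then join them into one alternation: and|the|are
    $alternatives = implode('|', array_map(function ($word) {
        return preg_quote($word, '/');
    }, $words));

    // \b asserts a position between a word character and a non-word
    // character (or the start/end of the string), so "the" matches
    // but the "the" inside "there" does not.
    $stripped = preg_replace('/\b(?:' . $alternatives . ')\b/', '', $text);

    // Collapse the double spaces left where words were removed.
    return trim(preg_replace('/\s+/', ' ', $stripped));
}

$string = 'and hand band the there theater are area care';
echo removeWholeWords($string, ['and', 'the', 'are']);
// prints: hand band there theater area care
```

Note that \b treats hyphens and apostrophes as non-word characters, so "the" in "the-end" would also be removed. If that matters, a stricter check on the surrounding characters is needed.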