Need to remove specific whole words only

SearchEngineNightmare
Forum Newbie
Posts: 2
Joined: Sun May 06, 2007 7:31 pm

Post by SearchEngineNightmare »

feyd | Please use the forum's code tags and [syntax="..."] tags where appropriate when posting code. Your post has been edited to reflect how we'd like it posted. Please read: Posting Code in the Forums (http://forums.devnetwork.net/viewtopic.php?t=21171) to learn how to do it too.


I have been working on this for 3 days and 3 nights.

I need to remove specific whole words from a string. The string will already have been converted to all lowercase by a previous function. The words that I need to remove are: the, and, are.

For instance, I need to remove "the" without modifying words that contain "the", such as "theater" or "there".

I have tried using arrays and str_replace, but that removes characters from the longer words.

So I am now attempting regular expressions, but they seem to have no effect at all. How would I do this with regular expressions? I don't think my pattern is correct.

Code: Select all

<?php
$string = "and hand band the there theater are area care";
echo $string;
echo "<br>";

// try using arrays and str_replace
$eraseData = array(
    'and' => "",
    'the' => "",
    'are' => ""
);
$newStr = str_replace(array_keys($eraseData), $eraseData, $string);

echo "after replace";
echo "<br>";
echo $newStr;
echo "<br>";

// try using regular expressions
$pattern = '#^and$#';
$replacement = '';
echo preg_replace($pattern, $replacement, $string);
echo "<br>";

$pattern = '#^the$#';
$replacement = '';
echo preg_replace($pattern, $replacement, $string);
echo "<br>";

$pattern = '#^are$#';
$replacement = '';
echo preg_replace($pattern, $replacement, $string);
echo "<br>";
?>

John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto
Contact:

Post by John Cartwright »

Code: Select all

preg_replace("#(?<=\s)(and|the)#i", '', $search);
Use a positive lookbehind; this snippet should get you started ;)
I'm no regex guru, so maybe there is a better way.
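The (?<=...) construct in the snippet above is a lookbehind, so it only checks what comes before the word; on its own it would still strip "the" out of " there". A minimal sketch of the same lookaround idea checked on both sides, assuming the input is lowercase ASCII as the original post says. The function name removeWordsLookaround is made up for illustration and is not from the thread:

```php
<?php
// Hypothetical helper, not from the thread: removes the listed words only
// when they are not touching another letter on either side.
function removeWordsLookaround(string $text): string
{
    // (?<![a-z]) : the previous character is not a letter (or there is none)
    // (?![a-z])  : the next character is not a letter (or there is none)
    $pattern = '#(?<![a-z])(?:and|the|are)(?![a-z])#';
    return preg_replace($pattern, '', $text);
}

$search = "and hand band the there theater are area care";
echo removeWordsLookaround($search);
```

Negative lookarounds are used here instead of (?<=\s) so that a word at the very start or end of the string is also matched. The spaces around each removed word are left in place.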
stereofrog
Forum Contributor
Posts: 386
Joined: Mon Dec 04, 2006 6:10 am

Post by stereofrog »

PCRE has a "word boundary" assertion, \b:

Code: Select all

'/\bthe\b/'
matches 'the' but not 'there'.
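
Applied to the three words from the original question, the word-boundary idea might be sketched as follows. The helper name removeWholeWords and the whitespace cleanup step are assumptions added for illustration; the thread itself only supplies the \b pattern. As a side note, the '#^and$#' style patterns in the question have no effect because ^ and $ anchor to the start and end of the whole string, so they only match when the string is exactly that one word.

```php
<?php
// Hypothetical helper, not from the thread: strips whole words using \b.
function removeWholeWords(string $text, array $words): string
{
    // Escape each word so any regex metacharacters are treated literally.
    $escaped = array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words);

    // Builds e.g. /\b(?:and|the|are)\b/ which matches only at word boundaries.
    $pattern = '/\b(?:' . implode('|', $escaped) . ')\b/';
    $result = preg_replace($pattern, '', $text);

    // Assumed cleanup: collapse the spaces left behind and trim the ends.
    return trim(preg_replace('/\s+/', ' ', $result));
}

$string = "and hand band the there theater are area care";
echo removeWholeWords($string, array('and', 'the', 'are'));
// prints "hand band there theater area care"
```

Note that \b treats digits and underscores as word characters too, so "the_end" would be left alone while "the-end" would lose its "the".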