Regex help for removing common words except those in quotes

sumeshkp
Forum Newbie
Posts: 2
Joined: Mon Nov 11, 2013 4:55 am

Regex help for removing common words except those in quotes

Post by sumeshkp »

I've got the regular expression below, which removes common whole words ($commonWords) from a string ($input), and I would like to tweak it so that:

1. it ignores words (including multi-word phrases) inside double or single quotes (like an exact-phrase search in the Google search box)
2. it removes words starting with a hyphen ('-'), but not those inside double or single quotes (like a negative search in the Google search box)

return preg_replace('/\b('.implode('|',$commonWords).')\b/i','',$input);

thanks
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Regex help for removing common words except those in quo

Post by Christopher »

You could use a regexp to first separate the text inside and outside of quotes. Then parse those with different rules.
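
As a rough sketch of that split-then-filter idea (not a tested solution from this thread), something like the following could work. The function name removeCommonWords() is made up for illustration, and it assumes quotes are never nested or escaped. An apostrophe inside a word, as in "it's", can be misread as an opening single quote.

```php
<?php
// Hypothetical helper: the name and signature are illustrative only.
function removeCommonWords(string $input, array $commonWords): string
{
    // Split on quoted phrases. Because the whole pattern is one capture group,
    // PREG_SPLIT_DELIM_CAPTURE makes the result alternate:
    // even indexes = text outside quotes, odd indexes = quoted phrases.
    $chunks = preg_split(
        '/("[^"]*"|\'[^\']*\')/',
        $input,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    // Escape each word so regex metacharacters in the list match literally.
    $escaped = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $commonWords);
    $commonPattern = '/\b(?:' . implode('|', $escaped) . ')\b/i';

    $out = '';
    foreach ($chunks as $i => $chunk) {
        if ($i % 2 === 0) {
            // Outside quotes: drop "-word" terms first, then common words.
            $chunk = preg_replace('/(?<!\S)-\S+/', '', $chunk);
            $chunk = preg_replace($commonPattern, '', $chunk);
        }
        // Quoted phrases (odd indexes) are appended untouched.
        $out .= $chunk;
    }

    // Collapse the whitespace runs left behind by the removals.
    // Note: this also normalises repeated spaces inside quoted phrases.
    return trim(preg_replace('/\s+/', ' ', $out));
}

echo removeCommonWords(
    'the quick "the fox" -lazy \'-dog and\' and cat',
    ['the', 'and']
);
```

With the sample input, the quoted phrases keep their common words and hyphenated terms, while everything outside the quotes is filtered.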

I think you might be better off parsing the string word by word (or character by character) and maintaining state.
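
A minimal sketch of the character-by-character approach might look like this. The name filterSearchTerms() and the choice to return an array of terms are assumptions for illustration. A quote only opens a phrase when it starts a new token, so "don't" stays one word, and an unclosed quote keeps the rest of the input as is.

```php
<?php
// Hypothetical helper: the name and return format are illustrative only.
function filterSearchTerms(string $input, array $commonWords): array
{
    // Lower-cased lookup table for case-insensitive common-word checks.
    $common = array_flip(array_map('strtolower', $commonWords));
    $terms  = [];
    $buffer = '';
    $quote  = null; // the open quote character, or null when outside quotes

    // Finish the current unquoted word, keeping it unless it is
    // a common word or a "-negative" term.
    $flush = function () use (&$buffer, &$terms, $common) {
        if ($buffer === '') {
            return;
        }
        $isNegative = $buffer[0] === '-';
        $isCommon   = isset($common[strtolower($buffer)]);
        if (!$isNegative && !$isCommon) {
            $terms[] = $buffer;
        }
        $buffer = '';
    };

    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        $ch = $input[$i];
        if ($quote !== null) {
            if ($ch === $quote) {
                // Closing quote: keep the whole phrase untouched.
                $terms[] = $quote . $buffer . $quote;
                $buffer  = '';
                $quote   = null;
            } else {
                $buffer .= $ch;
            }
        } elseif (($ch === '"' || $ch === "'") && $buffer === '') {
            // A quote at the start of a token opens a phrase.
            $quote = $ch;
        } elseif (ctype_space($ch)) {
            $flush();
        } else {
            $buffer .= $ch;
        }
    }

    if ($quote !== null) {
        // Unclosed quote: keep the remainder verbatim.
        $terms[] = $quote . $buffer;
    } else {
        $flush();
    }

    return $terms;
}

print_r(filterSearchTerms(
    'the quick "the fox" -lazy \'-dog and\' and cat',
    ['the', 'and']
));
```

Compared with a single regex, the state lives in $quote and $buffer, which makes it easier to add further rules later, such as escaped quotes.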
sumeshkp
Forum Newbie
Posts: 2
Joined: Mon Nov 11, 2013 4:55 am

Re: Regex help for removing common words except those in quo

Post by sumeshkp »

Makes sense. Thanks