Regex help for removing common words except those in quotes

Posted: Mon Nov 11, 2013 4:59 am
by sumeshkp
I have the regular expression below, which removes common whole words ($commonWords) from a string ($input), and I would like to tweak it so that:

1. it ignores words (possibly several) inside double or single quotes (like an exact-phrase search in Google)
2. it removes words starting with a hyphen ('-'), but not those inside double or single quotes (like a negative search in Google)

return preg_replace('/\b('.implode('|',$commonWords).')\b/i','',$input);

thanks

Re: Regex help for removing common words except those in quo

Posted: Mon Nov 11, 2013 9:39 pm
by Christopher
You could use a regexp to first separate the text inside and outside of quotes. Then parse those with different rules.
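
A minimal sketch of that split-then-filter idea, not a definitive implementation. The function name removeCommonWords is hypothetical, and it assumes quotes are balanced and that an apostrophe never appears inside an unquoted word (a word like don't would be read as opening a single-quoted span). Each list entry is passed through preg_quote() so regex metacharacters in a common word are taken literally.

```php
<?php
// Hypothetical helper: the name and the whitespace handling are assumptions.
function removeCommonWords(string $input, array $commonWords): string
{
    // Escape each word so characters such as "." or "+" are matched literally.
    $escaped = array_map(function ($w) { return preg_quote($w, '/'); }, $commonWords);
    $common = '/\b(?:' . implode('|', $escaped) . ')\b/i';

    // Split on quoted spans; the capture group keeps those spans in the result.
    $parts = preg_split('/("[^"]*"|\'[^\']*\')/', $input, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        // With a single capture group, odd indexes hold the quoted spans: leave them alone.
        if ($i % 2 === 1) {
            continue;
        }
        // Rule 2: drop words that start with a hyphen (done first so "-the" goes as a whole).
        $part = preg_replace('/(?<!\S)-\S+/', '', $part);
        // Remove common words, which now only happens outside quotes.
        $part = preg_replace($common, '', $part);
        // Collapse the gaps left behind, again only outside quotes.
        $parts[$i] = preg_replace('/\s+/', ' ', $part);
    }

    return trim(implode('', $parts));
}

echo removeCommonWords('the quick -brown fox "the lazy -dog" of', ['the', 'of']);
// prints: quick fox "the lazy -dog"
```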

I think you might be better off parsing the string word by word (or character by character) and maintaining state.
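
The stateful approach could look roughly like the sketch below, which walks the string one character at a time and tracks whether it is inside a quote. The function name filterSearchTerms and the choice to return an array of terms are assumptions made for illustration. It treats a quote as an opening quote only at the start of a word, so an apostrophe inside a word stays part of that word; an unterminated quote is kept as-is. Matching is byte-based, so case folding of the common words is ASCII only.

```php
<?php
// Hypothetical helper: name, return type, and edge-case handling are assumptions.
function filterSearchTerms(string $input, array $commonWords): array
{
    $common = array_map('strtolower', $commonWords);
    $terms  = [];
    $buffer = '';
    $quote  = null; // the open quote character, or null when outside quotes

    // Emit the buffered unquoted word unless it is a common word or starts with "-".
    $flush = function () use (&$terms, &$buffer, $common) {
        if ($buffer !== '' && $buffer[0] !== '-'
            && !in_array(strtolower($buffer), $common, true)) {
            $terms[] = $buffer;
        }
        $buffer = '';
    };

    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        $ch = $input[$i];
        if ($quote !== null) {
            if ($ch === $quote) {
                // Closing quote: keep the whole phrase untouched.
                $terms[] = $quote . $buffer . $quote;
                $buffer = '';
                $quote = null;
            } else {
                $buffer .= $ch;
            }
        } elseif (($ch === '"' || $ch === "'") && $buffer === '') {
            // A quote at the start of a word opens a quoted phrase.
            $quote = $ch;
        } elseif (strpos(" \t\r\n", $ch) !== false) {
            // Whitespace ends the current unquoted word.
            $flush();
        } else {
            $buffer .= $ch;
        }
    }

    if ($quote !== null) {
        $terms[] = $quote . $buffer; // unterminated quote: keep what was typed
    } else {
        $flush();
    }

    return $terms;
}

print_r(filterSearchTerms('the quick -brown fox "the lazy -dog" of', ['the', 'of']));
```

Returning an array rather than a string leaves it to the caller to decide how to join the terms, for example with implode(' ', $terms).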

Re: Regex help for removing common words except those in quo

Posted: Tue Nov 12, 2013 5:22 am
by sumeshkp
Makes sense. Thanks