
Ignoring spaces and punctuation

Posted: Wed Feb 08, 2006 1:56 pm
by CobraCards
I'm looking for a way to have my regex account for variations in spacing/punctuation. For example, if I'm looking for "gradepoint", I would want "gradepoint", "grade point", "grade-point", "gr.ade-po!i?nt." etc. all to match.

(Please note that this is only an example -- I can't be this specific in the actual code, which scans a block of text for any of thousands of keywords.)

First question -- is there a simple way to just say "ignore punctuation", like you can add "i" at the end to have it ignore case? 8)

If not, then I think I'm looking at something like this: (in English, using the "gradepoint" example)

letter G, possibly a non-alphanumeric character, letter R, possibly a non-alphanumeric character, letter A.... and so on.

What would be the syntax for this?

Thanks!

Posted: Wed Feb 08, 2006 2:25 pm
by feyd
Sadly, there is no simple way like the i modifier.

Code:

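$needle = 'gradepoint';
$broken = array();
// zero or more non-alphanumeric characters allowed between each letter
$part = '[^a-z0-9]*';
for ($i = 0, $j = strlen($needle); $i < $j; $i++) {
  // escape each character in case a keyword contains regex metacharacters
  $broken[] = preg_quote($needle[$i], '/');
}
// yields g[^a-z0-9]*r[^a-z0-9]*a... and so on
$needle = implode($part, $broken);
// use as: preg_match('/' . $needle . '/i', $text)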
fun.
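
As a rough sketch of how the pattern built above could be applied, the snippet below wraps the same idea in a helper and runs it against the variations from the original question. The function name `loosePattern` is hypothetical, and the use of `str_split`, `preg_quote`, and the `/` delimiters is an assumption added for illustration, not something stated in the post.

```php
<?php
// Hypothetical helper: builds a case-insensitive regex that tolerates any run
// of non-alphanumeric characters between the letters of $keyword.
function loosePattern(string $keyword): string
{
    $gap = '[^a-z0-9]*';
    // Escape each character so keywords containing regex metacharacters stay literal.
    $chars = array_map(
        function ($c) { return preg_quote($c, '/'); },
        str_split($keyword)
    );
    return '/' . implode($gap, $chars) . '/i';
}

$pattern = loosePattern('gradepoint');
$samples = array('gradepoint', 'grade point', 'grade-point', 'gr.ade-po!i?nt.', 'grade');
foreach ($samples as $s) {
    echo $s, ' => ', preg_match($pattern, $s) ? 'match' : 'no match', "\n";
}
```

With thousands of keywords, it would probably be worth building each pattern once and reusing it, rather than rebuilding it for every block of text.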

Posted: Wed Feb 08, 2006 2:33 pm
by raghavan20
The easiest way, I think, is to first remove all punctuation from the text using [^a-z0-9]; then you can run a simple preg_match, or a simple strstr, to find out whether the string exists.
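
A minimal sketch of this strip-then-search idea, assuming the text and the keyword are both normalized the same way. The helper names `stripPunctuation` and `containsKeyword` are hypothetical, and `strpos` stands in for the `strstr` mentioned above, since only a yes/no answer is needed.

```php
<?php
// Hypothetical helper: drops everything except letters and digits, then lowercases.
function stripPunctuation(string $s): string
{
    return strtolower(preg_replace('/[^a-z0-9]/i', '', $s));
}

// Hypothetical helper: true if the normalized keyword occurs in the normalized text.
function containsKeyword(string $text, string $keyword): bool
{
    return strpos(stripPunctuation($text), stripPunctuation($keyword)) !== false;
}

var_dump(containsKeyword('My gr.ade-po!i?nt. average', 'gradepoint'));
var_dump(containsKeyword('No keywords here', 'gradepoint'));
```

One thing to keep in mind: because spaces are stripped as well, a keyword can match across word boundaries (for example, "upgrade pointless" would contain "gradepoint" after normalization). The text only has to be normalized once, no matter how many keywords are checked against it.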