Ignoring spaces and punctuation

Any questions involving matching text strings to patterns - the pattern is called a "regular expression."

CobraCards
Forum Newbie
Posts: 13
Joined: Fri Feb 03, 2006 1:40 pm

Post by CobraCards »

I'm looking for a way to have my regex account for variations in spacing/punctuation. For example, if I'm looking for "gradepoint", I would want "gradepoint", "grade point", "grade-point", "gr.ade-po!i?nt." etc. all to match.

(Please note that this is only an example -- I can't be this specific in the actual code, which scans a block of text for any of thousands of keywords.)

First question -- is there a simple way to just say "ignore punctuation", the way you can add "i" at the end to have it ignore case?

If not, then I think I'm looking at something like this: (in English, using the "gradepoint" example)

letter G, possibly a non-alphanumeric character, letter R, possibly a non-alphanumeric character, letter A.... and so on.

What would be the syntax for this?

Thanks!
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Sadly, there is no simple way to do this, nothing like the i modifier for case. You have to build the pattern yourself:

Code:

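$needle = 'gradepoint';
$broken = array();
// Zero or more non-alphanumeric characters, allowed between every letter.
$part = '[^a-z0-9]*';
// Split the keyword into single characters.
for ($i = 0, $j = strlen($needle); $i < $j; $i++) {
  $broken[] = $needle[$i];
}
// Glue the characters back together with the "optional junk" pattern.
$needle = implode($part, $broken);
// Use it with delimiters and the i modifier, e.g. '/' . $needle . '/i'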
fun.
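
For anyone who wants to try the idea above, here is a minimal sketch wrapped in a function and run against the examples from the first post. The function name loosePattern is made up for illustration, and the preg_quote call is an addition that assumes keywords might contain regex metacharacters.

```php
<?php
// Hypothetical helper (name is illustrative): build a pattern that allows
// any run of non-alphanumeric characters between the letters of $keyword.
function loosePattern(string $keyword): string
{
    $chars = str_split($keyword);
    // Assumption: escape each character in case a keyword contains
    // regex metacharacters such as "." or "+".
    $chars = array_map(function ($c) {
        return preg_quote($c, '/');
    }, $chars);
    // The i modifier makes both the letters and the [^a-z0-9] class
    // case-insensitive.
    return '/' . implode('[^a-z0-9]*', $chars) . '/i';
}

$pattern = loosePattern('gradepoint');
$samples = array('gradepoint', 'grade point', 'grade-point', 'gr.ade-po!i?nt.');
foreach ($samples as $text) {
    // preg_match returns 1 on a match and 0 otherwise.
    echo $text, ' => ', preg_match($pattern, $text), "\n";
}
```

Note that the pattern is not anchored to word boundaries, so it will also match inside longer runs of text such as "upgrade pointless". Whether that matters depends on the keyword list.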
raghavan20
DevNet Resident
Posts: 1451
Joined: Sat Jun 11, 2005 6:57 am
Location: London, UK

Post by raghavan20 »

The easiest way, I think, is to first remove all punctuation and spacing by stripping everything that matches [^a-z0-9]; after that you can run a simple preg_match, or even a plain strstr, to check for the existence of the string.
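
A rough sketch of that strip-first approach, in case it helps. The function name stripPunctuation and the sample sentence are invented for illustration. The text is lowercased first so the [^a-z0-9] class does not throw away capital letters, and strpos is used here in place of strstr since only a yes/no answer is needed.

```php
<?php
// Hypothetical helper (name is illustrative): lowercase the text and drop
// every character that is not a-z or 0-9, including spaces.
function stripPunctuation(string $text): string
{
    return preg_replace('/[^a-z0-9]/', '', strtolower($text));
}

// Made-up sample text containing the awkward variant from the first post.
$haystack = stripPunctuation('Her gr.ade-po!i?nt. average went up.');

// Plain substring search on the cleaned text; no regex needed per keyword.
$found = strpos($haystack, 'gradepoint') !== false;

echo $found ? "found\n" : "not found\n";
```

The cleaning only has to be done once per block of text, which may suit a list of thousands of keywords. The trade-off is that the cleaned text no longer has word boundaries or the original character positions, so a hit cannot be mapped straight back to the source text.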