Really annoyed with simple regex [SOLVED]

Posted: Mon May 18, 2009 10:19 am
by andylyon87
I need the most simple regex.

I don't know why this is taking me such a long time to get!!!

All I need is a regex for a string. The string is a blog entry so needs to allow the characters . , _ - + = ! " : ( ) ' and that is it!

I have done this so far but this just throws up an error with anything!

Code:

 
eregi ("^[A-Za-z0-9\.,_\-\+\=\!\":()\']+{2,200}$", $string)
 
I think I'm getting messed up somewhere!

All this regex is meant to do is stop anybody adding code through the box, and I thought it would be simpler than a str_replace.

Thanks in advance for any help.

I have tried reading everywhere on the net, but all the examples show is finding parts of a string or verifying an email or a URL.

Re: Really annoyed with simple regex, seems to be a common theme

Posted: Mon May 18, 2009 12:23 pm
by John Cartwright
ereg is deprecated; you should be using preg_* instead. Add delimiters, fix the "nothing to repeat" error (caused by the extra + after the [] class, since you also specified a {2,200} length quantifier), escape the parentheses within the class, and you should be set.

For example (untested):

Code:

if (preg_match ("#^[A-Za-z0-9\.,_\-+\=\!\":\(\)']{2,200}$#", $string)) {
    echo 'ok';
}
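
For anyone adapting this, here is a minimal sketch of the same check wrapped in a function. The name isAllowedEntry is hypothetical (it is not from this thread), and the pattern is written as a single-quoted PHP string so that only the single quote needs a PHP-level escape. One thing to note: the class as written contains no whitespace, so an entry with a space in it is rejected.

```php
<?php
// Hypothetical helper; the name is illustrative, not from the thread.
function isAllowedEntry(string $string): bool
{
    // Same character class as the pattern above. In a single-quoted PHP
    // string the backslashes reach PCRE unchanged, except \' which
    // becomes a plain single quote.
    $pattern = '#^[A-Za-z0-9\.,_\-+\=\!":\(\)\']{2,200}$#';

    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $string) === 1;
}

var_dump(isAllowedEntry('Hello,world!')); // only allowed characters
var_dump(isAllowedEntry('<script>'));     // < and > are not in the class
var_dump(isAllowedEntry('a'));            // shorter than 2 characters
var_dump(isAllowedEntry('Hello world'));  // the space is not in the class
```

If spaces and line breaks should be accepted in a blog entry, they would have to be added to the class explicitly.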
 

Re: Really annoyed with simple regex, seems to be a common theme

Posted: Thu May 21, 2009 2:34 am
by prometheuzz
John Cartwright wrote:...

Code:

if (preg_match ("#^[A-Za-z0-9\.,_\-+\=\!\":\(\)']{2,200}$#", $string)) {
    echo 'ok';
}
 
@OP, some people like to escape regex meta characters inside character classes for clarity, although this is not really needed. Character classes can be seen as a little language inside regex with its own meta characters, which are:

^ (negation, can only be placed at the start of a character class, otherwise it just matches a '^')
- (range operator only when placed NOT at the start or end of a character class, otherwise it matches just a '-')
] (closing character)
\ (escape character)

All other characters don't need escaping. But, like I said, some like to escape them for clarity. Note that '=' and '!' are never meta characters, so escaping them is never needed. So the suggested regex can be written as this:

Code:

"#^[A-Za-z0-9\.,_\-+=!\":\(\)']{2,200}$#"
But you can even do it like this:

Code:

"#^[A-Za-z0-9.,_+=!\":()'-]{2,200}$#"
because '.', '(' and ')' are not meta characters inside a character class, and the '-' has no special meaning at the end of the character class.
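
A quick way to convince yourself that the three spellings behave the same is to run them side by side. This is only a sketch: the helper name matchResults and the sample strings are my own, not from the thread, and the patterns are written as single-quoted PHP strings.

```php
<?php
// The three spellings of the pattern discussed in this thread.
$patterns = [
    '#^[A-Za-z0-9\.,_\-+\=\!":\(\)\']{2,200}$#', // everything escaped
    '#^[A-Za-z0-9\.,_\-+=!":\(\)\']{2,200}$#',   // '=' and '!' unescaped
    '#^[A-Za-z0-9.,_+=!":()\'-]{2,200}$#',       // nothing escaped, '-' last
];

// Illustrative inputs (assumptions, chosen to hit the special characters).
$samples = ['a-b', 'x=(1+2)!', 'it\'s"ok".', 'a^b', 'a]b', 'a b', 'a'];

// Hypothetical helper: runs every pattern against one sample and
// returns the list of preg_match results (1 = match, 0 = no match).
function matchResults(array $patterns, string $sample): array
{
    $results = [];
    foreach ($patterns as $pattern) {
        $results[] = preg_match($pattern, $sample);
    }
    return $results;
}

foreach ($samples as $sample) {
    $results = matchResults($patterns, $sample);
    // If all three patterns agree, array_unique leaves a single value.
    $verdict = count(array_unique($results)) === 1 ? 'agree' : 'DIFFER';
    printf("%-12s %s  %s\n", $sample, implode(',', $results), $verdict);
}
```

Every sample should come out as "agree", since the escaping only changes how the class is written, not what it matches.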

Re: Really annoyed with simple regex, seems to be a common theme

Posted: Thu May 21, 2009 5:22 am
by andylyon87
Cheers guys, since I have had this basic string I have been able to adapt it and am kinda gettin the whole thing now.

Think I just needed a basic code frag to get going

Re: Really annoyed with simple regex, seems to be a common theme

Posted: Fri May 22, 2009 7:49 am
by prometheuzz
andylyon87 wrote:Cheers guys, since I have had this basic string I have been able to adapt it and am kinda gettin the whole thing now.

Think I just needed a basic code frag to get going
You're welcome.