filter_var and regex (pass Unicode characters only)

Posted: Wed Jun 10, 2009 6:12 am
by spamyboy
I am trying to create input validation that accepts only letters and whitespace.
This is what I have so far. I can't use [a-zA-Z] because the input might also be Cyrillic, from a Baltic alphabet (e.g. Š, Ž), Asian, etc.
Any ideas?

Code:

var_dump(filter_var('ABC???', FILTER_VALIDATE_REGEXP, array("options"=>array("regexp"=>"/P{L}/u"))));

Re: filter_var and regex (pass Unicode characters only)

Posted: Wed Jun 10, 2009 7:43 am
by prometheuzz
Note that a Unicode property class like this needs a leading backslash:

Code:

\P{L}
But a capital P matches any character that is not a letter. You probably want:

Code:

\p{L}
which matches a single letter in pretty much any language, AFAIK.
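
To make the difference between the two forms concrete, here is a minimal sketch using preg_match. The helper names has_letter and has_non_letter are made up for illustration, and the example assumes PHP 7 or later and a source file saved as UTF-8.

```php
<?php
// Hypothetical helpers for illustration; they are not part of PHP itself.

// True if $s contains at least one Unicode letter (\p{L}).
function has_letter(string $s): bool {
    return preg_match('/\p{L}/u', $s) === 1;
}

// True if $s contains at least one character that is NOT a letter (\P{L}).
function has_non_letter(string $s): bool {
    return preg_match('/\P{L}/u', $s) === 1;
}

var_dump(has_letter('Ž'));       // bool(true)
var_dump(has_non_letter('Ž'));   // bool(false)
var_dump(has_non_letter('Ž&'));  // bool(true)
```

Neither pattern is anchored, so each one only reports whether such a character occurs somewhere in the string.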

Re: filter_var and regex (pass Unicode characters only)

Posted: Wed Jun 10, 2009 7:54 am
by spamyboy
This still lets 'ABC??&*' through as a valid string, even though it contains '&*'.

Code:

filter_var('ABC??&*', FILTER_VALIDATE_REGEXP, array("options"=>array("regexp"=>"/\p{L}/u")));

Re: filter_var and regex (pass Unicode characters only)

Posted: Wed Jun 10, 2009 8:00 am
by prometheuzz
spamyboy wrote: This still lets 'ABC??&*' through as a valid string, even though it contains '&*'.

Code:

filter_var('ABC??&*', FILTER_VALIDATE_REGEXP, array("options"=>array("regexp"=>"/\p{L}/u")));
I've never used filter_var(...), but with the normal preg functions, '/\p{L}/' matches any string that contains at least one letter. To match a string only when it consists solely of letters, anchor the pattern: '/^\p{L}+$/u' (keep the u modifier so the input is treated as UTF-8).
^ denotes the start of the string, $ the end of the string, and + means "one or more".
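
Putting the anchored pattern into filter_var might look like the sketch below. It also allows whitespace, since the original question asked for that. The function name is_letters_and_spaces is hypothetical, and the example assumes PHP 7 or later and a UTF-8 source file.

```php
<?php
// Hypothetical helper: true when $input consists only of Unicode
// letters and whitespace, with at least one character in total.
function is_letters_and_spaces(string $input): bool {
    $result = filter_var(
        $input,
        FILTER_VALIDATE_REGEXP,
        // ^ and $ anchor the match to the whole string;
        // [\p{L}\s]+ means one or more letters or whitespace characters.
        array('options' => array('regexp' => '/^[\p{L}\s]+$/u'))
    );
    // FILTER_VALIDATE_REGEXP returns the input on a match, false otherwise.
    return $result !== false;
}

var_dump(is_letters_and_spaces('ABC Žš'));  // bool(true)
var_dump(is_letters_and_spaces('ABC&*'));   // bool(false)
```

Because whitespace is in the character class, a string made up of spaces alone would also pass. Trim the input first if that is not wanted.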

HTH

Re: filter_var and regex (pass Unicode characters only)

Posted: Wed Jun 10, 2009 8:21 am
by spamyboy
Thank you, that helped me a lot.

Re: filter_var and regex (pass Unicode characters only)

Posted: Wed Jun 10, 2009 8:34 am
by prometheuzz
spamyboy wrote: Thank you, that helped me a lot.
You're welcome.