filter_var and regex (pass Unicode characters only)
Posted: Wed Jun 10, 2009 6:12 am
by spamyboy
I am trying to create input validation that accepts only letters and whitespace.
This is where I am so far. I can't use [a-zA-Z] because the input might also be Cyrillic, a Baltic alphabet (e.g. ?????Š??Ž), Asian scripts, etc.
Any ideas?
Code:
var_dump(filter_var('ABC???', FILTER_VALIDATE_REGEXP, array("options"=>array("regexp"=>"/P{L}/u"))));
Re: filter_var and regex (pass Unicode characters only)
Posted: Wed Jun 10, 2009 7:43 am
by prometheuzz
Note that there should be a backslash before such character sets: \P{L}
But a capital P denotes any character other than a letter. You probably want: \p{L}
which matches a single letter in pretty much any language AFAIK.
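To make the difference between the two escapes concrete, here is a minimal sketch using preg_match. The helper names is_letter and is_non_letter and the sample characters are made up for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helpers for this sketch; they are not part of PHP or of the thread.

// True when the string contains at least one Unicode letter.
function is_letter($char)
{
    // \p{L} matches any Unicode letter; the u modifier enables UTF-8 mode.
    return preg_match('/\p{L}/u', $char) === 1;
}

// True when the string contains at least one character that is NOT a letter.
function is_non_letter($char)
{
    // \P{L} (capital P) is the negation of \p{L}.
    return preg_match('/\P{L}/u', $char) === 1;
}

// Single-character samples, so "contains" and "is" coincide here.
foreach (array('A', 'Š', '5', '*') as $char) {
    printf("%s letter=%s non-letter=%s\n",
        $char,
        var_export(is_letter($char), true),
        var_export(is_non_letter($char), true));
}
```

For the letters 'A' and 'Š' only the \p{L} check succeeds; for '5' and '*' only the \P{L} check does.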
Re: filter_var and regex (pass Unicode characters only)
Posted: Wed Jun 10, 2009 7:54 am
by spamyboy
This still passes through 'ABC??&*' as a valid string, though it contains '&*'.
Code:
filter_var('ABC??&*', FILTER_VALIDATE_REGEXP, array("options"=>array("regexp"=>"/\p{L}/u")));
Re: filter_var and regex (pass Unicode characters only)
Posted: Wed Jun 10, 2009 8:00 am
by prometheuzz
spamyboy wrote: This still passes through 'ABC??&*' as a valid string, though it contains '&*'.
Code:
filter_var('ABC??&*', FILTER_VALIDATE_REGEXP, array("options"=>array("regexp"=>"/\p{L}/u")));
I've never used filter_var(...), but with the normal preg functions, '/\p{L}/u' will match any string that contains at least one letter. Matching a string only when it consists solely of letters would be done like this: '/^\p{L}+$/u'
^ denotes the start of the string, $ is the end of the string and + means "one or more".
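As a rough sketch of how the anchored pattern could be plugged back into filter_var: the helper name is_letters_and_spaces is hypothetical, and the \s inside the character class is an assumption added here to cover the "letters and whitespace" requirement from the first post. The file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper for this sketch; it is not part of PHP or of the thread.
function is_letters_and_spaces($input)
{
    // ^ and $ anchor the match to the whole string.
    // [\p{L}\s]+ means one or more characters that are letters or whitespace.
    // The u modifier makes PCRE treat pattern and subject as UTF-8.
    $options = array('options' => array('regexp' => '/^[\p{L}\s]+$/u'));

    // FILTER_VALIDATE_REGEXP returns the input on a match and false otherwise.
    return filter_var($input, FILTER_VALIDATE_REGEXP, $options) !== false;
}

var_dump(is_letters_and_spaces('ABC Šž'));   // bool(true)
var_dump(is_letters_and_spaces('ABC??&*'));  // bool(false)
```

Because of the +, an empty string is rejected as well; swapping it for * would accept empty input if that is wanted.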
HTH
Re: filter_var and regex (pass Unicode characters only)
Posted: Wed Jun 10, 2009 8:21 am
by spamyboy
Thank you, that helped me a lot.
Re: filter_var and regex (pass Unicode characters only)
Posted: Wed Jun 10, 2009 8:34 am
by prometheuzz
spamyboy wrote: Thank you, that helped me a lot.
You're welcome.