Find occurrences of Unicode characters in string

Posted: Thu Oct 29, 2009 6:57 am
by Sindarin
I need to reject filenames that contain anything other than English letters and digits, but neither the regexp nor the string functions seem to work: they appear to treat Greek letters as part of the A-Z a-z range. Here's what I've tried:
    if (strspn($str, "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789") != strlen($str))
    {
        echo 'invalid filename';
    }
    if (!preg_match("/^([-a-z0-9])+$/i", $str))
    {
        echo 'invalid filename';
    }
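
For comparison, a stricter form of the regexp check above could be sketched as below. This is only an illustration under a few assumptions: PHP 7 or later (for the type declarations), a filename rule of ASCII letters and digits only (no hyphen), and a helper name, is_ascii_alnum, that is made up for this example. It drops the /i modifier in favour of an explicit A-Z range and anchors with \A and \z, because a plain $ also matches just before a trailing newline.

```php
<?php
// Hypothetical helper (not from the thread): true only if $str is non-empty
// and consists solely of the ASCII bytes A-Z, a-z and 0-9.
function is_ascii_alnum(string $str): bool
{
    // \A and \z anchor to the absolute start and end of the subject,
    // so a trailing "\n" is rejected (a plain $ would let it through).
    // There is no /i and no /u modifier: matching is done byte by byte.
    return preg_match('/\A[A-Za-z0-9]+\z/', $str) === 1;
}

var_dump(is_ascii_alnum('report2009'));        // bool(true)
var_dump(is_ascii_alnum("\xCE\xB1\xCE\xB2"));  // UTF-8 bytes of two Greek letters: bool(false)
var_dump(is_ascii_alnum("report\n"));          // bool(false)
```

Every byte of a multi-byte UTF-8 character is 0x80 or above, so none of them can fall inside the three ASCII ranges in the character class.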

Re: Find occurrences of Unicode characters in string

Posted: Thu Oct 29, 2009 1:20 pm
by requinix
Are you absolutely sure that $str has Greek letters? I'm not talking about what you typed into an HTML form - I mean, have you looked closely at $str and verified it had Greek letters in it?

Re: Find occurrences of Unicode characters in string

Posted: Fri Oct 30, 2009 6:30 am
by Sindarin
Yes, it did have Greek characters in it.

By the way, I solved this by detecting the encoding of the string with mb_detect_encoding:

    $encoding = mb_detect_encoding($str, 'auto');
    if ($encoding == 'ASCII')
    {
        return 1;
    }
    else
    {
        return 0;
    }
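
Note that an ASCII result from the check above only says the string has no bytes above 0x7F; spaces, punctuation and path separators are ASCII too and would still pass. To actually find where the non-ASCII bytes occur, something like the following sketch might help. It assumes PHP 7 or later and that $str holds raw bytes (for example UTF-8); the helper name non_ascii_offsets is made up for this example and does not need the mbstring extension.

```php
<?php
// Hypothetical helper (not from the thread): returns the byte offsets of
// every non-ASCII byte (0x80-0xFF) found in $str.
function non_ascii_offsets(string $str): array
{
    // Without the /u modifier PCRE works on single bytes, so each byte of a
    // multi-byte UTF-8 sequence is reported as a separate match.
    preg_match_all('/[\x80-\xFF]/', $str, $matches, PREG_OFFSET_CAPTURE);

    // Each entry of $matches[0] is [matched byte, offset]; keep the offsets.
    return array_column($matches[0], 1);
}

var_dump(non_ascii_offsets('file.txt') === []);   // bool(true)
print_r(non_ascii_offsets("a\xCE\xB2c"));         // "a", Greek beta (2 bytes), "c": offsets 1 and 2
```

An empty result means the string is pure ASCII, which is the same condition the mb_detect_encoding check is testing for.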