Find occurrences of Unicode characters in string

Sindarin
Forum Regular
Posts: 521
Joined: Tue Sep 25, 2007 8:36 am
Location: Greece

Find occurrences of Unicode characters in string

Post by Sindarin »

I need to reject filenames that contain anything other than English letters and digits, but neither the regexp nor the string functions seem to work: they appear to treat Greek letters as part of the A-Z a-z range. Here's what I've tried:
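    // Attempt 1: every byte must come from the allowed set
    if (strspn($str, "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789") != strlen($str))
    {
        echo 'invalid filename';
    }

    // Attempt 2: letters, digits and hyphens only (case-insensitive)
    if (!preg_match("/^([-a-z0-9])+$/i", $str))
    {
        echo 'invalid filename';
    }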
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: Find occurrences of Unicode characters in string

Post by requinix »

Are you absolutely sure that $str has Greek letters? I'm not talking about what you typed into an HTML form - I mean, have you looked closely at $str and verified it had Greek letters in it?
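
One way to check this, as a rough sketch: dump the raw bytes of the string right before the validation runs. The helper name describe_bytes() and the sample values are made up for illustration; only strlen(), bin2hex() and sprintf() are real PHP functions here. The sample assumes the Greek text arrives as UTF-8.

```php
<?php
// Hypothetical helper (not from the thread): show the length in bytes
// and the raw bytes of a string as hex, so nothing is hidden by the
// browser or the terminal encoding.
function describe_bytes(string $str): string
{
    return sprintf('%d bytes: %s', strlen($str), bin2hex($str));
}

// Sample values, assumed for illustration only.
$ascii = 'report2024';
$greek = "\u{03B1}\u{03B2}\u{03B3}.txt"; // three Greek letters plus ".txt", UTF-8 encoded

echo describe_bytes($ascii), "\n"; // 10 bytes: 7265706f727432303234
echo describe_bytes($greek), "\n"; // 10 bytes: ceb1ceb2ceb32e747874
```

If the Greek letters really are in $str as UTF-8, the dump contains bytes of 0x80 and above (the ce b1, ce b2, ce b3 pairs here). If it shows only plain ASCII bytes, the name was changed somewhere before it reached the check.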
Sindarin
Forum Regular
Posts: 521
Joined: Tue Sep 25, 2007 8:36 am
Location: Greece

Re: Find occurrences of Unicode characters in string

Post by Sindarin »

Yes, it did have Greek characters in it.

btw, I solved this by detecting the encoding of the string using mb_detect_encoding:

    // Return 1 when the string is detected as plain ASCII, 0 otherwise
    $encoding = mb_detect_encoding($str, 'auto');
    if ($encoding == 'ASCII')
    {
        return 1;
    }
    else
    {
        return 0;
    }
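
A stricter variant, as a minimal sketch rather than a definitive fix: a result of 'ASCII' from mb_detect_encoding() only says that every byte is in the 7-bit range, so names with spaces, slashes or other ASCII punctuation would still pass. The helper name is_plain_ascii_name() is hypothetical and not from this thread, and it assumes the rule is exactly "English letters and digits, nothing else".

```php
<?php
// Hypothetical helper (not from the thread): true only when $name is a
// non-empty run of A-Z, a-z and 0-9.
function is_plain_ascii_name(string $name): bool
{
    // No /u modifier, so the pattern is matched byte by byte and any
    // byte of a multibyte UTF-8 character falls outside the class.
    // \A and \z anchor to the absolute start and end of the string;
    // a plain $ would also accept a trailing newline.
    return preg_match('/\A[A-Za-z0-9]+\z/', $name) === 1;
}

var_dump(is_plain_ascii_name('report2024'));               // bool(true)
var_dump(is_plain_ascii_name("\u{03B1}\u{03B2}\u{03B3}")); // bool(false), Greek letters
var_dump(is_plain_ascii_name('my file'));                  // bool(false), space
var_dump(is_plain_ascii_name("abc\n"));                    // bool(false), trailing newline
```

If dots, hyphens or underscores should be allowed in filenames, they would have to be added to the character class explicitly.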