Converting Diacritics (accents) to English Letters [SOLVED]

ApexDevil
Forum Newbie
Posts: 3
Joined: Fri Jan 08, 2010 12:25 pm

Converting Diacritics (accents) to English Letters [SOLVED]

Post by ApexDevil »

Hello,

I'm working on a site that contains names with a lot of international characters (such as Světlé Výčepní). While somebody from the Czech Republic could type this in just fine, the rest of the world may have a little difficulty! Somebody (let's say from America) searching for this item will type in svetle vycepni. Of course, this will not return any results.

So what I'm looking to do is have a search table that contains both the standard Světlé Výčepní and Svetle Vycepni. Unfortunately, I can't find any good scripts to convert the special characters into their standard-looking text counterparts. Can anybody here point me in the right direction as to how I could do this? I know there are a few scripts floating around that do this very thing, but hours of searching have brought me nothing...

Cheers!

Edit: I just wanted to update this thread to say that I'm not using any Asian, Islamic, etc. letters. Everything is pretty much limited to letters that are readable in English if you remove the accents (sorry, I don't know the proper name for these letters).

Scripts such as:

Code:

$title = "Trípode G5";
$search = explode(",","ç,æ,œ,á,é,í,ó,ú,à,è,ì,ò,ù,ä,ë,ï,ö,ü,ÿ,â,ê,î,ô,û,å,e,i,ø,u");
$replace = explode(",","c,ae,oe,a,e,i,o,u,a,e,i,o,u,a,e,i,o,u,y,a,e,i,o,u,a,e,i,o,u");
$urlTitle = str_replace($search, $replace, $title);
work well for a small subset of the data, but they do not cover all of the unusual characters I run into (Světlé Výčepní is a perfect example).
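
A minimal sketch of the search-table idea described above, assuming a small hand-written character map: each name is stored next to a lowercased, accent-free key, and the query is folded the same way before lookup. The names `FOLD_MAP`, `foldForSearch()` and `buildSearchIndex()` are made up for illustration, and the map covers only a handful of characters.

```php
<?php
// Hypothetical, deliberately incomplete map of accented letters to ASCII.
const FOLD_MAP = [
    'ě' => 'e', 'é' => 'e', 'č' => 'c', 'ý' => 'y', 'í' => 'i',
    'Ě' => 'E', 'É' => 'E', 'Č' => 'C', 'Ý' => 'Y', 'Í' => 'I',
];

// Fold a UTF-8 string into a lowercase ASCII search key.
function foldForSearch(string $text): string
{
    return strtolower(strtr($text, FOLD_MAP));
}

// Build an in-memory index of search key => original name.
function buildSearchIndex(array $names): array
{
    $index = [];
    foreach ($names as $name) {
        $index[foldForSearch($name)] = $name;
    }
    return $index;
}

$index = buildSearchIndex(['Světlé Výčepní', 'Trípode G5']);
$query = foldForSearch('svetle vycepni');
echo $index[$query] ?? 'no match', "\n"; // prints the original accented name
```

In a real database the folded key would presumably live in its own column next to the original name, so the search can match on the key and display the original.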

Thanks!
Last edited by ApexDevil on Fri Jan 08, 2010 1:46 pm, edited 2 times in total.
Charles256
DevNet Resident
Posts: 1375
Joined: Fri Sep 16, 2005 9:06 pm

Re: Converting Special Characters for Search (UTF8)

Post by Charles256 »

Maybe http://us.php.net/manual/en/function.co ... string.php will help. Check out some of the usage examples. You could also build an array of special characters and another array of their English equivalents, then do replacements on the strings. Just a thought. Please let us know what you come up with.

Edit:
I like the post below me better. :)
Last edited by Charles256 on Fri Jan 08, 2010 1:20 pm, edited 1 time in total.
tr0gd0rr
Forum Contributor
Posts: 305
Joined: Thu May 11, 2006 8:58 pm
Location: Utah, USA

Re: Converting Diacritics (accents) to English Letters (UTF8)

Post by tr0gd0rr »

Using strtr() (http://us2.php.net/strtr) seems like a good option. There are some good examples there; the "normaliza" function looks close to what you need.

You can also look at a full Unicode Charmap to get a more complete character list. Here is one I made: http://kendsnyder.com/sandbox/charmap/
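
As a rough sketch of the strtr() approach, assuming a hand-maintained map (the `asciiFold()` name and the map below are illustrative and far from a full character list): the array form of strtr() does not rescan text it has already replaced, and it allows multi-letter replacements such as æ to ae.

```php
<?php
// Illustrative helper, not a library function.
function asciiFold(string $text): string
{
    // Hypothetical partial map; extend it from a Unicode character chart as needed.
    static $map = [
        'á' => 'a', 'č' => 'c', 'ď' => 'd', 'é' => 'e', 'ě' => 'e',
        'í' => 'i', 'ň' => 'n', 'ó' => 'o', 'ř' => 'r', 'š' => 's',
        'ť' => 't', 'ú' => 'u', 'ů' => 'u', 'ý' => 'y', 'ž' => 'z',
        'Č' => 'C', 'Ř' => 'R', 'Š' => 'S', 'Ý' => 'Y', 'Ž' => 'Z',
        'æ' => 'ae', 'œ' => 'oe', 'ß' => 'ss', 'ø' => 'o',
    ];
    // Characters without a map entry pass through unchanged.
    return strtr($text, $map);
}

echo asciiFold('Světlé Výčepní'), "\n"; // Svetle Vycepni
echo asciiFold('Trípode G5'), "\n";     // Tripode G5
```

The obvious cost is that every character has to be listed by hand, so anything missing from the map stays accented.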
ApexDevil
Forum Newbie
Posts: 3
Joined: Fri Jan 08, 2010 12:25 pm

Re: Converting Special Characters for Search (UTF8)

Post by ApexDevil »

Charles256 wrote: Maybe http://us.php.net/manual/en/function.co ... string.php will help. Check out some of the usage examples. You could also build an array of special characters and another array of their English equivalents, then do replacements on the strings. Just a thought. Please let us know what you come up with.
Hey Charles, thanks for the reply. Unfortunately, the cyr_string function would only work on a small part of the data (actually, I'm not sure I even have any Russian characters!). Another problem I've run into with these built-in functions is that they generally convert one country's accents to another's; they don't support the large majority of characters that are used.

The reason I don't want to build my own map is that there are literally around 2000 characters that could technically be converted to English, though I'm looking at a subset of around 200.
ApexDevil
Forum Newbie
Posts: 3
Joined: Fri Jan 08, 2010 12:25 pm

Re: Converting Diacritics (accents) to English Letters [SOLVED]

Post by ApexDevil »

Thanks for the replies, everyone. They led me to this gem (with a few modifications of my own):

Code:

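function clearUTF($s)
{
    $r = '';
    // Transliterate to ASCII; unmappable characters typically come back as '?'.
    $s1 = iconv('UTF-8', 'ASCII//TRANSLIT', $s);
    for ($i = 0; $i < strlen($s1); $i++) {
        $ch1 = $s1[$i];
        // Original character at the same position (assumes a one-to-one mapping).
        $ch2 = mb_substr($s, $i, 1, 'UTF-8');

        // Keep the original character wherever transliteration gave '?'.
        $r .= $ch1 == '?' ? $ch2 : $ch1;
    }
    return $r;
}

// Quote-like characters to strip from the transliterated text: ` ' "
$search = explode(",", "`,'" . ',"');

$before = "Světlé Výčepní";
$after = clearUTF($before);

if ($before != $after) {
    $after = str_replace($search, '', $after);
}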
Credits to mirek at burkon dot org from php.net

Thanks again everyone!
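
One caveat with the function above: it pairs byte $i of the iconv() output with character $i of the input, which may not hold when a character transliterates to several ASCII characters (æ, for instance, might become ae). A possible variant, sketched here as an assumption rather than a tested replacement, transliterates one character at a time. The name `clearUTFPerChar()` is made up for illustration, and what `//TRANSLIT` produces depends on the platform's iconv implementation and locale, so no expected output is shown.

```php
<?php
// Hypothetical per-character variant of clearUTF(); expects UTF-8 input.
function clearUTFPerChar(string $s): string
{
    // Split into whole UTF-8 characters; preg_split() returns false on invalid UTF-8.
    $chars = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return $s;
    }

    $r = '';
    foreach ($chars as $ch) {
        if (strlen($ch) === 1) {
            // Single-byte characters are already ASCII.
            $r .= $ch;
            continue;
        }
        // Errors are suppressed because some iconv builds reject //TRANSLIT.
        $t = @iconv('UTF-8', 'ASCII//TRANSLIT', $ch);
        if ($t === false || $t === '' || strpos($t, '?') !== false) {
            // No usable transliteration: keep the original character.
            $r .= $ch;
            continue;
        }
        // Drop quote marks that the transliteration may have added.
        $stripped = str_replace(['`', "'", '"'], '', $t);
        $r .= $stripped !== '' ? $stripped : $t;
    }
    return $r;
}

echo clearUTFPerChar('Světlé Výčepní'), "\n";
```

Because quote marks are stripped only from transliterated characters, apostrophes that were already in the input survive, unlike with the whole-string str_replace() above.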
tr0gd0rr
Forum Contributor
Posts: 305
Joined: Thu May 11, 2006 8:58 pm
Location: Utah, USA

Re: Converting Diacritics (accents) to English Letters [SOLVED]

Post by tr0gd0rr »

That is a sweet solution! Thanks for sharing.