utf8_strtolower() Feedback

Post by lkjkorn19 »

Hi,

I have written a kind of utf8_strtolower function. It replaces any numeric HTML entity (of the form &#228;, for example) with its lowercase equivalent if one exists (e.g. &#203; (Ë) becomes &#235; (ë)).

I have mapped all the characters manually and ended up with about 650 array elements. Following my example with ë, I assign the array elements as follows:

Code: Select all

$utf8_strtox[203] = 235;
where the entity numbered 203 is replaced with the entity numbered 235.

The function works, and I can guarantee that all non-ASCII characters will arrive in the form of numeric HTML entities.
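A minimal sketch of the lookup described above, for readers who want to see the idea without opening the full code. This is not the poster's implementation: the helper name entity_strtolower_sketch() and the three-entry map are hypothetical, and the sketch assumes every non-ASCII character is already a decimal entity of the form &#NNN;.

```php
<?php
// Hypothetical map: uppercase code point => lowercase code point.
// Only three entries are shown; the post describes about 650.
$entity_lower_map = array(
    202  => 234,   // Ê => ê
    203  => 235,   // Ë => ë
    1064 => 1096,  // Ш => ш
);

// Hypothetical helper, named so it does not clash with utf8_strtolower().
function entity_strtolower_sketch($string, array $map)
{
    // Find every decimal entity and swap its number if the map knows it.
    return preg_replace_callback(
        '/&#(\d+);/',
        function ($m) use ($map) {
            $code = (int) $m[1];
            // Entities without a mapping are returned unchanged.
            return isset($map[$code]) ? '&#' . $map[$code] . ';' : $m[0];
        },
        $string
    );
}

echo entity_strtolower_sketch('&#202; &#1064; &#228;', $entity_lower_map);
// prints &#234; &#1096; &#228;
```

Each entity is visited once and looked up by array key, so the cost grows with the number of entities in the string rather than with the size of the map. Plain ASCII letters are left alone in this sketch.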

Here's the full code: http://pastebin.com/fbd61bb7

Example of use:

Code: Select all

echo utf8_strtolower('&#202; &#1064;');
// will echo &#234; &#1096; (displayed as ê ш)
I ran some tests: creating the array (with utf8_strtox_init()) takes about 0.002023 seconds (average of 50 runs).

Additionally, I created a random string containing 250 HTML entities of uppercase characters, and PHP took an average of 0.002347 seconds (over 50 runs) to replace them with lowercase entities. So, in theory, building the array and then replacing a 250-entity string takes roughly 0.0044 seconds in total (0.002023 + 0.002347).
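For anyone who wants to reproduce this kind of measurement, here is a rough sketch of averaging a timing over 50 runs with microtime(true). The helper average_seconds() and the stand-in workload are hypothetical; the poster's actual test code is not shown in the thread, and the numbers will differ from machine to machine.

```php
<?php
// Hypothetical helper: average wall-clock seconds of $fn over $runs calls.
function average_seconds(callable $fn, $runs = 50)
{
    $total = 0.0;
    for ($i = 0; $i < $runs; $i++) {
        $start = microtime(true);          // float seconds since the epoch
        $fn();
        $total += microtime(true) - $start;
    }
    return $total / $runs;
}

// Stand-in workload (an assumption): 250 copies of one uppercase entity.
$input = str_repeat('&#203;', 250);

$avg = average_seconds(function () use ($input) {
    str_replace('&#203;', '&#235;', $input);
});

printf("average over 50 runs: %.6f seconds\n", $avg);
```

Timings this small are noisy, so comparing two variants with the same harness is more informative than any single absolute figure.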

I'm not very knowledgeable about PHP efficiency or what eats up memory, and I intend to implement these functions in a popular CMS script. Can someone tell me if there is anything I should be aware of, anything I should change, etc.?

Thank you. :)

Re: utf8_strtolower() Feedback

Post by pickle »

I take it you've done this because strtolower() isn't UTF-8 compatible?

If you have to manually create the matchup (ie: Ë -> ë), I'd do it this way:

Code: Select all

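$string = "This is the character: &#203;";
$updated_string = utf8_strtolower($string);

function utf8_strtolower($string)
{
  // Parallel arrays: each uppercase entity and, at the same index, its lowercase entity.
  // 'etc' stands in for the rest of the list.
  $search = array('&#203;', 'etc');
  $replace = array('&#235;', 'etc');

  return str_replace($search, $replace, $string);
}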
Unless of course I'm missing something.
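The two approaches can be combined: the search and replace lists do not have to be typed out by hand if a numeric map like the one in the first post already exists. A rough sketch, assuming the entities are written as &#NNN;. The function name is hypothetical and the three-entry $utf8_strtox below is only a stand-in for the full array.

```php
<?php
// Stand-in for the poster's map (uppercase number => lowercase number).
$utf8_strtox = array(202 => 234, 203 => 235, 1064 => 1096);

// Hypothetical helper: derive the str_replace() lists from the numeric map.
function entity_strtolower_str_replace($string, array $map)
{
    $search  = array();
    $replace = array();
    foreach ($map as $upper => $lower) {
        $search[]  = '&#' . $upper . ';';
        $replace[] = '&#' . $lower . ';';
    }
    // str_replace() applies the pairs one after another, in order.
    return str_replace($search, $replace, $string);
}

echo entity_strtolower_str_replace('This is the character: &#203;', $utf8_strtox);
// prints This is the character: &#235;
```

Because str_replace() works through the pairs sequentially, a lowercase number that also appears as a key in the map would be replaced a second time. That should not happen with a pure upper-to-lower map, but it is worth checking before relying on this.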