utf8_strtolower() Feedback
Posted: Sat Sep 05, 2009 8:19 am
Hi,
I have written a sort of utf8_strtolower function. Basically, it replaces any numeric HTML entity (of the form &#228;, for example) with its lowercase equivalent if one exists (e.g. &#203; (Ë) becomes &#235; (ë)).
I mapped all the characters manually and ended up with about 650 array elements. Following my example with ë, I assign the array elements as follows:
Code:
$utf8_strtox[203] = 235;
where the entity with the number 203 is replaced with 235.
The function works and I can really guarantee that all of those non-ASCII characters will be in the form of a numeric HTML entity.
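To illustrate the idea without opening the pastebin, here is a minimal sketch. It is not the pastebin code: the name sketch_entity_strtolower, the three-entry table, and the use of preg_replace_callback to find decimal entities are all assumptions made for illustration.

```php
<?php
// Hypothetical lookup table: uppercase code point => lowercase code point.
// Only three pairs are shown; the real table has about 650 entries.
$sketch_map = array(
    202  => 234,   // Ê => ê
    203  => 235,   // Ë => ë
    1064 => 1096,  // Ш => ш
);

// Hypothetical helper (assumed name): lowercases decimal numeric entities only.
function sketch_entity_strtolower($text, array $map)
{
    return preg_replace_callback(
        '/&#(\d+);/',
        function ($m) use ($map) {
            $code = (int) $m[1];
            // Leave the entity untouched when the table has no lowercase form.
            return isset($map[$code]) ? '&#' . $map[$code] . ';' : $m[0];
        },
        $text
    );
}

echo sketch_entity_strtolower('&#202; &#1064; &#235;', $sketch_map), "\n";
// prints: &#234; &#1096; &#235;
```

Because the table is keyed by code point, each entity costs one isset() check, so the run time should grow with the number of entities in the string rather than with the size of the table.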
Here's the full code: http://pastebin.com/fbd61bb7
Example of use:
Code:
echo utf8_strtolower('&#202; &#1064;');
// will echo &#234; &#1096;
I ran some tests: creating the array (with utf8_strtox_init()) takes about 0.002023 seconds (average over 50 runs).
Additionally, I created a random string of 250 uppercase HTML entities, and PHP took an average of 0.002347 seconds (over 50 runs) to replace them with lowercase entities. So, in theory, converting a 250-entity string, including building the array, takes roughly 0.005 seconds.
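For reference, averages like these can be taken with a small timing loop. This is only a sketch of one way to measure: sketch_average_seconds is a made-up name, and the str_replace call is a stand-in workload, not the function under test.

```php
<?php
// Hypothetical helper (assumed name): mean wall-clock time of $fn over $runs calls.
function sketch_average_seconds($fn, $runs = 50)
{
    $total = 0.0;
    for ($i = 0; $i < $runs; $i++) {
        $start = microtime(true);            // current time as float seconds
        call_user_func($fn);
        $total += microtime(true) - $start;  // elapsed time for this run
    }
    return $total / $runs;
}

// Stand-in input, assumed for illustration: 250 uppercase entities.
$input = str_repeat('&#203;', 250);

$avg = sketch_average_seconds(function () use ($input) {
    // Swap this body for the call being measured, e.g. utf8_strtolower($input).
    str_replace('&#203;', '&#235;', $input);
});

printf("average over 50 runs: %.6f s\n", $avg);
```

Timings this small vary a lot between runs and machines, so the figures are best read as orders of magnitude.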
I'm not very knowledgeable about PHP efficiency or what eats up its memory, and I intend to implement these functions in a popular CMS script. Can someone tell me if there is anything I should be aware of, anything I should change, etc.?
Thank you. :)