Posted: Tue Aug 29, 2006 6:37 am
by Ambush Commander
Opcode caches only speed up parse times: the engine still has to execute that code and build an in-memory representation of the lookup table. I'm going off observations by MediaWiki developers that unserialize is extremely fast.

Posted: Tue Aug 29, 2006 6:48 am
by wei
The usual bottlenecks are disk I/O and network delays, so a large performance gain can be had by looking at these two problems first. Tools such as ptrace and strace are very useful for finding where the bottlenecks might be.

http://www.schlossnagle.org/~george/tal ... e%20pdf%22

There was a PDF slide deck using ptrace a few weeks back, but I lost the link.

Posted: Tue Aug 29, 2006 10:09 am
by Weirdan
Well, you could convert your string from the source to the target charset using the //IGNORE option, then convert it back and look for differences. Every character missing from the double-converted string should be encoded as an HTML entity.

Not an elegant solution, of course :) I would prefer to be able to set a //TRANSLIT callback.
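
A rough sketch of the round-trip idea, written in Python rather than PHP so it is easy to run on its own. Python's errors="ignore" stands in for iconv's //IGNORE here, and the function name encode_with_entities is made up for illustration. It assumes the target charset is ASCII-compatible and stateless, so that the entity text itself can always be encoded:

```python
def encode_with_entities(text, target):
    """Encode text into the target charset, using numeric HTML entities
    for every character the target cannot represent."""
    # Step 1: convert to the target charset, silently dropping anything
    # it cannot represent (the analogue of iconv's //IGNORE).
    lossy = text.encode(target, errors="ignore")
    # Step 2: convert back to get the characters that survived.
    survived = lossy.decode(target)

    # Step 3: walk both strings in parallel. The survivors are an in-order
    # subsequence of the original, so a mismatch means the original
    # character was dropped and needs an entity.
    out = []
    j = 0
    for ch in text:
        if j < len(survived) and survived[j] == ch:
            out.append(ch)
            j += 1
        else:
            out.append("&#%d;" % ord(ch))

    # The entities are plain ASCII, so this final encode cannot fail for
    # an ASCII-compatible target.
    return "".join(out).encode(target)


if __name__ == "__main__":
    print(encode_with_entities("caf\u00e9 \u20ac5", "latin-1"))
```

In Python specifically, the built-in errors="xmlcharrefreplace" handler does the same job in one call; the long-hand version above is only there to show the convert, convert back, and compare steps.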

Posted: Tue Aug 29, 2006 1:19 pm
by Ollie Saunders
Those of you who think you know about performance might be able to help Astions.

Posted: Tue Aug 29, 2006 1:37 pm
by Ambush Commander
Actually, Weirdan, that's quite an interesting solution. It removes the need to build lookup tables, although you still need to be able to parse the UTF-8 to figure out precisely what to put into the HTML entity.

Posted: Tue Aug 29, 2006 8:25 pm
by Ambush Commander
I did more thinking about the technique, and it only works nicely for fixed-length encodings (especially 8-bit ASCII-compatible ones). For everything else, you have to implement a character gobbler (I'm sure that's not the term for it) for each encoding you want to support, which is almost as bad as having to set up lookup tables.
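
To make "character gobbler" concrete, here is roughly what I mean for UTF-8, sketched in Python for brevity. The name gobble_utf8 is just illustrative. It reads one character starting at a byte offset and returns its code point plus the number of bytes consumed. This is a minimal version: it checks lead and continuation bytes but does not reject every overlong form or surrogate code points.

```python
from typing import Tuple


def gobble_utf8(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (code_point, byte_length) for the UTF-8 character at data[pos]."""
    lead = data[pos]
    if lead < 0x80:
        # Single-byte ASCII character.
        return lead, 1
    if 0xC2 <= lead <= 0xDF:
        length, cp = 2, lead & 0x1F
    elif 0xE0 <= lead <= 0xEF:
        length, cp = 3, lead & 0x0F
    elif 0xF0 <= lead <= 0xF4:
        length, cp = 4, lead & 0x07
    else:
        raise ValueError("invalid UTF-8 lead byte at offset %d" % pos)

    tail = data[pos + 1:pos + length]
    if len(tail) != length - 1:
        raise ValueError("truncated UTF-8 sequence at offset %d" % pos)
    for b in tail:
        # Every continuation byte must look like 10xxxxxx.
        if b & 0xC0 != 0x80:
            raise ValueError("invalid continuation byte at offset %d" % pos)
        cp = (cp << 6) | (b & 0x3F)
    return cp, length


if __name__ == "__main__":
    raw = "\u20ac".encode("utf-8")
    cp, n = gobble_utf8(raw)
    print("&#%d; consumed %d bytes" % (cp, n))
```

Every other variable-length encoding would need its own version of this function, which is the part that bothers me.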

Posted: Wed Aug 30, 2006 2:29 am
by Weirdan
Usually there's one 'main' encoding with the widest character range, with multiple input and output encodings (handled via iconv or something). So only one 'gobbler' is needed (still not sure what you meant by that).

Posted: Wed Aug 30, 2006 1:21 pm
by Ambush Commander
You're right, brainfart. :-P Actually, it's a pretty elegant solution. I think I'll do that.