Performance of PHP4 objects vs procedural and C++.
Posted: Fri Oct 07, 2005 6:29 am
by Heavy
I wrote a very cool parser in PHP4 that can take "any" kind of textual syntax definition and test a document for an exact match. Then I translate the results into PHP code and do whatever I like with it. I find it useful not just for template engine stuff with fancy expression support, but also for producing complete scripting languages, which helps when designing a complex CMS, for example (which is exactly how I use it).
However, the bastard is awfully SLOW. It can take up to 10 seconds to process just one request if the document is longer than about 50 lines.
I realize that my code might be full of bloat, although it is actually clean enough to be hard to optimize further without drastic changes to the code design.
So I am wondering if objects in PHP are really slow, and I will benefit from rewriting it into a big procedural hack with a large associative array as the data container in place of the objects I use today - or if it is a dead end project and I should consider porting this to C++ and make it an extension to PHP and install it as a special library on the server?
It isn't very much data I am handling, and I am having a hard time spotting any particular bottleneck in the code. I have run it through some profiling as well, but it is a recursive code design, which makes it quite difficult to understand what the results mean.
What do you say? Are PHP4 objects significantly slower than their procedural counterparts, or do I need to abandon PHP to speed things up?
Posted: Fri Oct 07, 2005 7:57 am
by Heavy
I guess I could stage a test case with a similar process, A being OOP and B being procedural, and try to measure the differences. Anyone else already done this?
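A minimal sketch of what such a test case might look like, assuming a trivial workload (counting token names) in place of the real parser. The names TokenCounter, count_token and oop_bench_now, as well as the iteration count, are made up for illustration; the timing helper parses the string form of microtime() so it should also work on PHP4.

```php
<?php
// Case A: the work goes through an object method.
class TokenCounter
{
    var $counts = array();   // PHP4-style property declaration

    function add($name)
    {
        if (!isset($this->counts[$name])) {
            $this->counts[$name] = 0;
        }
        $this->counts[$name]++;
    }
}

// Case B: the same work as a plain function on an array passed by reference.
function count_token(&$counts, $name)
{
    if (!isset($counts[$name])) {
        $counts[$name] = 0;
    }
    $counts[$name]++;
}

// Current time in seconds as a float, from microtime()'s "usec sec" string.
function oop_bench_now()
{
    list($usec, $sec) = explode(' ', microtime());
    return (float)$usec + (float)$sec;
}

$names = array('tag', 'attr', 'text', 'expr');   // hypothetical token names
$iterations = 100000;

$start = oop_bench_now();
$counter = new TokenCounter();
for ($i = 0; $i < $iterations; $i++) {
    $counter->add($names[$i % 4]);
}
$timeOop = oop_bench_now() - $start;

$start = oop_bench_now();
$counts = array();
for ($i = 0; $i < $iterations; $i++) {
    count_token($counts, $names[$i % 4]);
}
$timeProc = oop_bench_now() - $start;

printf("OOP: %.4f s, procedural: %.4f s\n", $timeOop, $timeProc);
```

A workload this small only measures call overhead, so the numbers say little about a recursive parser; it is a starting point for a fairer A/B comparison, not a verdict.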
One thing I am suspicious about: I cannot figure out whether it is the fine granularity of the results that makes things slow. Each resulting document token is one element in an associative array that gets quite lengthy; 1000 entries isn't unusual for a normal XML-based template. Maybe it is this array that gets nasty when it's large. (And the process involves quite a few array_merge() operations.)
A linked list must be iterated through for element X to be found, making the lookup very slow if the list is long. The access time to the "last minus one" element grows in proportion to the length of the list. (I didn't look that up in any book right now, so if it isn't exactly true, just read my point.)
How are the access times for a PHP associative array? I have always assumed they are quite fast, and haven't faced the question until now.
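PHP arrays are implemented as hash tables, so a lookup by key does not walk the array the way a linked-list search would. The following is a rough sketch for checking that on a given installation: it compares a keyed lookup with a front-to-back scan for the "last minus one" element of 1000 entries. The key names ("token0", ...), the sizes and the helper names linear_find and lookup_now are invented for illustration.

```php
<?php
$size = 1000;
$byKey = array();    // associative: token name => position
$asList = array();   // plain list of token names
for ($i = 0; $i < $size; $i++) {
    $byKey['token' . $i] = $i;
    $asList[] = 'token' . $i;
}

// Walks the list front to back, the way a linked-list search would.
function linear_find($list, $needle)
{
    foreach ($list as $pos => $value) {
        if ($value === $needle) {
            return $pos;
        }
    }
    return -1;
}

// Current time in seconds as a float, from microtime()'s "usec sec" string.
function lookup_now()
{
    list($usec, $sec) = explode(' ', microtime());
    return (float)$usec + (float)$sec;
}

$needle = 'token' . ($size - 2);   // the "last minus one" element
$rounds = 2000;

$start = lookup_now();
for ($r = 0; $r < $rounds; $r++) {
    $hit = $byKey[$needle];        // direct lookup by key
}
$timeKeyed = lookup_now() - $start;

$start = lookup_now();
for ($r = 0; $r < $rounds; $r++) {
    $found = linear_find($asList, $needle);   // scan until the value matches
}
$timeScan = lookup_now() - $start;

printf("keyed: %.4f s, scan: %.4f s\n", $timeKeyed, $timeScan);
```

If the keyed lookup stays roughly flat while $size grows and the scan does not, array access by key is unlikely to be the bottleneck.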
Posted: Fri Oct 07, 2005 8:23 am
by feyd
php4's object handling is insanely loose. Execution speeds in php generally aren't the best, but it makes up for it in flexibility. If you used a script preparser to cache and condense your code it will execute faster, but only so much really..
Comparing execution times between php and C is probably futile; C will likely win hands down every time, by far.. Something you could research into is creating a Zend API extension of your code if you are unable to optimize it further.
Using a string parser as opposed to regex would often speed things more, if you aren't already.
Posted: Fri Oct 07, 2005 8:31 am
by Weirdan
There were bugs in the handling of large arrays; dunno if they were fixed or how many remain unnoticed. Usually people don't use large arrays in php, you know...
Here's some relevant info:
http://marc.theaimsgroup.com/?l=php-dev ... 130231&w=2
http://bugs.php.net/bug.php?id=19499
Posted: Fri Oct 07, 2005 8:36 am
by Heavy
feyd:
According to profiling it isn't the preg_matching that is time consuming.
And the times I have measured are within the script. BTW, the relation between script parsing time and script execution time is so great that compiled state caching is quite insignificant.
I am not comparing PHP performance to C performance. I wonder if the OOPing in PHP is the lag or if it would be just as slow if I convert it to procedural PHP instead. If it looks like a dead end, I probably will try to investigate the efforts required to port the parser into a php extension.
Posted: Fri Oct 07, 2005 9:01 am
by Buddha443556
PHP usually executes OO or procedural code just as fast. Sounds like you're doing two things PHP isn't good at: large arrays and recursion. Don't screw up the nice OO code, go with C++ and make an extension. Just my 2 cents.
Posted: Fri Oct 07, 2005 9:04 am
by Heavy
Weirdan wrote: Usually people don't use large arrays in php, you know...
Don't they?

Well, this is a parser. I need to store the data somewhere along the way to the result.
Thanks. I am running a linux laptop so I don't know if all the points made matter. (I could of course run the tests myself. But the day ends just about now so... I feel like doing something else.)
Alright. A giant super cool universal ultra flexible parser may be beyond what PHP is supposed to handle efficiently.
Does anyone know if the array backend has improved with Zend Engine 2? They do mention that scripts run a little faster. If the "little faster" comes from optimization of the exact issue I am having, then maybe it'll speed things up significantly.
What I am really doing too often:
* Firing off new subparsers to look ahead in the doc = instantiating new parser objects.
* Appending results on success to the existing $arrDoc (which is a two-dimensional array [1]).
So it might be that the very two things PHP has been unnecessarily slow at have improved (greatly) in PHP5. But I don't have access to PHP5 just yet. I am running Gentoo and the PHP5 ebuilds are not released as stable yet. Oh! The overlay site [2] is up. Maybe I can get PHP5 to work for me tonight!
[1] it holds stuff like name of the token, actual chunk string, syntax depth, etc., etc.
[2] http://php.portage-overlays.org/
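A rough sketch of how the append step could be timed in isolation, in case the array_merge() calls are part of the cost. array_merge() returns a new array on every call, so growing $arrDoc with it in a loop copies the existing entries again and again, whereas appending with [] modifies the array in place. The token fields (name, chunk, depth) follow footnote [1]; the values, the batch sizes and the helper names make_tokens and append_now are invented for illustration.

```php
<?php
// Builds a small batch of token rows, standing in for one subparser's result.
function make_tokens($from, $howMany)
{
    $tokens = array();
    for ($i = $from; $i < $from + $howMany; $i++) {
        $tokens[] = array('name' => 'tok' . $i, 'chunk' => 'x', 'depth' => 1);
    }
    return $tokens;
}

// Current time in seconds as a float, from microtime()'s "usec sec" string.
function append_now()
{
    list($usec, $sec) = explode(' ', microtime());
    return (float)$usec + (float)$sec;
}

$batches = 500;    // hypothetical number of successful subparser runs
$perBatch = 4;     // hypothetical number of tokens per run

// Variant 1: array_merge() builds a fresh array on every call.
$start = append_now();
$arrDocMerged = array();
for ($b = 0; $b < $batches; $b++) {
    $arrDocMerged = array_merge($arrDocMerged, make_tokens($b * $perBatch, $perBatch));
}
$timeMerge = append_now() - $start;

// Variant 2: append each token to the existing array in place.
$start = append_now();
$arrDocAppended = array();
for ($b = 0; $b < $batches; $b++) {
    foreach (make_tokens($b * $perBatch, $perBatch) as $token) {
        $arrDocAppended[] = $token;
    }
}
$timeAppend = append_now() - $start;

printf("array_merge: %.4f s, append: %.4f s\n", $timeMerge, $timeAppend);
```

Both variants produce the same list here because the keys are sequential integers; with string keys array_merge() overwrites duplicates, so the two are not interchangeable in general.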
Posted: Fri Oct 07, 2005 9:21 am
by Heavy
Buddha443556 wrote: PHP usually executes OO or procedural code just as fast. Sounds like you're doing two things PHP isn't good at: large arrays and recursion. Don't screw up the nice OO code, go with C++ and make an extension. Just my 2 cents.
You know, I was in the process of changing some stuff in my house back around Easter this year. It is only a temporary solution because I actually need to make some bigger changes before any lady would find it pleasant... So I was in the middle of doing this first temporary thing when a friend said:
- Well... Why not make the big changes immediately instead of building this now and tearing it away next year?
And that was a short snappy line that made me totally flip over. Just like that, I stopped working on what I was doing and started planning "THA BIG CHANGE" instead. Of course it all crossed all deadlines you could imagine and even today (6 warm healthy months later) I haven't produced any clearly visible results. But I really DO have a nice paper to show off with stating how great it will be... next year... maybe... If I can focus on the problem...
What you just said felt just like that, and this is my weakest spot, I tell you. You said what the programming perfectionist inside me wanted to hear and "BAM" I want to do it.
Hahaha... weeell weeeell... I'd better count to 2048 and calm down. I hate the effects of this kind of inspiration. (Like pain in the back and pale skin.)
Posted: Fri Oct 07, 2005 9:54 am
by Buddha443556
Heavy wrote: What you just said felt just like that, and this is my weakest spot, I tell you. You said what the programming perfectionist inside me wanted to hear and "BAM" I want to do it.
Hahaha... weeell weeeell... I'd better count to 2048 and calm down. I hate the effects of this kind of inspiration. (Like pain in the back and pale skin.)
LOL. What do you want for 2 cents?
Heavy wrote: Like pain in the back and pale skin
Nope, not an easy solution, and it could be a real pain in the ***.
Would be nice if they fixed this in PHP5.
Posted: Fri Oct 07, 2005 10:41 am
by Heavy
2046 ... 2047 ... 2048
I think I'll try the effects of running php5 first.
Posted: Fri Oct 07, 2005 10:51 am
by feyd
may want to try out 5.1 as well.. with the really fast call mode.. that may help a bit.

Posted: Fri Oct 07, 2005 11:23 am
by Heavy
Posted: Fri Oct 07, 2005 11:40 am
by McGruff
Ten seconds does sound a bit long, but I don't know exactly what you're doing or what size of files you're working with. If you'd like to compare and contrast another parser implementation, check out SimpleTest (simpletest/parser.php).
Posted: Fri Oct 07, 2005 11:36 pm
by Christopher
Speed is usually more about algorithm choice than anything else. We can't tell from what you have written why your script is slow so can't really give you much advice regarding choice of paradigm or PHP version. I doubt that it is OOP or the PHP version that is causing it to be slow, unless you are doing something repeatedly that just happens to be slow because of the parser or some library. Show some code.
Posted: Sat Oct 08, 2005 4:21 am
by Heavy
My parser is quite general.
Because of this I need to do lots of checking and data juggling. I've been wondering whether I should optimize it with the shotgun or the scalpel, i.e. scrap the design or just find small pieces that could be improved.
So I accept that my design might be the error. I am however reluctant to start over.
I comment sparingly. I usually only put comments in the code to describe the bigger thoughts, as my naming is quite descriptive. (I think.)
The code of the main file is 700 lines, full of debugging statements and I am not sure the comments are up to date.
http://widarsson.com/stuff/class.codeparser.php.txt
There are two classes. You need to scroll down to the last page to see the second.