That sounds like a process that would be slow no matter what language you used. And no matter what language you used, I would recommend that you find a library or program specifically for that kind of drawing/plotting. No language that I know of does graphics without external libraries or applications.

cj5 wrote: My issue arose with two scripts I was working on. One script I am currently working on is a shapefile parser. Now, if you are not familiar with shapefiles, they are typically very bulky, containing a huge amount of data (e.g. a map of the US and all its states includes all of the coordinate mappings for each state, in order to plot its shape). In most cases the script will run long but not time out, even with streamlined code in place. My worry is that when I use GD to draw the shapes based on the mappings, the script will become even more load intensive and less practical.
Coding for speed, and dodging timeouts
- Christopher
- Site Administrator
- Posts: 13596
- Joined: Wed Aug 25, 2004 7:54 pm
- Location: New York, NY, US
alex.barylski
- DevNet Evangelist
- Posts: 6267
- Joined: Tue Dec 21, 2004 5:00 pm
- Location: Winnipeg
He is using an external library... namely GD.

Christopher wrote: And no matter what language you used, I would recommend that you find a library or program specifically for that kind of drawing/plotting.
Didn't BASIC have built-in support for drawing? Albeit a very limited drawing canvas, but still...

Christopher wrote: No language that I know of does graphics without external libraries or applications.
Cheers
- Christopher
alex.barylski
In that case, I'm confused by what you mean by plotting.

Quote: I believe from his comments that the time consuming part is plotting his mapping data to give to GD. Although the GD rendering part may also be time consuming.
If he has a file full of coordinates, they are already plotted and just need to be rendered... unless, of course, he's planning on representing the coordinates in a fashion other than how they are stored. A 3D graph or something?
In which case simple viewport clipping should do the trick and make a huge difference...no need for any third party libraries, etc...
Of course this assumes he's using 2D data and not 3D data, in which case a third party library would likely speed up development time substantially...and a third party extension would be a great performance boost
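As a rough sketch of the viewport-clipping idea (the bounding-box values and the function name here are made up for illustration):

```php
<?php
// Keep only the coordinates that fall inside the viewport before
// handing anything to GD. The viewport bounds are arbitrary examples.
function clipToViewport(array $points, float $minX, float $minY, float $maxX, float $maxY): array
{
    $visible = [];
    foreach ($points as $p) {
        [$x, $y] = $p;
        if ($x >= $minX && $x <= $maxX && $y >= $minY && $y <= $maxY) {
            $visible[] = $p;
        }
    }
    return $visible;
}

$points  = [[-120.5, 35.2], [10.0, 10.0], [200.0, 99.9]];
$visible = clipToViewport($points, -130.0, 30.0, 50.0, 60.0);
// Only the first point lies inside the box.
```

Anything outside the box never reaches GD, so the drawing cost scales with the visible data rather than with the whole shapefile.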
P.S. Note to the author: you could also use something like http://www.flashmaps.com/ so everything is rendered on the client side. From my understanding, you basically provide these Flash map generators an XML file or CSV with your coordinate data in a format the Flash understands, and it renders your data in real time. Not sure if the one I linked to does that or requires generation via the server side, but in any case, I bet there are maps which will render client side... so if performance becomes an issue, just look into those or search Google.
Cheers
Hey there,
The points that Hockey mentioned about the rendering of the maps are very important for that stage of the process. I should hope that the page that is used to view the rendered map doesn't re-render on every hit. I would build the cached version offline and update when it's required. Caching is the single biggest factor in increasing your web performance. Cache everything you can.
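A minimal sketch of that offline-cache idea, using GD's real image functions; the file names and the cacheIsStale() helper are hypothetical:

```php
<?php
// Re-render the map with GD only when the cached PNG is missing or
// older than the source data; otherwise serve the cached file as-is.
function cacheIsStale(string $cached, string $source): bool
{
    if (!is_file($cached)) {
        return true;                           // never rendered yet
    }
    return is_file($source) && filemtime($source) > filemtime($cached);
}

$source = 'us_states.shp';                     // hypothetical source data
$cached = 'us_map.png';                        // hypothetical cache file

if (cacheIsStale($cached, $source) && extension_loaded('gd')) {
    $img = imagecreatetruecolor(400, 300);
    imagefill($img, 0, 0, imagecolorallocate($img, 255, 255, 255));
    // ... draw the state polygons here with imagepolygon()/imagefilledpolygon() ...
    imagepng($img, $cached);                   // write the cache for later hits
    imagedestroy($img);
}
// Later hits just stream the file:
//   header('Content-Type: image/png'); readfile($cached);
```

Every request after the first is just a file read; the expensive GD work happens only when the source data actually changes.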
As for PHP being interpreted and being "slow", you have to make that judgement call yourself. The whole point of an interpreted language is that in order to see your changes, you simply save and refresh. You don't need to compile/link or any of that junk. Development speed is more valuable than processing speed to a PHP developer.

Also remember that PHP 5.1 is about 10% faster than a 4.0.x version. If you have some cash (or can bill the client) get the Zend optimizer - it half compiles your text PHP scripts into a bytecode similar to what Java does. You can get another 10% there. Finally, if you really want to get raw blazing power think about writing an extension in C. That way you can call it in your scripts as before, but have plenty of raw speed.
As for why count() isn't optimized: loops are optimized based on the ability to unwind a loop into a linear structure.

Code: Select all
$blah = 0;
for ($i = 0; $i < 4; $i++) {
    $blah = $i;
}

gets unwound as (though a real compiler would do this in assembly, not PHP):

Code: Select all
$blah = 0;
$blah = 1;
$blah = 2;
$blah = 3;

Some really good compilers would see that the whole loop was pointless and just assign 3 to $blah. If you embed the count() in the for definition, it HAS to call that function every iteration.

Optimization is hard without knowing the underlying structure.
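Since the interpreter won't hoist the call for you, the usual hand-optimization is to cache count() in a variable yourself. A small sketch:

```php
<?php
// count() in the for definition is evaluated on every iteration;
// caching it in a variable calls it exactly once.
$arr = range(1, 1000);

// Re-evaluated each pass:
$sum = 0;
for ($i = 0; $i < count($arr); $i++) {
    $sum += $arr[$i];
}

// Hoisted by hand -- one count() call total:
$sum2 = 0;
$n = count($arr);
for ($i = 0; $i < $n; $i++) {
    $sum2 += $arr[$i];
}
// Both loops produce the same sum; only the number of count() calls differs.
```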
I was just wondering how this optimizes? Do you have a code example that performs some empirical testing?

alex.barylski wrote: An optimization which I discovered... that also deals with arrays. I personally found that setting an array element to FALSE instead of unset(), and then calling:

Code: Select all
array_values(array_filter($arr))

The above call recalculates the indices and removes all elements which are FALSE... in the reverse order, but still.
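A quick sketch of the trick, with one caveat: called with no callback, array_filter() drops *every* falsy value (0, '', and null as well), so a strict callback is safer if your data can contain those:

```php
<?php
// Mark dead slots FALSE, then compact and reindex once at the end.
$arr = ['a', 'b', 'c', 'd'];
$arr[1] = FALSE;                       // "delete" without unset()
$arr[3] = FALSE;

$compact = array_values(array_filter($arr));
// $compact is ['a', 'c'] with fresh indices 0 and 1.

// Caveat: with no callback, array_filter() also drops 0, '', null, '0'.
// A strict callback removes only the FALSE markers:
$safe = array_values(array_filter($arr, function ($v) { return $v !== FALSE; }));
```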
alex.barylski
Touché!

Quote: Development speed is more valuable than processing speed to a PHP developer.
While this is true in compiled languages, I'm not sure PHP itself would perform this kind of optimization analysis, as it would cause more harm than good... perhaps Zend Optimizer does this when compiling to byte code, but I can't see the PHP interpreter doing this.

Quote: As for why count() isn't optimized. Loops are optimized based on the ability to unwind a loop into a linear structure.
While I agree a good compiler would likely do this, I would argue that any interpreted language wouldn't go that far, and would likely just interpret code as is, leaving architecture-specific optimizations to the programmer.

Quote: Some really good compilers would see that the whole loop was pointless and just assign 3 to $blah. If you embed the count() in the for definition, it HAS to call that function every iteration.
When dealing with compiled languages there are many optimizations which a compiler can and will take care of, such as loop unwinding or using shifts in place of normal DIV instructions... but these are often ignored in interpreted environments; because they are machine agnostic, optimizations like this *might* not make sense.
Here is an article I wrote on the subject:
http://www.codeproject.com/cpp/profiler.asp#xx1444259xx
Ignore the comments by people informing me that a good compiler would perform these optimizations automatically, as they clearly didn't get the *point* of the article
It was intended to be an overall introduction into optimizing, etc...obviously I was aware of compilers abilities to auto-optimize
Anyways, many of those principles could likely be applied to interpreted languages, as again I would argue that interpreters likely don't optimize too much, since it would require a fair amount of code analysis... best left for a compilation step, not interpretation.
The point is, I agree count() performs poorly, but it's not because PHP doesn't unwind loops (clearly it never does) but rather because every iteration of a loop must execute its expressions, in this case count().
After some thought, I've come to the conclusion this is impossible to optimize at the interpreter level because there is no way of telling whether the programmer really meant to cache the value of count().
What if, inside the loop, elements are added to the array count() is counting? That value must change, so trivial caching simply wouldn't work.
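A small demonstration of why a cached value would be wrong here: the loop bound genuinely changes mid-loop, and the condition must see it.

```php
<?php
// count() in the condition is re-evaluated each pass, so it sees
// elements appended during the loop; a cached bound would not.
$arr = ['a', 'b'];
$passes = 0;
for ($i = 0; $i < count($arr); $i++) {
    $passes++;
    if ($i === 0) {
        $arr[] = 'c';   // append during the first pass
    }
}
// The loop ran 3 times, not 2, because count() saw the new element.
```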
That's the reason I'm willing to bet the PHP developers haven't bothered to optimize it at all.
Not sure what you meant here, but I assume you mean system architecture, in which case it would be impossible for PHP to perform low-level optimizations, as it's capable of running on *almost* any computer on the planet, from mainframes to the 286 in my basement.

Quote: Optimization is hard without knowing the underlying structure.
I'm sure you're tired of hearing me talk, so I'll make this as brief as possible.

Quote: I was just wondering how this optimizes? Do you have a code example that performs some empirical testing?
Feyd and I had a discussion a while back about how PHP allocates memory, etc.

From that discussion and doing some research I concluded that:
1) unset() deallocates the memory previously used, so other parts of the script can now use that memory. How this de-allocation works is beyond me as I was not interested in reading the PHP C source and learning first hand.
From what I read, it seemed people were having positive experiences calling unset() on large arrays *before* functions returned... however this was determined (from what I could tell) by people using the Windows resource monitor to observe memory usage.
They were saying that by calling unset() before returns, resource monitor indicated reduced memory consumption...
This suggests that unset() is calling the system's underlying malloc/free functions.
However, I am also under the impression that PHP likely uses its own memory allocation functions, as PHP has built-in garbage collection facilities. So I wonder whether PHP's unset() might call its own zend_free()-type function, which (to my understanding) doesn't actually free memory so that anything else on the system can use it, but rather marks it as available space only under the context of the currently executing script.
I'm starting to lose myself here, as this topic is rather complex, especially as no one but the PHP developers really knows what's going on with memory allocation, etc.
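One way to observe this from inside the script, rather than through an OS-level monitor, is PHP's memory_get_usage(), which reports what the engine's allocator has handed to the script:

```php
<?php
// Watch the engine-level accounting around a large allocation and unset().
$before = memory_get_usage();
$big    = range(1, 100000);            // allocate a large array
$peak   = memory_get_usage();
unset($big);                           // release it back to the engine
$after  = memory_get_usage();

// $peak is well above $before, and $after drops back down after unset().
```

Note this only shows the engine-level bookkeeping; whether the OS itself gets the pages back is up to the underlying allocator.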
Anyways, my reasoning as to why calling array_filter() instead of unset() is faster goes like this, keeping in mind what I said above:
unset() does actually free memory...as in...the system can use it, not just the script...
Freeing memory requires re-adjusting memory descriptor tables, etc., which is a potentially intensive process... but then the rest of the system can use it, which is generally a good idea... but not always.
Assigning an element to FALSE requires very little outside of an assignment operation... it's not likely freeing memory under PHP's context or under the system's context... it's only marking it as potentially free, since you can change FALSE back to something else if desired... (I assume)
array_filter(), however, frees the memory, at least under the context of PHP; not sure if it's system wide like I assume unset() is.
So...
In a loop, if you are calling unset() every iteration and it is indeed freeing memory system wide, that is taxing on the system...
Even if it is freeing it only under the context of PHP, that's going to require more management than a trivial assignment.
Marking an element as FALSE and then, at the end of script execution, calling a single array_filter() is likely going to be much faster; it just makes sense from a technical standpoint... so I likely wasn't daydreaming when I profiled that code and noticed the difference.
All databases that I know of use this flag-and-continue technique whenever they can, as it's much faster than removing a record from disk and recalculating offsets and indices, etc.
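Here is a crude timing sketch of the two strategies, in the spirit of the empirical test asked for above. Absolute numbers will vary by machine and PHP version, so treat it only as a profiling starting point, not proof of which is faster:

```php
<?php
// Compare: unset() every other element as we go, versus marking FALSE
// and compacting once at the end. Both end with the same elements.
$n = 50000;

// Strategy 1: unset() inside the loop, then reindex the holes.
$a = range(1, $n);
$t0 = microtime(true);
for ($i = 0; $i < $n; $i++) {
    if ($i % 2 === 0) {
        unset($a[$i]);
    }
}
$a = array_values($a);
$unsetTime = microtime(true) - $t0;

// Strategy 2: mark FALSE, then one array_filter()/array_values() pass.
$b = range(1, $n);
$t0 = microtime(true);
for ($i = 0; $i < $n; $i++) {
    if ($i % 2 === 0) {
        $b[$i] = FALSE;
    }
}
$b = array_values(array_filter($b, function ($v) { return $v !== FALSE; }));
$filterTime = microtime(true) - $t0;

// $unsetTime and $filterTime can now be compared on your own machine.
```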
You can consider array_filter() the likely faster equivalent of doing this:

Code: Select all
$cnt = count($arr);
for ($i = 0; $i < $cnt; $i++) {
    if ($arr[$i] === FALSE) {
        unset($arr[$i]);
    }
}

(Note that this loop leaves holes in the keys, which is why the array_filter() version also needs array_values() to reindex.)

This is my take on the subject anyways.
Cheers