Page 1 of 1

better a / better b

Posted: Mon Feb 20, 2006 3:40 pm
by llfitness
feyd | Please use

Code: Select all

and

Code: Select all

tags where appropriate when posting code. Read:  [url=http://forums.devnetwork.net/viewtopic.php?t=21171]Posting Code in the Forums[/url][/color]


I've written a shopping system for our site, with a focus on search engine optimization and templates. Because I read that functions are faster than classes, and no functions is faster (from a compile standpoint) than functions, I wrote pretty much the whole thing with all but a few functions. I've decided that in some cases I absolutely should have used functions when I didn't, and that there are some other places where I could have written faster code, so I'm going back through.

In the example here, I'm trying to add an artifically-generated URL to an array. The array will be passed to a function called array2xml which will, duh, turn the array in to XML. I wasn't quite sure whether it would be faster to have MySQL generate the artificial URL, or to go back through and add it in... so I wrote a script to test this. The results were totally confusing.

In the script below, there is a base case (no URL added), then option 1, then option 2.

I tested the code with the MySQL generation as option 1 and using PHP to add it to the array in option 2. Then I switch options 1 and 2. No matter which one was in option 1, option 1 was always faster. The code to run the timer seems fool proof. What's the deal? Which one should be faster?

The times I got from my program are below. I'm guessing that the first recorded time is a fluke event.

Any comments on (a) which should be faster / better or (b) why this test isn't working as I would expect, ie. with a clear winner, would be appreciated.

Code: Select all

<?PHP
require_once '../environment.php';
require_once '../sessions.php';

// We are testing to see which of two options is faster:
//
// option 1: have MySQL create the URL string and place it appropriately
// option 2: go back through the array returned by MySQL and add the URL.
// 
// Suspect Option 1 will be faster. Also, test a base case to see how much time
// it takes MySQL to generate the URLs.
//
// First Run:
// basecase: 0.0177 s
// option 1: 0.0238 s
// option 2: 2.6358 s
//
// Second Trial:
// basecase: 0.0176
// option 1: 0.0238
// option 2: 0.0323
//
// I ran several more tests. I didn't get any more results above .04 seconds. 
// Now swapping option 1 and 2 just to verify.
//
// typical results:
// basecase: 0.0178
// option 1: 0.0279
// option 2: 0.0305
// 
// why? totally confusing.
//
// Now trying without the foreach of option 1, previously option 2.
// Essentially, basecase and option 1 will be the same.
//
// basecase: 0.0171
// option 1: 0.0179
// option 2: 0.0254
//
// in several trials, option 1 always took slightly longer than basecase, which 
// isn't necessarily the expected result. You would think that once in a while, 
// since they are the same query, option 1 would be faster.
//
// .. 
// removed an "echo" statement. This seemed to reduce the problem. It took about .008
// seconds (guess) off the time of option 1... now o1 is faster than basecase 1 in 5 times.
//
// SWITCHING o1 and o2 again.
//
// Since o2 doesn't have the array manipulation, you would expect it to be around
// the same speed as the base case, faster 1/2 of the time, slower 1/2 of the time, and
// o1 to be slightly slower than both every time, since it has more work to do.
//
// 0.0180
// 0.0235
// 0.0185
//
// 0.0180
// 0.0226
// 0.0175
//
// OK now adding back in the array manipulation, what happens? Also, I pulled an
// 'as name' out of the basecase and o1, since it wasn't in o2.
//
// 0.0176
// 0.0238
// 0.0326
//
// 0.0177
// 0.0239
// 0.0321
//
// OK o2 is clearly longer, as we would expect. Switch o1 and o2
//
// 0.0181
// 0.0281
// 0.0308
//
// Having checked it twice, I'm sure the math on those timers is right. Changing
// some variable names to ... try to get some consistancy.

$timer = microtime();
$timer = explode(" ", $timer);
$begin_basecase = $timer[1] + $timer[0];

$this_configuration_sql = "
	SELECT
		configuration_name, tagline, configuration_code, 
		configuration.meta_keywords, configuration.meta_description, 
		short_description, long_description, popularity, 
		configuration.keywords, configuration.image_id, descriptor, manufacturer.manufacturer_id, requires_bosa, 
		is_consumable, manufacturer_name, manufacturer_description, 
		is_hidden_price, price_display_format, status
	FROM 
		configuration, manufacturer, product
	WHERE 
		product.product_id = configuration.product_id AND 
		manufacturer.manufacturer_id = product.manufacturer_id
	LIMIT 200";

$this_configuration_info = &db_call($this_configuration_sql, SQL_RESULT_SET);

$time = microtime();
$time = explode(" ", $time);
$begin_step1 = $time[1] + $time[0];

$this_configuration_sql = "
	SELECT
		configuration.product_id, configuration.configuration_id,
		configuration_name, tagline, configuration_code, 
		configuration.meta_keywords, configuration.meta_description, 
		short_description, long_description, popularity, 
		configuration.keywords, configuration.image_id, descriptor, manufacturer.manufacturer_id, requires_bosa, 
		is_consumable, manufacturer_name, manufacturer_description, 
		is_hidden_price, price_display_format, status
	FROM 
		configuration, manufacturer, product
	WHERE 
		product.product_id = configuration.product_id AND 
		manufacturer.manufacturer_id = product.manufacturer_id
	LIMIT 200";

$this_configuration_set = &db_call($this_configuration_sql, SQL_RESULT_SET);

foreach($this_configuration_set as $array_id => $array_elements)
	{
	$this_configuration_set[$array_id]['url'] = '/products/' . $array_elements['descriptor'] . '/' . $array_elements['configuration_name'] . '/' . $array_elements['product_id'] . '/' . $array_elements['configuration_id'] . '/1';
	}

$timex = microtime();
$timex = explode(" ", $timex);
$begin_step2 = $timex[1] + $timex[0];

$this_configuration_sql = "
	SELECT
		configuration_name, tagline, configuration_code, 
		configuration.meta_keywords, configuration.meta_description, 
		short_description, long_description, popularity, 
		configuration.keywords, configuration.image_id, descriptor, manufacturer.manufacturer_id, requires_bosa, 
		is_consumable, manufacturer_name, manufacturer_description, 
		is_hidden_price, price_display_format, status,
		concat('/products/', descriptor, '/', configuration_name, '/', configuration.product_id, '/', configuration.configuration_id, '/1') as url
	FROM 
		configuration, manufacturer, product
	WHERE 
		product.product_id = configuration.product_id AND 
		manufacturer.manufacturer_id = product.manufacturer_id
	LIMIT 200";

$this_configuration_info = &db_call($this_configuration_sql, SQL_RESULT_SET);
	
$timer = microtime();
$timer = explode(" ", $timer);
$end_step2 = $timer[1] + $timer[0];

$query_time = $begin_step1 - $begin_basecase;
$step_1_time = $begin_step2 - $begin_step1;
$step_2_time = $end_step2 - $begin_step2;

trigger_error("TIME SUMMARY\n\n	Query: " . $query_time . "\nStep 1: " . $step_1_time . "\nStep 2: " . $step_2_time, E_USER_NOTICE);
error_reporter();
?>

feyd | Please use

Code: Select all

and

Code: Select all

tags where appropriate when posting code. Read:  [url=http://forums.devnetwork.net/viewtopic.php?t=21171]Posting Code in the Forums[/url][/color]

Posted: Mon Feb 20, 2006 4:00 pm
by feyd
You should move the explosions of the times you get to after all the runs are completed. It's skewing your times a bit.

I believe option 1 has a slight advantage over option 2, no matter what actual code that is because it's earlier in the runtime. However, because MySQL is compiled, it can often beat out PHP which still has to interpret the code into the internal calls. It also has the advantage of memory resident caching and other optimizations that PHP can often not have because it does not stay resident in and of itself.

Re: better a / better b

Posted: Mon Feb 20, 2006 9:47 pm
by Christopher
llfitness wrote:Because I read that functions are faster than classes, and no functions is faster (from a compile standpoint) than functions, I wrote pretty much the whole thing with all but a few functions. I've decided that in some cases I absolutely should have used functions when I didn't, and that there are some other places where I could have written faster code, so I'm going back through.
I am assuming that you have known and well documented performance problems that you have been unable to solve. And that your server is currently taking 30+ second to response to any request because of the time it takes to execute the PHP code even when it is not accessing any sub-system or database.
llfitness wrote:Any comments on (a) which should be faster / better or (b) why this test isn't working as I would expect, ie. with a clear winner, would be appreciated.
You code is all SQL and timing code, and almost no PHP application code. Where is the massive performance problem?

Posted: Tue Feb 21, 2006 11:35 am
by llfitness
I am assuming that you have known and well documented performance problems that you have been unable to solve..
No, quite the opposite. My company rents its own server specifically for the purpose of running our PHP code. I'm just looking to ensure that our code generates our websites as quickly as possible. To that end, I want to know which code would be faster: Generating the text string from within MySQL or in PHP.

Optimally, I'm shooting for all pages to be generated in .09 seconds or less. Some of ours are running at .2 seconds, so I'm trying to get that cut down.

Posted: Tue Feb 21, 2006 11:41 am
by feyd
Although I don't know your server's specs, I would suspect it is a major source of bottlenecks. Some other things to look into is PHP (core level) caching mechanisms, memory caching and hot loading, upgrading to PHP 5.1 (with the fast load options enabled)

Posted: Tue Feb 21, 2006 2:49 pm
by Christopher
llfitness wrote:No, quite the opposite. My company rents its own server specifically for the purpose of running our PHP code.
The I would recommend creating well designed and maintainable code, then see if there really are any performance problems.
llfitness wrote:I'm just looking to ensure that our code generates our websites as quickly as possible.
Then use C and assembler mods to something like the Tux webserver.
llfitness wrote:To that end, I want to know which code would be faster: Generating the text string from within MySQL or in PHP.
The real answer is that it depends. But I have found that database access is the major part of most page execution time.
llfitness wrote:Optimally, I'm shooting for all pages to be generated in .09 seconds or less. Some of ours are running at .2 seconds, so I'm trying to get that cut down.
I guess I would ask why? Given that user experience is a range of factors and you can't control network load. Ultimately any sort of performance targets like those are probably solved with database tuning/caching, hardware and front end caches.

Posted: Tue Feb 21, 2006 7:45 pm
by John Cartwright
You should consider caching your dynamic pages.. that would seriously reduce load times. Although as mentioned before, most of the bottleneck will likely be database calls, unless your doing some significant PHP processing (such as regex).

I would recommend first identifying chunks of code that take the longest to process, and determine if the cost of time to fix this is greater or less than the cost of upgrading your server. It is likely less expensive to upgrade your server.

Posted: Tue Feb 21, 2006 9:29 pm
by Christopher
Looking at your SQL, the query times should be nearly identical for each. Query times are largely dictated by the WHERE clause and the amount of data returned. Generally calculations done in the database server have little effect. So it makes sense that any PHP code would be the difference (about 0.005 seconds).

fine, but which is BETTER?

Posted: Tue Mar 07, 2006 9:36 am
by llfitness
OK then, from a usability / readability / manageability perspective, which is better? I want to be able to perform queries and dump them straight in to an array2xml function but I also want to include these dynamically generated URLs. Should I generate the URLs through MySQL or go back through the results and add in the URLs?

Posted: Tue Mar 07, 2006 8:16 pm
by lastcraft
Hi...
llfitness wrote: Optimally, I'm shooting for all pages to be generated in .09 seconds or less. Some of ours are running at .2 seconds, so I'm trying to get that cut down.
The web browser deliberately does nothing for the first .25 seconds of a page load. Reducing the script running time below that is pointless.

Is it response times for users, or server load that is bothering you? If it's web server load, then the script running time is irrelevant. A loaded server will report higher times without changing the script. If it's DB server load, then you enter a complex area.

Server loads are usually simple economics. How much will it cost you to add a web server? $400 a year? Maybe you spend $1000 for the DB servers. Could you afford $200 on extra RAM?

If your hardware costs are more than that, say you have several DB servers, why not buy some DB consultancy at $300 per hour? You will spend many days trying to optimise queries if you don't know the SQL engine.

Every 3ish days that you spend optimising directly costs your company $1000. Even more if it's a growing company, because of lost opportunity cost. Factor in holidays, office space, learning the job and sick days and it's $600 per working day. By tomorrow all of your hard work will have been in vain, as you will actually be losing money.

I'd play the hardware card and go back to writing features.

yours, Marcus