Page 2 of 2

Posted: Mon Oct 08, 2007 3:51 pm
by mrkite
scottayy wrote: The point of that example is you can't fix a problem before it exists. Let it exist first, and then optimize.
The problem exists the moment you write it, especially for web apps. You can never be sure how much traffic you'll get tomorrow.

http://www.acm.org/ubiquity/views/v7i24_fallacy.html
http://blogs.msdn.com/ricom/archive/200 ... 43245.aspx

If you've ever said "premature optimization is the root of all evil" you should read those two articles.

Posted: Mon Oct 08, 2007 4:52 pm
by ReverendDexter
Well, there's something to be said here for proper timing of optimization, too. Optimizing for 30K hits/day will not be optimized for 1K hits/day. That's the whole point of optimizing: you're tuning the software to the situation. Until you're in the situation, you can't know exactly what it is, so how can you tune for it?

Yes, you should follow best practice for what you believe the situation will be. However, I would argue that, semantically, you can *NOT* "optimize" for an unencountered situation. It's like expecting the unexpected.

Posted: Mon Oct 08, 2007 5:37 pm
by mrkite
ReverendDexter wrote:Well, there's something to be said here for proper timing of optimization, too. Optimizing for 30K hits/day will not be optimized for 1K hits/day.
A site optimized for 30K hits a day will run rather peachy if it only gets 1K hits a day. The same can't be said for the other way around.

There's no excuse for not stress testing your application. Flood makes it easy to bombard a server and profile it. For stress testing MySQL, just write a script to create thousands of bogus rows.
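The bogus-row idea can be sketched quickly. This uses Python's stdlib sqlite3 purely so the sketch is self-contained (the thread is about MySQL, where you'd run the same INSERTs through your usual client); the table and its columns are invented for illustration:

```python
import random
import sqlite3
import string
import time

# sqlite3 stands in for MySQL here so the sketch runs anywhere;
# the "posts" table is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author TEXT, body TEXT)")

# Flood the table with bogus rows so queries run against realistic volume.
rows = [(None,
         "".join(random.choices(string.ascii_lowercase, k=8)),
         "".join(random.choices(string.ascii_lowercase, k=40)))
        for _ in range(100_000)]
conn.executemany("INSERT INTO posts VALUES (?, ?, ?)", rows)
conn.commit()

# Time a representative query against the inflated table.
start = time.perf_counter()
matches = conn.execute(
    "SELECT COUNT(*) FROM posts WHERE author LIKE 'a%'").fetchone()[0]
print(f"{matches} matches in {time.perf_counter() - start:.4f}s")
```

The point is the same regardless of the database: generate the extreme first, then measure against it.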

Testing web apps is fairly easy; only two major vectors change over time: the amount of data in your MySQL database and the number of simultaneous users. It's not a mystery. It's not difficult to see how your scripts will stand up to extremes in either area.

If you wait until you have problems, you'll be stuck trying to fix them while the server is practically unusable.

Posted: Mon Oct 08, 2007 6:38 pm
by aliasxneo
scottayy wrote:I didn't restructure my database. And I didn't write that much code. But even if I had to do so, yes. I couldn't pinpoint that that was going to be the problem ahead of time. Plus I would've been making php run more code at the beginning than I had to. Plus there's a lot more room for error. The point of that example is you can't fix a problem before it exists. Let it exist first, and then optimize.
And if you have this rare skill of foreseeing problems as in my case? I know for a fact that my website is going to get huge traffic, it's non-negotiable, I don't need to wait for it to come and then fix it if I already know it's coming. I understand your point, but it simply doesn't work in my case. I see the problem, and instead of pointlessly waiting for it to arise, I'm taking simple precautions to try and degrade/prevent it when it does arise. Thanks for the link though, I'll check it out.

Oh, and while I'm here, I was reading a small guide on indexing, and it mentioned something about updating indexes but failed to give any code. Is updating indexes something MySQL does automatically or is there a query I need to run?

Posted: Mon Oct 08, 2007 8:14 pm
by s.dot
The indexes get updated each time a row is altered/deleted/updated. The more indexes/data you have, the slower the indexing becomes, although it's rarely ever a problem.
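To illustrate that answer concretely (again using stdlib sqlite3 as a self-contained stand-in for MySQL; the table and index names are invented): the index is maintained as part of each write, and the query planner can use it immediately, with no separate "update index" query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users (email)")

# Insert rows; the index is maintained automatically as part of each write.
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [(f"user{i}@example.com",) for i in range(1000)])

# No manual "update index" step is needed -- the planner uses it right away.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = 'user500@example.com'"
).fetchall()
print(plan)  # the plan names idx_users_email
```

MySQL behaves the same way: every INSERT/UPDATE/DELETE updates the table's indexes as part of the statement.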

Posted: Mon Oct 08, 2007 8:20 pm
by Benjamin
This is not premature optimization. The traffic is known. A single server's ability to handle that traffic varies greatly with the type and complexity of the queries. On a site getting this much traffic there should be funds available for a DBA. If not, then with just a few configuration changes you'll be lucky if the server survives the first 3 hours after you go live.

You're most likely going to have to set up some slave servers and rewrite a good chunk of your PHP code so that it queries the slave servers. Good luck and have fun!
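A hypothetical sketch of that read/write split, in Python with stub connections (all names here are invented; in PHP you'd wrap your query calls behind the same kind of router object):

```python
import random

class StubConn:
    """Stand-in for a real database connection (illustration only)."""
    def __init__(self, name):
        self.name = name

    def execute(self, sql, params=()):
        return f"{self.name} ran: {sql}"

class ReplicatedDB:
    """Hypothetical read/write splitter: SELECTs go to a random slave,
    everything else goes to the master."""
    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves

    def query(self, sql, params=()):
        conn = (random.choice(self.slaves)
                if sql.lstrip().upper().startswith("SELECT")
                else self.master)
        return conn.execute(sql, params)

db = ReplicatedDB(StubConn("master"), [StubConn("slave1"), StubConn("slave2")])
print(db.query("SELECT * FROM posts"))           # handled by a slave
print(db.query("INSERT INTO posts VALUES (1)"))  # handled by the master
```

Real routing is messier (transactions and replication lag both force reads to the master sometimes), but centralizing the decision in one place means the rest of the code never changes when you add slaves.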

Posted: Mon Oct 08, 2007 8:22 pm
by s.dot
aliasxneo wrote:And if you have this rare skill of foreseeing problems as in my case? I know for a fact that my website is going to get huge traffic, it's non-negotiable, I don't need to wait for it to come and then fix it if I already know it's coming. I understand your point, but it simply doesn't work in my case. I see the problem, and instead of pointlessly waiting for it to arise, I'm taking simple precautions to try and degrade/prevent it when it does arise. Thanks for the link though, I'll check it out.

Oh, and while I'm here, I was reading a small guide on indexing, and it mentioned something about updating indexes but failed to give any code. Is updating indexes something MySQL does automatically or is there a query I need to run?
You might guess a few of the problems correctly, and you might even beat them to the punch. But I can almost guarantee that if you get this high-volume traffic, you will find new situations that you didn't think of. Code you suspected was optimized may really just be slowing things down.

You can go with your own theory :) But my theory is it ain't broke till it's broke. Then you fix it. I don't like guessing, so I'm not about to guess about a particular problem. I'll wait till it's fact.

Posted: Tue Oct 09, 2007 10:10 am
by ReverendDexter
mrkite wrote:A site optimized for 30K hits a day will run rather peachy if it only gets 1K hits a day. The same can't be said for the other way around
Very true, but it's not optimized for that volume of traffic.

Posted: Wed Oct 10, 2007 3:52 pm
by VisualD
s.dot wrote: You might guess a few of the problems correctly, and you might even beat them to the punch. But I can almost guarantee that if you get this high-volume traffic, you will find new situations that you didn't think of. Code you suspected was optimized may really just be slowing things down.

You can go with your own theory :) But my theory is it ain't broke till it's broke. Then you fix it. I don't like guessing, so I'm not about to guess about a particular problem. I'll wait till it's fact.
The point is that you don't have to guess. If you've managed to create a site with a performance problem that hasn't already been discussed thousands of times across the various communication media, then I take my hat off to you. Forewarned is forearmed, and all that.

Things like good indexing and caching aren't "premature optimisation"; they are best practice. As someone else said, experience is the key here, and waiting for (scaling) problems to arise (which implies bad design) is wrong on a number of levels. Waiting for the problem to arise before taking preventative action is like treating the symptoms but not the disease.

To the original poster: in terms of scalability, I personally take a high estimate of my load/bandwidth, add 50 percent, and design to that. But even then I'm not thinking to myself, "Right, I'm never going to imagine anything more. That figure is the highest it will ever be."

Serious scalability is as much a matter of design as it is of implementation, and as I'm sure all developers know, fixing a broken design after it's been deployed, without affecting users, is hard.

Also, the type of design you create should be based on the expected use of the site: a forum should be using partitioned tables or the equivalent, a blog should be using static caching (HTML generated from the db once and then cached), etc.

If you're asking how indexes work, then I suggest you take another look at your db schema, and possibly buy a book about your choice of RDBMS. Indexes are a vital component of good scalability: every table should have an indexed primary key at least, and any commonly searched, filtered, or sorted field should have an index.
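A small illustration of why that matters, once more using stdlib sqlite3 as a stand-in for MySQL (table and index names invented): before the index, the planner has to scan the whole table for the query; after CREATE INDEX, the same query becomes an index search.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, slug TEXT, body TEXT)")

def plan(sql):
    """Return the query plan as a string so we can inspect it."""
    return str(conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall())

q = "SELECT id FROM articles WHERE slug = 'hello-world'"
print(plan(q))   # without an index: a full table scan

conn.execute("CREATE INDEX idx_articles_slug ON articles (slug)")
print(plan(q))   # with the index: a search using idx_articles_slug
```

On a table with a few rows both plans feel instant; at thousands of rows and thousands of queries per hour, the scan is the difference between a healthy server and a melted one.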

Caching is generally best handled by your framework (PHP in this case) and sometimes also at the db tier. It's possible to roll your own caching routines using XML (or even CSV, lol) as a static intermediary, but it's not recommended. No point re-inventing the wheel.
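For a sense of what any cache layer buys you, here's a deliberately minimal sketch in Python (class and function names are invented): the render callback, standing in for the db work, only runs when the cached copy is missing or expired.

```python
import time

class PageCache:
    """Minimal illustrative cache: regenerate a page only when its entry
    has expired. (Real apps should prefer the framework's cache.)"""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}   # key -> (expires_at, html)

    def get(self, key, render):
        entry = self.store.get(key)
        if entry and entry[0] > time.time():
            return entry[1]                  # cache hit: skip the db entirely
        html = render()                      # cache miss: do the expensive work
        self.store[key] = (time.time() + self.ttl, html)
        return html

calls = 0
def render_front_page():
    """Stands in for the expensive db-backed page generation."""
    global calls
    calls += 1
    return "<html>front page</html>"

cache = PageCache(ttl_seconds=60)
cache.get("front", render_front_page)
cache.get("front", render_front_page)
print(calls)  # 1 -- the second request was served from cache
```

A framework cache adds the hard parts this sketch skips: invalidation on writes, memory limits, and sharing the cache across processes.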

OOP is your friend. Creating a good OOP structure will allow you to optimise problem areas without affecting the rest of the app. As long as the interfaces stay the same, the code inside can be gutted and rewritten with zero dependency issues (unit testing helps here). Again, this requires good design and is not easy to do right, but the benefit is substantial.
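A sketch of that idea in Python (all names hypothetical): two implementations behind one interface, so callers never notice when the internals are gutted and rewritten.

```python
class ArticleStore:
    """The stable interface the rest of the app codes against."""
    def get(self, article_id):
        raise NotImplementedError

class NaiveStore(ArticleStore):
    """First version: straight lookup on every call."""
    def __init__(self, db):
        self.db = db

    def get(self, article_id):
        return self.db[article_id]

class CachedStore(ArticleStore):
    """Drop-in replacement: same interface, optimised internals."""
    def __init__(self, db):
        self.db = db
        self.cache = {}

    def get(self, article_id):
        if article_id not in self.cache:
            self.cache[article_id] = self.db[article_id]
        return self.cache[article_id]

db = {1: "hello world"}   # a dict stands in for the real database
for store in (NaiveStore(db), CachedStore(db)):
    print(store.get(1))   # callers never know which implementation ran
```

The same discipline in PHP means the optimization work stays confined to one class, and the unit tests written against the interface catch regressions in either implementation.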

As someone else said, be prepared to throw your first version away, always. You will learn more about the problems by working through them than can ever be obtained by just thinking about them. The second version you write will be far superior by way of foreknowledge and practice, and it's often quicker to rewrite than to refactor.
So I should just release my website as is (without any prior modifications) to the public and only try to fix it when the public complains?

Yes and no. There comes a point where you have to say, OK, I've done enough here; it's not perfect, but it's close enough that any further modifications will be reconfiguring existing code/schema and not replacing them in situ. Exactly where this point is, again, comes from experience. Always design and implement with performance in mind; it's a lot easier in the long run.