Server Load Question: PHP vs mySQL
Posted: Sun Aug 08, 2010 11:11 pm
Mostly, I just want to confirm my thinking here and hear other points of view.
Let's suppose that you're building a project and you end up in a situation where you can *either*:
A: Request all the data you might need from a table or two and process it in PHP, sending the user only the data they need.
B: Use many queries so the database returns exactly the data you need, which dramatically reduces both the amount of processing you have to do in PHP and the size of the dataset you're working with.
In my thinking, the choice really comes down to the size of the dataset, and more specifically, how long it takes to return the data. If the total dataset is small and can be returned very quickly, it's better to do the work in PHP; but if it's very large, it becomes desirable to limit the returned dataset, since returning the whole thing takes longer and longer.
As a hypothetical, let's suppose you're coding an index page for articles written by a small group of site administrators. You need to get all the authors from one table, and the article titles, dates, etc from another.
In this index page, you want to group the articles A: by date, B: by author, C: by title.
So pseudocode for method A:
[text]
query to get all the authors and load them into an array
query to get all the relevant article bits and load them into an array
sort/copy arrays to dump a list of the 10 most recent articles by any author
foreach author
sort/copy arrays to dump a list of the 10 most recent articles by that author
sort/copy arrays to dump a list of the first 10 articles organised by title
[/text]
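To make method A concrete, here's a minimal PHP sketch of the sort/copy steps above. The arrays and field names (author_id, date, etc.) are made up for illustration; in real code they'd be loaded by the two queries:

```php
<?php
// Method A sketch: fetch everything once, then sort/slice in PHP.
// $authors and $articles stand in for the two query results.
$authors = [
    1 => 'Alice',
    2 => 'Bob',
];
$articles = [
    ['title' => 'Beta notes',  'author_id' => 2, 'date' => '2010-08-01'],
    ['title' => 'Alpha guide', 'author_id' => 1, 'date' => '2010-08-05'],
    ['title' => 'Changelog',   'author_id' => 1, 'date' => '2010-07-20'],
];

// 10 most recent articles by any author.
usort($articles, function ($a, $b) { return strcmp($b['date'], $a['date']); });
$recent = array_slice($articles, 0, 10);

// 10 most recent per author: the list is already date-sorted,
// so filter by author and slice.
$byAuthor = [];
foreach ($authors as $id => $name) {
    $own = array_filter($articles, function ($art) use ($id) {
        return $art['author_id'] === $id;
    });
    $byAuthor[$name] = array_slice(array_values($own), 0, 10);
}

// First 10 articles organised by title.
$byTitle = $articles;
usort($byTitle, function ($a, $b) { return strcmp($a['title'], $b['title']); });
$byTitle = array_slice($byTitle, 0, 10);
```

The point is that the database is hit only twice; all the grouping work happens in PHP, so it only pays off while the full dataset is small.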
So pseudocode for method B:
[text]
query to get the 10 most recent articles
query to get their authors
output the data
query the authors
foreach author
query to get the 10 most recent articles by that author
output the data
query to get a list of the first 10 articles organised by title
output the data
[/text]
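And a sketch of the SQL behind method B, with hypothetical table and column names (articles, author_id, published). Here MySQL does the sorting and limiting, so PHP only loops over and echoes small result sets:

```php
<?php
// Method B sketch: let MySQL sort and limit each listing.
// Table/column names are assumptions for illustration.
$recentSql = 'SELECT title, author_id, published FROM articles '
           . 'ORDER BY published DESC LIMIT 10';

// Run once per author inside the foreach loop (? bound to the author id).
$perAuthorSql = 'SELECT title, published FROM articles '
              . 'WHERE author_id = ? ORDER BY published DESC LIMIT 10';

$byTitleSql = 'SELECT title, author_id FROM articles '
            . 'ORDER BY title ASC LIMIT 10';

// With mysqli, the per-author query would look something like:
// $stmt = $mysqli->prepare($perAuthorSql);
// $stmt->bind_param('i', $authorId);
// $stmt->execute();
```

So method B costs one query per listing (plus one per author), but each one returns at most 10 rows no matter how large the table grows.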
By my thinking:
If we're talking about 5 (or so) authors writing, between them, an article every other day (about 180 articles per year total), method A is probably better.
If we're talking about a large community writing maybe 20 articles a day in total (like a forum, with each post being an "article"), method B is better.
Though at some point, you need to break it up a bit. For example, in case B it's probably OK to return the last 50 articles in a single query, and in case A you could limit the query to articles written in the past year. We could build on this and make it horribly complicated, but I'm *really* trying to keep it simple.
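Those two bounds would look something like this in SQL (again with made-up table/column names):

```php
<?php
// Bounding the dataset either way; names are hypothetical.
// Case B: cap the single listing query at the last 50 articles.
$cappedSql = 'SELECT title, author_id, published FROM articles '
           . 'ORDER BY published DESC LIMIT 50';

// Case A: only pull the past year before doing the PHP-side sorting.
$lastYearSql = 'SELECT title, author_id, published FROM articles '
             . 'WHERE published > DATE_SUB(NOW(), INTERVAL 1 YEAR)';
```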