I'd say no. Just think about it. Unless you'll cache the results somewhere (which isn't a good idea in a forum system), you'll be running a query for every post count that ever appears on any page. Every time someone flips the page in a thread, bang, another set of queries for each user. That's an extra query for every user. Instead of just querying for flat data from the user table, you'll be running a "dynamic" query on the entire posts table for each user. Ouch. That'll quickly become too slow to be manageable. You'd clearly notice the difference once the forum gets to the medium-sized range. Maybe even in the lower-medium ranges depending on your hardware.seodevhead wrote:Well with this being said, is it still efficient to use COUNT() for my purposes? Ideally I would love to use count() just because it is so easy and will always return the right answer. Thanks for the good input guys.
How should I determine # of posts
Moderator: General Moderators
- John Cartwright
- Site Admin
- Posts: 11470
- Joined: Tue Dec 23, 2003 2:10 am
- Location: Toronto
- Contact:
These querys were done in iterations of 50,000
Use of SELECT COUNT(*) as `postCount` FROM `posts`
Test1
Total Rows: 5000
Total Average: 0.241804
Test2
Total Rows: 10000
Total Average: 0.2348
Test3
Total Rows: 15000
Total Average: 0.231798
Test4
Total Rows: 100000
Total Average: 0.249216s
From these benchmarks, we can see that even a user with 100,000 post is processed just as fast as a user with a mere 5000 posts.
hmmmmm....
Use of SELECT `postcount` FROM `users` LIMIT 1
Test1
Total Rows: 1
Total Average: 0.332324
According to these benchmarks, the use of count(*) is actually faster than selecting the row value.
Use of SELECT COUNT(*) as `postCount` FROM `posts`
Test1
Total Rows: 5000
Total Average: 0.241804
Test2
Total Rows: 10000
Total Average: 0.2348
Test3
Total Rows: 15000
Total Average: 0.231798
Test4
Total Rows: 100000
Total Average: 0.249216s
From these benchmarks, we can see that even a user with 100,000 post is processed just as fast as a user with a mere 5000 posts.
hmmmmm....
Use of SELECT `postcount` FROM `users` LIMIT 1
Test1
Total Rows: 1
Total Average: 0.332324
According to these benchmarks, the use of count(*) is actually faster than selecting the row value.
Last edited by John Cartwright on Sat Dec 03, 2005 8:01 pm, edited 2 times in total.
- AKA Panama Jack
- Forum Regular
- Posts: 878
- Joined: Mon Nov 14, 2005 4:21 pm
That test is flawed because it isn't testing a comparison value. IE WHERE user_id='$userid' in the query.
Most databases will have the total record count stored for a table and readily retrievable without having to scan the database table. The query you used will pull that value and not scan the entire table looking for matches. If there is a WHERE statement in the SQL statement then it is forced to scan every record in the table counting those that match the WHERE statement. The more complex the WHERE statement the more time it takes to get the count of records.
Say you have a table with 100,000 records and 10,000 of those records are for user_id number 2.
SELECT COUNT(*) as `postCount` FROM `posts`
will be faster than
SELECT COUNT(*) as `postCount` FROM `posts` where user_id='2'
and getting the post count from the users record will be alot faster than
SELECT COUNT(*) as `postCount` FROM `posts` where user_id='2'
And in many cases all of the information for a user is already retrieved at the start so there isn't an extra query call needed to get their post count and that saves a ton of time.
As a side note the use of the LIMIT flag is good for records at the beginning of a large table but not as much for records at the end of the table. Say you have a 1 million record table and want two records. One record is at position 1 in the table and the second record is at position 1 million in the table and you use LIMIT 1 on both. The record at position 1 million in the table will take as long to retrieve as if you didn't use LIMIT 1. Even with that it is best to use LIMIT flags where you can in your queries as it does reduce the overall load on the database server and speeds things up.
Most databases will have the total record count stored for a table and readily retrievable without having to scan the database table. The query you used will pull that value and not scan the entire table looking for matches. If there is a WHERE statement in the SQL statement then it is forced to scan every record in the table counting those that match the WHERE statement. The more complex the WHERE statement the more time it takes to get the count of records.
Say you have a table with 100,000 records and 10,000 of those records are for user_id number 2.
SELECT COUNT(*) as `postCount` FROM `posts`
will be faster than
SELECT COUNT(*) as `postCount` FROM `posts` where user_id='2'
and getting the post count from the users record will be alot faster than
SELECT COUNT(*) as `postCount` FROM `posts` where user_id='2'
And in many cases all of the information for a user is already retrieved at the start so there isn't an extra query call needed to get their post count and that saves a ton of time.
As a side note the use of the LIMIT flag is good for records at the beginning of a large table but not as much for records at the end of the table. Say you have a 1 million record table and want two records. One record is at position 1 in the table and the second record is at position 1 million in the table and you use LIMIT 1 on both. The record at position 1 million in the table will take as long to retrieve as if you didn't use LIMIT 1. Even with that it is best to use LIMIT flags where you can in your queries as it does reduce the overall load on the database server and speeds things up.