Strategies on Pagination
Moderator: General Moderators
- volomike
- Forum Regular
- Posts: 633
- Joined: Wed Jan 16, 2008 9:04 am
- Location: Myrtle Beach, South Carolina, USA
Strategies on Pagination
I wanted to hear some of your strategies on paginating record results from the database. Note in my case I'm using PostgreSQL and I have LIMIT and OFFSET available to me.
If you've already beat this issue like a dead horse, please give me the hyperlink to that discussion. Thanks.
1. Right now, I'm using a GET form post on my search form because that's what I typically see in search forms, rather than handsome URLs, and it permits me to provide hyperlinks in other parts of the website that allow me to store a query, such as going to employer profiles and being able to click on a hyperlink to see jobs from just that employer.
2. So I guess the first thing I need to do is get the count of rows in the query. If the query is huge, I hope my SELECT will return that value correctly to me. On MS SQL Server, for instance, it used to be inaccurate unless you cycled through all the records -- don't know about MySQL or PostgreSQL.
3. I want to display 10 job search results per page. So, I have to divide the row count by that value and display 1, 2, 3, 4 ... up to the sum of that division. I take it that if they don't change the form fields in the search from their last search, that I can cache this count in a cookie or session var?
4. Next I need to display First, Prev, Next, and Last wrapped around that 1, 2, 3, 4, ...etc stuff. The First is easy -- it goes to OFFSET 1, LIMIT 10. The Last is slightly easy -- based on the value I computed from step 3, I do OFFSET of that value, LIMIT 10. The catch is that someone may add a record here or there, and so by doing a LIMIT 10 on the last page, I can hide that fact unless they start a different query.
5. The Prev needs to remember what page they are currently on and change hyperlink such that the OFFSET value has the page before it.
6. The Next needs to remember what page they are currently on and change the hyperlink such that the OFFSET value has the page after it.
7. Prev needs to be disabled on the first page. Next needs to be disabled on the last page.
8. What gets tricky is when the user jumps around on the various parts of the query -- the Prev and Next need to keep up with those changes.
9. As I build the SQL, I'm avoiding JOINs where possible. If a certain criteria is not used in the advanced search, I will not JOIN for its data.
10. I'm not caching this search result data into a temporary table.
11. I'm not using a PostgreSQL view -- to me, a view won't speed up an ad-hoc query, right?
If you've already beat this issue like a dead horse, please give me the hyperlink to that discussion. Thanks.
1. Right now, I'm using a GET form post on my search form because that's what I typically see in search forms, rather than handsome URLs, and it permits me to provide hyperlinks in other parts of the website that allow me to store a query, such as going to employer profiles and being able to click on a hyperlink to see jobs from just that employer.
2. So I guess the first thing I need to do is get the count of rows in the query. If the query is huge, I hope my SELECT will return that value correctly to me. On MS SQL Server, for instance, it used to be inaccurate unless you cycled through all the records -- don't know about MySQL or PostgreSQL.
3. I want to display 10 job search results per page. So, I have to divide the row count by that value and display 1, 2, 3, 4 ... up to the sum of that division. I take it that if they don't change the form fields in the search from their last search, that I can cache this count in a cookie or session var?
4. Next I need to display First, Prev, Next, and Last wrapped around that 1, 2, 3, 4, ...etc stuff. The First is easy -- it goes to OFFSET 1, LIMIT 10. The Last is slightly easy -- based on the value I computed from step 3, I do OFFSET of that value, LIMIT 10. The catch is that someone may add a record here or there, and so by doing a LIMIT 10 on the last page, I can hide that fact unless they start a different query.
5. The Prev needs to remember what page they are currently on and change hyperlink such that the OFFSET value has the page before it.
6. The Next needs to remember what page they are currently on and change the hyperlink such that the OFFSET value has the page after it.
7. Prev needs to be disabled on the first page. Next needs to be disabled on the last page.
8. What gets tricky is when the user jumps around on the various parts of the query -- the Prev and Next need to keep up with those changes.
9. As I build the SQL, I'm avoiding JOINs where possible. If a certain criteria is not used in the advanced search, I will not JOIN for its data.
10. I'm not caching this search result data into a temporary table.
11. I'm not using a PostgreSQL view -- to me, a view won't speed up an ad-hoc query, right?
- jaoudestudios
- DevNet Resident
- Posts: 1483
- Joined: Wed Jun 18, 2008 8:32 am
- Location: Surrey
Re: Strategies on Pagination
From what I understand VIEW is slower as it is your query on top of the queries required to generate that virtual table.
- allspiritseve
- DevNet Resident
- Posts: 1174
- Joined: Thu Mar 06, 2008 8:23 am
- Location: Ann Arbor, MI (USA)
Re: Strategies on Pagination
We've been discussing pagination in the Skeleton forum a bit. There are very few differences between Skeleton's existing A_Pager and my pagination class, but I will explain mine briefly to give you some ideas.
First, I have a class called Collection that basically wraps an array. You can add items by calling $collection->add ('key', 'value'), remove items by calling $collection->remove ('key'), and (most importantly for pagination) find the total number of items by calling $collection->count() and get a subset of a collection by calling $collection->slice ('offset', 'length').
Next, I have a class called Paginator that handles all of the pagination logic. If I pass it a collection and a number of items per page, and then set the current page, I can do just about anything with the dataset that the collection wraps:
Using these two classes, I could paginate an array of any size and create any view logic based on combinations of the above methods. (Test me... I'd love to be proven wrong!)
However, you specifically asked about paginating items from the DB. Basically, Collection implements an interface that has count() and slice(). I could pass in any object that implemented that interface, which essentially would be a strategy object. You could create a Collection class that on count() would query the database for the number of items with certain criteria, and on slice() would retrieve a subset of those items. Or, if you prefer, you could make the query yourself and pass it as an array to a generic Collection object. I prefer the first method as it's more efficient, but not everyone does. That way the Paginator only knows about Pagination logic, and the Collection deals with the data retrieval behind a common interface.
Finally, I have a PaginationHelper class that takes a Paginator class and a Template class. Using combinations of the above methods, it can give the template simple variables to determine various displays. For example:
Hopefully that gives you some ideas.
First, I have a class called Collection that basically wraps an array. You can add items by calling $collection->add ('key', 'value'), remove items by calling $collection->remove ('key'), and (most importantly for pagination) find the total number of items by calling $collection->count() and get a subset of a collection by calling $collection->slice ('offset', 'length').
Next, I have a class called Paginator that handles all of the pagination logic. If I pass it a collection and a number of items per page, and then set the current page, I can do just about anything with the dataset that the collection wraps:
Code: Select all
$paginator = new Paginator ($collection, 10);
$paginator->setPage (3);
$paginator->getFirstPage();
$paginator->getLastPage();
$paginator->getCurrentPage();
$paginator->count();
$paginator->offset ($page);
$paginator->isValid ($page);
$paginator->getItems();However, you specifically asked about paginating items from the DB. Basically, Collection implements an interface that has count() and slice(). I could pass in any object that implemented that interface, which essentially would be a strategy object. You could create a Collection class that on count() would query the database for the number of items with certain criteria, and on slice() would retrieve a subset of those items. Or, if you prefer, you could make the query yourself and pass it as an array to a generic Collection object. I prefer the first method as it's more efficient, but not everyone does. That way the Paginator only knows about Pagination logic, and the Collection deals with the data retrieval behind a common interface.
Finally, I have a PaginationHelper class that takes a Paginator class and a Template class. Using combinations of the above methods, it can give the template simple variables to determine various displays. For example:
Code: Select all
function next() {
return (($this->paginator->getCurrentPage() + 1) < $this->paginator->getLastPage()) ? true : false;
}Code: Select all
<?php if ($next): ?><!-- Display next link --><?php endif; ?>- jaoudestudios
- DevNet Resident
- Posts: 1483
- Joined: Wed Jun 18, 2008 8:32 am
- Location: Surrey
Re: Strategies on Pagination
I did more research, in some cases VIEWs can be faster.jaoudestudios wrote:From what I understand VIEW is slower as it is your query on top of the queries required to generate that virtual table.
NB:they do lack indexes
- volomike
- Forum Regular
- Posts: 633
- Joined: Wed Jan 16, 2008 9:04 am
- Location: Myrtle Beach, South Carolina, USA
Re: Strategies on Pagination
allspiritseve:
A PostgreSQL result is a resource object. Are you saying you convert this into an array and then tack these properties on? Wouldn't that be a system hog with every query?
A PostgreSQL result is a resource object. Are you saying you convert this into an array and then tack these properties on? Wouldn't that be a system hog with every query?
Re: Strategies on Pagination
Normally youd pull back like 10 - 50 rows max, so not really...
- allspiritseve
- DevNet Resident
- Posts: 1174
- Joined: Thu Mar 06, 2008 8:23 am
- Location: Ann Arbor, MI (USA)
Re: Strategies on Pagination
What do you do with it otherwise? I assume you're probably converting it into an array or object at some point before displaying it. Also, if you subscribe to the 2 separate queries paradigm, then getItems() is going to return whatever the Collection sends it. There's no need to loop through the data for anything, even just to count it. So theoretically you could pass a resource, and then convert it into an array for display in the template... I don't see why you'd want to though.volomike wrote:A PostgreSQL result is a resource object. Are you saying you convert this into an array and then tack these properties on? Wouldn't that be a system hog with every query?
- volomike
- Forum Regular
- Posts: 633
- Joined: Wed Jan 16, 2008 9:04 am
- Location: Myrtle Beach, South Carolina, USA
Re: Strategies on Pagination
You guys remind me -- with LIMIT and OFFSET, yeah, it's only like 10 rows of data at a time -- not a big server hit, and not a big hit to convert into an array or a collection object with extra class methods.
- allspiritseve
- DevNet Resident
- Posts: 1174
- Joined: Thu Mar 06, 2008 8:23 am
- Location: Ann Arbor, MI (USA)
Re: Strategies on Pagination
Just a note, if you were to create the collection class like I described above, you wouldn't need to convert anything. It's just basically a wrapper so that the Pagination class can query Collections in a standard manner, whether they are actual collections or just proxies that query the db.volomike wrote:You guys remind me -- with LIMIT and OFFSET, yeah, it's only like 10 rows of data at a time -- not a big server hit, and not a big hit to convert into an array or a collection object with extra class methods.
- volomike
- Forum Regular
- Posts: 633
- Joined: Wed Jan 16, 2008 9:04 am
- Location: Myrtle Beach, South Carolina, USA
Re: Strategies on Pagination
Okay, so what's the best strategy to figuring out, with the least CPU/memory cost, the number of records returned from the query? I hate to use COUNT(*) if there's a faster mechanism. Note, I'm using PostgreSQL on this one.
I see there's pg_num_rows(), but don't know how accurate that is when you have like 50,000 rows coming back.
I see there's pg_num_rows(), but don't know how accurate that is when you have like 50,000 rows coming back.
Last edited by volomike on Tue Jan 13, 2009 7:43 pm, edited 1 time in total.
- allspiritseve
- DevNet Resident
- Posts: 1174
- Joined: Thu Mar 06, 2008 8:23 am
- Location: Ann Arbor, MI (USA)
Re: Strategies on Pagination
What's wrong with COUNT(*)?volomike wrote:Okay, so what's the best strategy to figuring out, with the least CPU/memory cost, the number of records returned from the query? I hate to use COUNT(*) if there's a faster mechanism. Note, I'm using PostgreSQL on this one.
- volomike
- Forum Regular
- Posts: 633
- Joined: Wed Jan 16, 2008 9:04 am
- Location: Myrtle Beach, South Carolina, USA
Re: Strategies on Pagination
Well, with COUNT(*), you have to run the query twice with every time they click Next, Next, Prev, Prev, Prev, etc.
You have to run it once to get the count of the total number of rows in the query.
You have to run it again to go get the offset page of that query.
Now, one strategy is to run it once the first time they do a particular query, cache it in a session object or cookie, and only refresh that cached value if they change the search form's set of values.
You have to run it once to get the count of the total number of rows in the query.
You have to run it again to go get the offset page of that query.
Now, one strategy is to run it once the first time they do a particular query, cache it in a session object or cookie, and only refresh that cached value if they change the search form's set of values.
Re: Strategies on Pagination
innoDB caches row counts.
- volomike
- Forum Regular
- Posts: 633
- Joined: Wed Jan 16, 2008 9:04 am
- Location: Myrtle Beach, South Carolina, USA
Re: Strategies on Pagination
After researching the PostgreSQL docs, they tell you to use a stored procedure and a cached connection to cycle a cursor through the rows. Or, if that's not easily possible (or worth the aggravation), then do a COUNT(*) (without offset and limit) on the first time one does the query, cache that value unless the query changes, and then use OFFSET and LIMIT as one cycles through the query.