Strategies on Pagination

Not for 'how-to' coding questions but PHP theory instead, this forum is here for those of us who wish to learn about design aspects of programming with PHP.

Moderator: General Moderators

User avatar
volomike
Forum Regular
Posts: 633
Joined: Wed Jan 16, 2008 9:04 am
Location: Myrtle Beach, South Carolina, USA

Strategies on Pagination

Post by volomike »

I wanted to hear some of your strategies on paginating record results from the database. Note in my case I'm using PostgreSQL and I have LIMIT and OFFSET available to me.

If you've already beat this issue like a dead horse, please give me the hyperlink to that discussion. Thanks.

1. Right now, I'm using a GET form post on my search form because that's what I typically see in search forms, rather than handsome URLs, and it permits me to provide hyperlinks in other parts of the website that allow me to store a query, such as going to employer profiles and being able to click on a hyperlink to see jobs from just that employer.

2. So I guess the first thing I need to do is get the count of rows in the query. If the query is huge, I hope my SELECT will return that value correctly to me. On MS SQL Server, for instance, it used to be inaccurate unless you cycled through all the records -- don't know about MySQL or PostgreSQL.

3. I want to display 10 job search results per page. So, I have to divide the row count by that value and display 1, 2, 3, 4 ... up to the sum of that division. I take it that if they don't change the form fields in the search from their last search, that I can cache this count in a cookie or session var?

4. Next I need to display First, Prev, Next, and Last wrapped around that 1, 2, 3, 4, ...etc stuff. The First is easy -- it goes to OFFSET 1, LIMIT 10. The Last is slightly easy -- based on the value I computed from step 3, I do OFFSET of that value, LIMIT 10. The catch is that someone may add a record here or there, and so by doing a LIMIT 10 on the last page, I can hide that fact unless they start a different query.

5. The Prev needs to remember what page they are currently on and change hyperlink such that the OFFSET value has the page before it.

6. The Next needs to remember what page they are currently on and change the hyperlink such that the OFFSET value has the page after it.

7. Prev needs to be disabled on the first page. Next needs to be disabled on the last page.

8. What gets tricky is when the user jumps around on the various parts of the query -- the Prev and Next need to keep up with those changes.

9. As I build the SQL, I'm avoiding JOINs where possible. If a certain criteria is not used in the advanced search, I will not JOIN for its data.

10. I'm not caching this search result data into a temporary table.

11. I'm not using a PostgreSQL view -- to me, a view won't speed up an ad-hoc query, right?
User avatar
jaoudestudios
DevNet Resident
Posts: 1483
Joined: Wed Jun 18, 2008 8:32 am
Location: Surrey

Re: Strategies on Pagination

Post by jaoudestudios »

From what I understand VIEW is slower as it is your query on top of the queries required to generate that virtual table.
User avatar
allspiritseve
DevNet Resident
Posts: 1174
Joined: Thu Mar 06, 2008 8:23 am
Location: Ann Arbor, MI (USA)

Re: Strategies on Pagination

Post by allspiritseve »

We've been discussing pagination in the Skeleton forum a bit. There are very few differences between Skeleton's existing A_Pager and my pagination class, but I will explain mine briefly to give you some ideas.

First, I have a class called Collection that basically wraps an array. You can add items by calling $collection->add ('key', 'value'), remove items by calling $collection->remove ('key'), and (most importantly for pagination) find the total number of items by calling $collection->count() and get a subset of a collection by calling $collection->slice ('offset', 'length').

Next, I have a class called Paginator that handles all of the pagination logic. If I pass it a collection and a number of items per page, and then set the current page, I can do just about anything with the dataset that the collection wraps:

Code: Select all

$paginator = new Paginator ($collection, 10);
$paginator->setPage (3);
$paginator->getFirstPage();
$paginator->getLastPage();
$paginator->getCurrentPage();
$paginator->count();
$paginator->offset ($page);
$paginator->isValid ($page);
$paginator->getItems();
Using these two classes, I could paginate an array of any size and create any view logic based on combinations of the above methods. (Test me... I'd love to be proven wrong!)

However, you specifically asked about paginating items from the DB. Basically, Collection implements an interface that has count() and slice(). I could pass in any object that implemented that interface, which essentially would be a strategy object. You could create a Collection class that on count() would query the database for the number of items with certain criteria, and on slice() would retrieve a subset of those items. Or, if you prefer, you could make the query yourself and pass it as an array to a generic Collection object. I prefer the first method as it's more efficient, but not everyone does. That way the Paginator only knows about Pagination logic, and the Collection deals with the data retrieval behind a common interface.

Finally, I have a PaginationHelper class that takes a Paginator class and a Template class. Using combinations of the above methods, it can give the template simple variables to determine various displays. For example:

Code: Select all

function next() {
    return (($this->paginator->getCurrentPage() + 1) < $this->paginator->getLastPage()) ? true : false;
}

Code: Select all

<?php if ($next): ?><!-- Display next link --><?php endif; ?>
Hopefully that gives you some ideas.
User avatar
jaoudestudios
DevNet Resident
Posts: 1483
Joined: Wed Jun 18, 2008 8:32 am
Location: Surrey

Re: Strategies on Pagination

Post by jaoudestudios »

jaoudestudios wrote:From what I understand VIEW is slower as it is your query on top of the queries required to generate that virtual table.
I did more research, in some cases VIEWs can be faster.
NB:they do lack indexes
josh
DevNet Master
Posts: 4872
Joined: Wed Feb 11, 2004 3:23 pm
Location: Palm beach, Florida

Re: Strategies on Pagination

Post by josh »

User avatar
volomike
Forum Regular
Posts: 633
Joined: Wed Jan 16, 2008 9:04 am
Location: Myrtle Beach, South Carolina, USA

Re: Strategies on Pagination

Post by volomike »

allspiritseve:

A PostgreSQL result is a resource object. Are you saying you convert this into an array and then tack these properties on? Wouldn't that be a system hog with every query?
josh
DevNet Master
Posts: 4872
Joined: Wed Feb 11, 2004 3:23 pm
Location: Palm beach, Florida

Re: Strategies on Pagination

Post by josh »

Normally youd pull back like 10 - 50 rows max, so not really...
User avatar
allspiritseve
DevNet Resident
Posts: 1174
Joined: Thu Mar 06, 2008 8:23 am
Location: Ann Arbor, MI (USA)

Re: Strategies on Pagination

Post by allspiritseve »

volomike wrote:A PostgreSQL result is a resource object. Are you saying you convert this into an array and then tack these properties on? Wouldn't that be a system hog with every query?
What do you do with it otherwise? I assume you're probably converting it into an array or object at some point before displaying it. Also, if you subscribe to the 2 separate queries paradigm, then getItems() is going to return whatever the Collection sends it. There's no need to loop through the data for anything, even just to count it. So theoretically you could pass a resource, and then convert it into an array for display in the template... I don't see why you'd want to though.
User avatar
volomike
Forum Regular
Posts: 633
Joined: Wed Jan 16, 2008 9:04 am
Location: Myrtle Beach, South Carolina, USA

Re: Strategies on Pagination

Post by volomike »

You guys remind me -- with LIMIT and OFFSET, yeah, it's only like 10 rows of data at a time -- not a big server hit, and not a big hit to convert into an array or a collection object with extra class methods.
User avatar
allspiritseve
DevNet Resident
Posts: 1174
Joined: Thu Mar 06, 2008 8:23 am
Location: Ann Arbor, MI (USA)

Re: Strategies on Pagination

Post by allspiritseve »

volomike wrote:You guys remind me -- with LIMIT and OFFSET, yeah, it's only like 10 rows of data at a time -- not a big server hit, and not a big hit to convert into an array or a collection object with extra class methods.
Just a note, if you were to create the collection class like I described above, you wouldn't need to convert anything. It's just basically a wrapper so that the Pagination class can query Collections in a standard manner, whether they are actual collections or just proxies that query the db.
User avatar
volomike
Forum Regular
Posts: 633
Joined: Wed Jan 16, 2008 9:04 am
Location: Myrtle Beach, South Carolina, USA

Re: Strategies on Pagination

Post by volomike »

Okay, so what's the best strategy to figuring out, with the least CPU/memory cost, the number of records returned from the query? I hate to use COUNT(*) if there's a faster mechanism. Note, I'm using PostgreSQL on this one.

I see there's pg_num_rows(), but don't know how accurate that is when you have like 50,000 rows coming back.
Last edited by volomike on Tue Jan 13, 2009 7:43 pm, edited 1 time in total.
User avatar
allspiritseve
DevNet Resident
Posts: 1174
Joined: Thu Mar 06, 2008 8:23 am
Location: Ann Arbor, MI (USA)

Re: Strategies on Pagination

Post by allspiritseve »

volomike wrote:Okay, so what's the best strategy to figuring out, with the least CPU/memory cost, the number of records returned from the query? I hate to use COUNT(*) if there's a faster mechanism. Note, I'm using PostgreSQL on this one.
What's wrong with COUNT(*)?
User avatar
volomike
Forum Regular
Posts: 633
Joined: Wed Jan 16, 2008 9:04 am
Location: Myrtle Beach, South Carolina, USA

Re: Strategies on Pagination

Post by volomike »

Well, with COUNT(*), you have to run the query twice with every time they click Next, Next, Prev, Prev, Prev, etc.

You have to run it once to get the count of the total number of rows in the query.
You have to run it again to go get the offset page of that query.

Now, one strategy is to run it once the first time they do a particular query, cache it in a session object or cookie, and only refresh that cached value if they change the search form's set of values.
josh
DevNet Master
Posts: 4872
Joined: Wed Feb 11, 2004 3:23 pm
Location: Palm beach, Florida

Re: Strategies on Pagination

Post by josh »

innoDB caches row counts.
User avatar
volomike
Forum Regular
Posts: 633
Joined: Wed Jan 16, 2008 9:04 am
Location: Myrtle Beach, South Carolina, USA

Re: Strategies on Pagination

Post by volomike »

After researching the PostgreSQL docs, they tell you to use a stored procedure and a cached connection to cycle a cursor through the rows. Or, if that's not easily possible (or worth the aggravation), then do a COUNT(*) (without offset and limit) on the first time one does the query, cache that value unless the query changes, and then use OFFSET and LIMIT as one cycles through the query.
Post Reply