sessions - large ones
Posted: Wed Jun 21, 2006 10:39 pm
by olegkikin
ok here it goes...
I need to store a list of integers on the server side. In fact, I don't care if it gets to the client side. The list is around 30,000-50,000 elements long. The list is individual for each visitor.
Currently I use sessions, and it's gotten pretty slow. session_start() takes around a second, which sucks.
That's how I store them:
$_SESSION['list'][$i] = $x;
So I need a faster way to store long arrays on the server side.
Anybody know an alternative?
thanks!
Posted: Wed Jun 21, 2006 10:50 pm
by bdlang
Huh? What is the application of storing all these values? You're not using an RDBMS of some sort to store data?
Posted: Wed Jun 21, 2006 10:55 pm
by olegkikin
That's the list of genes that the user selects from the database. Basically a list of rows from the table. Then the user will perform certain operations on that subset of genes.
Posted: Wed Jun 21, 2006 10:59 pm
by bdlang
Hmm. Interesting. The application requires the user to store all the data in a session from page to page? Have you thought about a 'pagination' approach, where you could work on a subset of data per script, and then as the user links from page to page have it only call a certain subset of the data based on previous values or what was previously done? It's the same sort of context, in my mind, anyway.
Posted: Wed Jun 21, 2006 11:00 pm
by feyd
Depending on the format of each element, there are various ways of compacting the data down to reach smaller sizes for the file requests. Gambler's Code Snippet for an alternate to serialize() comes to mind as a possible generic solution. At any rate, you will need to get familiar with session_set_save_handler() and its required functionality. For an example of its use, look at the database sessions thread referenced in Useful Posts.
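For reference, a minimal sketch of what a callback-style custom handler looks like (the /tmp file path here is made up purely for illustration; a real handler for this case would write to a database or a compact binary format):

```php
<?php
// Sketch only: a custom session handler replaces PHP's default
// file-per-session storage. The callbacks are registered in the
// order session_set_save_handler() expects.

function sess_open($save_path, $name) { return true; }
function sess_close() { return true; }

function sess_read($id) {
    // Must return the raw serialized session data, or '' if none.
    $file = "/tmp/sess_custom_$id";   // hypothetical location
    return is_file($file) ? file_get_contents($file) : '';
}

function sess_write($id, $data) {
    // $data arrives already serialized; a custom handler could
    // re-encode it more compactly before writing.
    return file_put_contents("/tmp/sess_custom_$id", $data) !== false;
}

function sess_destroy($id) {
    if (is_file("/tmp/sess_custom_$id")) unlink("/tmp/sess_custom_$id");
    return true;
}

function sess_gc($maxlifetime) { return true; } // purge stale files here

session_set_save_handler('sess_open', 'sess_close', 'sess_read',
                         'sess_write', 'sess_destroy', 'sess_gc');
```

Register the handler before session_start(); from then on, $_SESSION works as usual but goes through your read/write callbacks.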
Posted: Wed Jun 21, 2006 11:03 pm
by olegkikin
bdlang: Can you give more details? What's the technique?
Posted: Wed Jun 21, 2006 11:04 pm
by olegkikin
feyd: I thought of serializing, but an array of integers is pretty compact as-is. Serializing it into a string would make it worse, wouldn't it?
Posted: Wed Jun 21, 2006 11:07 pm
by feyd
The default session storage uses serialize(). Integers are not compacted to their binary form with it.
Posted: Wed Jun 21, 2006 11:09 pm
by bdlang
olegkikin wrote:bdlang: Can you give more details? What's the technique?
Well, the general idea of 'pagination' is to show the user only N records per page, a subset of all records in the database. It's pretty standard practice on any PHP/CMS-driven website where you have <prev and next> links with all the page numbers between. Each link basically retrieves N records at a certain starting point, defined by the SQL query's LIMIT clause. Some simple math determines how many to retrieve and at what starting point based on N. You don't show 10,000 records to the user, you show 100, then link to the next 100, etc., ad nauseam.
Anyway, my idea of a solution to your problem, depending on how it's currently implemented, would be to only work on a subset of data at once, per page. I don't know how well this fits in with how the data is processed, however. It could be all data is required all at once in your case. If only a bit is required at once, work on that bit, then link to the next set.
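The LIMIT arithmetic described above can be sketched like this (the `genes` table and column names are placeholders, not from the actual application):

```php
<?php
// Sketch of LIMIT-based pagination math. Page 1 starts at
// offset 0, page 2 at offset $per_page, and so on.

function page_query($page, $per_page) {
    $page   = max(1, (int) $page);            // clamp bad input to page 1
    $offset = ($page - 1) * $per_page;
    // MySQL syntax: LIMIT offset, row_count
    return sprintf('SELECT id, name FROM genes ORDER BY id LIMIT %d, %d',
                   $offset, $per_page);
}

echo page_query(3, 100), "\n";
// With 100 rows per page, page 3 starts at offset 200.
```

Each <prev/next> link would then just carry the page number in the URL and the script rebuilds the query from it.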
Posted: Wed Jun 21, 2006 11:12 pm
by olegkikin
bdlang: It has nothing to do with pagination. The user only sees 20 rows on each page, but he can select up to 50,000 rows (by going through pages, for instance, or all at once).
feyd: so what you're saying is that an array of integers in $_SESSION is stored as a string???
Posted: Wed Jun 21, 2006 11:28 pm
by feyd
olegkikin wrote:feyd: so what you're saying is that an array of integers in $_SESSION is stored as a string???
test it:
Code:
[feyd@home]>php -r "echo serialize(123123), PHP_EOL, serialize(array(123123, 321321, 234234, 432432, 345345, 543543, 456456, 654654));"
i:123123;
a:8:{i:0;i:123123;i:1;i:321321;i:2;i:234234;i:3;i:432432;i:4;i:345345;i:5;i:543543;i:6;i:456456;i:7;i:654654;}
Posted: Wed Jun 21, 2006 11:34 pm
by olegkikin
thanks! This is crazy though... Terribly inefficient. It uses around 10-14 bytes to store one integer.
It basically means I will have to write my own serialize() function. I can pack each number into 3 bytes, so that would make it 3-4 times faster, minus the overhead of compression-decompression.
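A sketch of that 3-byte packing idea (assuming every value fits in 24 bits, i.e. 0..16,777,215 — pack() has no 3-byte format code, so one workaround is to write a 4-byte big-endian int and drop its high byte):

```php
<?php
// Pack an array of 24-bit integers into 3 bytes each, and back.
// Assumes 0 <= $n < 2^24 for every element.

function pack3($ints) {
    $out = '';
    foreach ($ints as $n) {
        $out .= substr(pack('N', $n), 1);  // keep the low 3 bytes
    }
    return $out;
}

function unpack3($bin) {
    $ints = array();
    for ($i = 0, $len = strlen($bin); $i < $len; $i += 3) {
        // Prepend a zero byte so unpack('N') sees a full 4-byte int.
        $u = unpack('N', "\x00" . substr($bin, $i, 3));
        $ints[] = $u[1];
    }
    return $ints;
}
```

Combined with a custom session save handler, the list could be stored as this flat binary string instead of the textual serialize() form.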
I was hoping there's an easier way.
but thanks anyway

Posted: Wed Jun 21, 2006 11:39 pm
by feyd
There are some possible points for adding more efficiency depending on the actual nature of the data, such as whether it's entirely integers, whether each child array is a fixed length, etc. If you post your algorithm, we can test various implementations of it and other algorithms to see which is most efficient, and determine whether you want to keep it small, keep the processing simple, or strike a mix between them.
Posted: Wed Jun 21, 2006 11:44 pm
by olegkikin
It's basically an array of integers in the range from 1 to 16 million (so each fits in 3 bytes). The list is sorted, and numbers don't repeat. You can assume the numbers are random.