"Accidental" Session ID Collision...

Discussions of secure PHP coding. Security in software is important, so don't be afraid to ask. And when answering: be anal. Nitpick. No security vulnerability is too small.

Moderator: General Moderators

User avatar
nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

"Accidental" Session ID Collision...

Post by nielsene »

This is a spin off from the thread in General Discussions:
session id
That I think could benefit from increase exposure here.

The initial question was, whats the chance of generating the same session id twice.

If we posit that the session id is derived from a 32 bit integer we have about 4E9 possible session identifiers.

Treating this as a birthday collision problem we have an equation like

Code: Select all

x:=2^32 -1
p(n):=1 - ї x! / ((x-n)!(x^n)]
After a few iterations to reduce this to a form that doesnt't overflow I was using: (in python or 'R'... here shown in psudeo-php)

Code: Select all

$x=pow(2,32)-1;
$temp=1.0;
for ($i=0;$i<1000000;$i++) {
  $temp*=($x-$i);
  $temp/=$x;
  $p=1.0-$temp;
  if (($i %10000)==0) echo "$i --- $p";
}
// Code probably has slightly serious accumulation of
// round-off errors, but general shape and ratio should be
// good enough for proof of concept and general "scale"
Found that at about 65-70K the probability approaches 0.5 and about about 200K the probability is asympototic to 1.

Thus if you have ~65K simultaneous sessions, the session collision chance is extremely high and an attacker could guess an id one time in two.

Its very interesting that the percentage of the population required for p(n)=0.5 decreases with increasing population size. Its about 1/15 in the birthday collision (23/365) but ~15/1M in this case (65K/4B).

The idea about manually checking if you've issued a given session id before issueing it is only a slight help if you have a large number of active sessions. It would stop you from accidentally causing a collision, but won't protect from the ease of a random guesss matching.

Changing to using a pair of 32bit tokeen is of course an option. I've run upto 5,000,000 and its still down in the 1E-07 probability range of collision then.

So in short, for "normal" sites the 32 bit session id should be safe; but I would hope that very large sites take some form of additional protection.
User avatar
shiflett
Forum Contributor
Posts: 124
Joined: Sun Feb 06, 2005 11:22 am

Post by shiflett »

The session identifier should be a 128 bit number, represented as 32 characters (bytes). This makes the likelihood of a collision negligible.

In order to explain this better, I need to use smaller numbers. If a session identifier were represented as 4 characters (instead of 32), it would represent a 16 bit number, not a 4 bit one. For example, f3c2 is
1111001111000010 (unless I made a mistake in my haste).

f = 1111
3 = 0011
c = 1100
2 = 0010
User avatar
nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Post by nielsene »

OK, yes I was a little surprised by the assertion that it was a 32 bit value, in the original thread referenced. Give that its a 128 bit with a 160bit option in PHP5, its a moot point.
Last edited by nielsene on Fri Jul 29, 2005 5:54 pm, edited 1 time in total.
User avatar
shiflett
Forum Contributor
Posts: 124
Joined: Sun Feb 06, 2005 11:22 am

Post by shiflett »

It's 128 bits. :-)
User avatar
nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Post by nielsene »

yup just noticed that and corrected my post while you posted...
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

could always use a certain 256-bit one :wink:
User avatar
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Until quantum computers become feasible.

Of course, quantum computers would render all of our "oh, it's big enough" arguments useless.
User avatar
nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Post by nielsene »

While the analyis no longer applies, I do think its quite interesting that the percentage of the available population space before the probability of a collision passes 0.5 decreases rapidly with increasing population size.

Ie, increased protection against accidental/random collisions is less than O(n) in population space.
theda
Forum Contributor
Posts: 332
Joined: Sat Feb 19, 2005 8:35 am
Location: USA

Post by theda »

Why not just do a 1024 bit one? :) < My illogical statement for the day.
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

theda wrote:Why not just do a 1024 bit one? :) < My illogical statement for the day.
The longer the ID, the longer it takes to generate a good random one, not to mention that it requires more and more storage space to keep it around and takes longer to compare. Only under certain circumstances should someone even think of generating such a large number.
x01
Forum Newbie
Posts: 4
Joined: Wed Aug 03, 2005 3:38 pm

Post by x01 »

I have never tough of that.

Do you think its accurate, to loop trough sessions, 'till we get one thats not used?

After 10 billion ( 32 bytes ^ 36 chars ) unique visits, there would be no more session ID left. I know that a website needs a lot of visits and fame to achieve that, but imagine we get those hits. What would happen then?
User avatar
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Hope that those sessions have expired by the time the ten billion visitors have come.
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

x01 wrote:I have never tough of that.

Do you think its accurate, to loop trough sessions, 'till we get one thats not used?

After 10 billion ( 32 bytes ^ 36 chars ) unique visits, there would be no more session ID left. I know that a website needs a lot of visits and fame to achieve that, but imagine we get those hits. What would happen then?
you're a bit off on your figures x01.. a session ID is a 128-bit key. That's 340,282,366,920,938,463,463,374,607,431,768,211,456 unique values.

That's plenty for most applications.. however, because the ID is actually run (at least partly) off a pseudo-random number, it has far more chances of repeating, however.. if the seed is good enough, and the interval between repeats on the random are long enough (the entropy can be made longer with other filters, however they are not entirely accurate), and the decay of the ID's is quick enough, then you have very very very very very little chance of ever generating the same ID within a usage.
x01
Forum Newbie
Posts: 4
Joined: Wed Aug 03, 2005 3:38 pm

Post by x01 »

feyd wrote:you're a bit off on your figures x01.. a session ID is a 128-bit key. That's 340,282,366,920,938,463,463,374,607,431,768,211,456 unique values.
Damn Windows calculator. :twisted:

Well, I just wanted to know if:

a) Session ID auto-expire after $X time. If $X > 0, echo $X; :D

b) If I randomly get an existing Session ID, would it be considered by law, has hacking?


On my site, I register the Session ID on the DB, with maximum 1 hour session time, and compare the PHP-Session ID with the DB, so I think that is a solution that is 99.9% secure?
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

trusting anything coming from a user is not secure.. ever. It's all a trade off, requiring someone to send user and password every page they want to perform a registered users only action on can be more secure, but becomes annoying to the users. Storing their credentials in a cookie can help, but can you really trust it? Do you really care? Is the data on the "secured" area that important? Certain things like changing their email address should require resubmission of their password, along with logging of requests for such actions (including flood control)...

This is just the tip of the iceberg, depending on what your actual end action requires..
Post Reply