session id
That, I think, could benefit from increased exposure here.
The initial question was: what's the chance of generating the same session id twice?
If we posit that the session id is derived from a 32-bit integer, we have about 4.3E9 (2^32) possible session identifiers.
Treating this as a birthday collision problem, we have an equation like:
Code:
x := 2^32 - 1
p(n) := 1 - x! / ((x-n)! * x^n)

Code:
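<?php
// Probability that n session ids drawn from x possible values contain a repeat.
$x = pow(2, 32) - 1;   // number of possible session ids
$temp = 1.0;           // running probability that all ids so far are distinct
for ($n = 1; $n <= 1000000; $n++) {
    // The n-th id must avoid the n-1 ids already issued.
    $temp *= ($x - ($n - 1)) / $x;
    $p = 1.0 - $temp;  // collision probability with n ids issued
    if (($n % 10000) == 0) echo "$n --- $p\n";
}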
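// This accumulates some floating-point round-off, but the
// general shape and ratio should be good enough for a proof
// of concept and a sense of scale.

Thus if you have ~65K simultaneous sessions, the chance that at least two of them share an id is already about 39%, and it reaches one in two at roughly 77K. Note that this is the chance of an accidental collision among issued ids; a single random guess by an attacker matches some active session with probability about n/x, roughly 1 in 65,000 with 65K active sessions.
It's very interesting that the fraction of the population required for p(n)=0.5 decreases with increasing population size, because that n grows only like the square root of x. It's about 1/16 in the birthday collision (23/365) but ~18/1M in this case (77K/4.3B).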
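For anyone who wants to check the scale without running the full loop, here is a minimal sketch using the usual birthday approximation p(n) ~= 1 - exp(-n(n-1)/(2x)). The function names are made up for illustration, and the id space is assumed to be all 2^32 values, drawn uniformly.

```php
<?php
// Approximate birthday-collision probability for $n ids drawn uniformly
// from $x possible values: p(n) ~= 1 - exp(-n(n-1)/(2x)).
// expm1() keeps precision when the probability is tiny.
function collisionProbability(float $n, float $x): float
{
    return -expm1(-$n * ($n - 1) / (2 * $x));
}

// Approximate number of ids at which the collision probability reaches $p,
// from inverting the formula above: n ~= sqrt(2x * ln(1/(1-p))).
function idsForProbability(float $p, float $x): float
{
    return sqrt(2 * $x * log(1 / (1 - $p)));
}

$x = 2 ** 32; // assumption: every 32-bit value is a valid id

printf("p(65536)    = %.3f\n", collisionProbability(65536, $x));
printf("n for p=0.5 = %d\n", round(idsForProbability(0.5, $x)));
printf("ratio n/x   = %.1e\n", idsForProbability(0.5, $x) / $x);
```

By this approximation the collision chance at 65,536 sessions is a bit under 40%, and the 50% point sits near 77K. Since the 50% point scales with sqrt(x), the ratio n/x falls off like 1/sqrt(x), which is why the fraction shrinks as the id space grows.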
The idea of manually checking whether you've already issued a given session id before issuing it is only a slight help if you have a large number of active sessions. It would stop you from accidentally causing a collision, but it won't protect against a random guess matching an active session.
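To put a rough number on the guessing side, here is a small sketch. It assumes ids are uniformly distributed, all active ids are distinct, and the attacker guesses uniformly at random with repeats allowed; guessSuccessProbability is a hypothetical name, not part of any library.

```php
<?php
// Chance that at least one of $guesses random guesses equals one of
// $active distinct session ids out of $x possible values:
//   1 - (1 - active/x)^guesses
// Computed with log1p()/expm1() so small probabilities stay accurate.
function guessSuccessProbability(float $active, float $guesses, float $x): float
{
    return -expm1($guesses * log1p(-$active / $x));
}

$x = 2 ** 32; // assumption: every 32-bit value is a valid id

printf("1 guess:       %.2e\n", guessSuccessProbability(65536, 1, $x));
printf("10000 guesses: %.3f\n", guessSuccessProbability(65536, 10000, $x));
```

A uniqueness check at issue time does not change these numbers at all, since they depend only on how many ids are active and how large the id space is.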
Changing to a pair of 32-bit tokens is of course an option. I've run up to 5,000,000 sessions and the collision probability is still down in the 1E-07 range.
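A minimal sketch of what a 64-bit token could look like, together with a quick estimate for 5,000,000 sessions. Note the assumptions: it draws 8 bytes at once from random_bytes() (PHP 7+) instead of literally pairing two 32-bit values, newSessionToken is a made-up helper name, and the probability uses the birthday approximation, not the exact loop.

```php
<?php
// Hypothetical helper: a 64-bit session token from the system CSPRNG,
// hex-encoded to 16 characters. random_bytes() requires PHP 7 or later.
function newSessionToken(): string
{
    return bin2hex(random_bytes(8));
}

// Birthday approximation for a 64-bit id space:
//   p(n) ~= 1 - exp(-n(n-1)/(2x))
$x = 2.0 ** 64;
$n = 5000000;
$p = -expm1(-$n * ($n - 1) / (2 * $x));

echo newSessionToken(), "\n";
printf("p(%d) with 64 bits = %.1e\n", $n, $p);
```

The estimate lands in the same 1E-07 order of magnitude as the run quoted above.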
So in short, for "normal" sites the 32-bit session id should be safe, but I would hope that very large sites use some form of additional protection.