Hi,
For two random strings s1 and s2, the probability that hash(hash(s1))==hash(hash(s2) is higher than the probability of hash(s1)==hash(s2).
Thats right, that the probability for hash(hash(s1)) == hash(hash(s2)) is higher, but you need the same time to find a collosion for hash(hash()) as for a single hash()-call (when you uses a secure hash algorithm).
To calculate this is easy. Guess hash() is a realy good hash function.
The probability of hash(s1) == hash(s2) with s1 != s2 is 1 to 2^n, where n is the length of the hash.
The probability of hash(hash(s1)) == hash(hash(s2)) is 1/2^n (if hash(s1) == hash (s2)) plus (1-2^n)*1/2^n (if hash(s1) != hash(s2) but hash(hash(s1)) == hash(hash(s2))), so the probability is ~ 1 to 2^(n-1).
But you need the double time to calculate hash(hash()) than to calculate hash().
it's generally NOT a good idea to use more than just one hash.
That's wrong!
If you wanna find a string s so that hash(s) == hash(password), it's much harder to find such a string if you have called the hash function multiple times ( hash(hash(hash(...(pw)...) ), because in the most cases a bad password was selected (who uses a password with 20 random chars? No one, most users uses a password with a length of 8 and just a-z0-9) .
The reason is, that the attacker have to calculate the hash multiple times to test one possible password, so if you call the hash function e.g. 2^16 times, the attack would be slowed down by the factor of ~65 000. Instead of 1 day the attacker than needs 65 000 days.
So you add 15/16 bit extra stength to the password.
To crack a 40 bit password with a key strengthing methode (e.g. with the 2^16-th iterate of hash) is as hard to crack a 55/56 bit password which was hashed just with a single call of hash().
I can recommend this paper:
Secure Applications of Low-Entropy Keys - J. Kelsey, B. Schneier, C. Hall, and D. Wagner
see especial §4:
Let H() be a secure, collision-resistant one-way hash function [...]
The basic scheme works as follows:
X0 = H(Kshort; S).
For i = 1 to 2^t:
Xi = H(Xi-1).
Klong = X2^t .
We are computing the 2^t-th iterate of H().
[...]
Considering each of our design criteria in turn, we can make the following
statements of security:
Time to Compute: One crucial requirement is that naive iteration must be
the fastest way to compute X2^t (x) for random inputs x. In other words,
there must be no shortcuts available to implement H2^t (). Fortunately, we
have some theoretical results which assure us that such shortcuts can't exist,
if H has sufficient resistance to collision attacks
[...]
Theorem 1. If H() is a strongly collision-resistant hash function that admits
a naive birthday attack, then computing the i-th iterate of H requires
time at least i=2.
the theorem says that iterating the hash function 2^t times adds
at least t - 1 bits of extra "strength" to the key.
[...]
7.
The hash-based construction was shown to have the property that
a method for speeding up the exhaustive search of the low-entropy variable
translates into a significant weakness in the underlying hash function.
Of course, key stretching is no substitute for long keys. Key stretching only
reduces a short key's resistance to certain attacks.
So if you use a secure hash algorithm, this methode could be used to increase the strength of the user passwords.
If your users do not use weak passwords, a key strengthing methode is senseless, but if not...
Please see also
RFC 2898 - Password-Based Cryptography Specification
5.1 PBKDF1
PBKDF1 applies a hash function, which shall be MD2 [6], MD5 [19] or
SHA-1 [18], to derive keys. The length of the derived key is bounded
by the length of the hash function output, which is 16 octets for MD2
and MD5 and 20 octets for SHA-1. PBKDF1 is compatible with the key
derivation process in PKCS #5 v1.5.
PBKDF1 is recommended only for compatibility with existing
applications since the keys it produces may not be large enough for
some applications.
PBKDF1 (P, S, c, dkLen)
Options: Hash underlying hash function
Input: P password, an octet string
S salt, an eight-octet string
c iteration count, a positive integer
dkLen intended length in octets of derived key,
a positive integer, at most 16 for MD2 or
MD5 and 20 for SHA-1
Output: DK derived key, a dkLen-octet string
Steps:
1. If dkLen > 16 for MD2 and MD5, or dkLen > 20 for SHA-1, output
"derived key too long" and stop.
2. Apply the underlying hash function Hash for c iterations to the
concatenation of the password P and the salt S, then extract
the first dkLen octets to produce a derived key DK:
T_1 = Hash (P || S) ,
T_2 = Hash (T_1) ,
...
T_c = Hash (T_{c-1}) ,
DK = Tc<0..dkLen-1>
3. Output the derived key DK.
PBKDF1
PBKDF2 is the replacement for PBKDF1, but the schema is nearly the same:
PBKDF2