Re: [article draft] Password hashing howto and hownotto
Posted: Fri Sep 19, 2008 9:30 pm
As far as I'm aware, there are two canonical ways of storing a hashed password (setting aside challenge-response authentication schemes, which are generally avoided in web apps because they imply JavaScript crypto, which isn't particularly good):
i) Store the pair ( Hash( salt || password ), salt ) in the database, where the salt is unique for each password stored - this is an essential constraint (we're back to using /dev/urandom, probably concatenated with microtime as well). Using an additional constant salt is equivalent to a keyed hash, and is only useful if you can actually keep the key secret (which is unlikely if the webserver and database are on the same host). But in this instance I do tend to agree with you - it's an easy way to get an additional layer of security, and I have used this form myself in several implementations I've coded.
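Scheme (i) is only a few lines in practice. Here's a minimal sketch in Python (function names are my own, and I'm using SHA-256 from the standard hashlib module rather than any particular hash the article mandates; os.urandom reads from the OS CSPRNG, i.e. /dev/urandom on Unix):

```python
import hashlib
import os

def hash_password(password):
    """Return (digest, salt); the salt is freshly generated per password."""
    salt = os.urandom(16)  # unique random salt from the OS CSPRNG
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return digest, salt

def verify_password(password, digest, salt):
    """Recompute Hash(salt || password) and compare against the stored digest."""
    candidate = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return candidate == digest
```

The important property is that two users with the same password still get different digests, because each row carries its own salt.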
ii) Use something like PKCS#5 PBKDF2; again, both the salt and the password go in as inputs. This method (if the iteration count is high enough) makes it exceedingly difficult to mount an exhaustive search of the password space - even for relatively poor passwords.
Now, (ii) or something similar is preferable, and is used in products such as PGP (as far as I know) and PasswordSafe. Unfortunately, the (deliberate) time penalty may be too much for a webserver, in which case enforcing minimum password complexity with (i) may be more viable.
Note: I don't think (could be wrong, it's been known to happen) there is *any* loss of entropy from applying the hash function multiple times (i.e. "double hashing", or PBKDF), although I would probably need to spend some time on a proof to check. MD5, for example, pads its input with a single 1 bit followed by zeros up to bit 448, then appends a 64-bit value indicating the length, to give a 512-bit input block. Given that all the functions within MD5 are bijective, and that the only variable quantity in the repeated input is the 128-bit previous output (the rest is constant), H o H( x ) should also be bijective, so H o H o ... o H( x ) is also bijective, and hence there is no loss of entropy. Most MD-style hash functions also do not produce small cycles particularly easily, despite being poor on other fronts.
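For concreteness, the iterated construction that note is talking about looks like this (my own helper, using MD5 purely because it's the example above - each round feeds the previous 128-bit digest back in as the input):

```python
import hashlib

def iterate_md5(data, n):
    """Apply MD5 n times; after the first round, each input is the
    previous 16-byte digest, so the variable part of the padded block
    is exactly the 128 bits the note describes."""
    for _ in range(n):
        data = hashlib.md5(data).digest()
    return data
```

Whether or not the bijectivity argument holds, this is the shape of the "double hashing" / PBKDF inner loop being discussed.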
I'm glad you've written an article on this - it needs to be done. Even coders of very popular software can repeatedly get it wrong: http://www.lightbluetouchpaper.org/2008 ... erability/