I'ts OK - I figured it out
But still your previous post and you 4-th post requirements don't match ... I really can't understand what you are trying to achieve
PS: I'm not quite sure but *maybe* the strpos() test function I used in the first example is closer to what you are looking for... It seems to me that you are looking for maximum unique substr ...
Last edited by VladSun on Thu Jan 31, 2008 4:51 pm, edited 2 times in total.
There are 10 types of people in this world, those who understand binary and those who don't
Ok, maybe some "purpose" information would be helpful. I'd like to be able to reduce full SHA-1 hashes down to abbreviated, unique versions, i.e. 86f7e437faa5a7fce15d1ddcb9eaeaea377667b8 to 86f7e437f. So, let's suppose we have a database of pre-existing hashes, I'd like to retrieve both the hash and the compact version.
So, this means a number of important conditions:
1. Hashes must be calculated on the fly, or re-calculated with every new hash inserted into the table. This is because any new hash added to the database gives the possibility of a new collision and a short ID that is now invalid because it is ambiguous.
2. We must be able to retrieve the original hash with the shortened hash (ideally, we'd be able to get multiple hashes for this function, so we can present a disambiguation request if a user specifies an ambiguous prefix).
3. Abbreviated hashes should have a minimum length, to prevent excessive collisions.
Hm... in order to have all of these conditions satisfied you need to store in the DB both the shortened hashes and the full length original hashes ... So, it doesn't make sense to have the shortened version, while you have the full version of the hash (and it is a must).
There are 10 types of people in this world, those who understand binary and those who don't
It's more convenient for display. In this particular case, we're referring to a distributed revision control system, where an auto increment ID isn't feasible and changesets are referred to using hashes.