Suppose you were at liberty to decide how hashed passwords were to be stored in a DBMS. Are there obvious weaknesses in a scheme like this one?
To create the hash value stored in the DBMS:
- Take a value that is unique to the DBMS server instance as the first part of the salt.
- Take the username as the second part of the salt.
- Concatenate the salt with the actual password.
- Hash the whole string using the SHA-256 algorithm.
- Store the result in the DBMS.
This would mean that anyone wanting to come up with a collision (a password that hashes to a stored value) would have to do the work separately for each user name and each DBMS server instance. I'd plan to keep the actual hash mechanism somewhat flexible to allow for the use of the new NIST standard hash algorithm (SHA-3) that is still being worked on.
The 'value that is unique to the DBMS server instance' need not be secret - though it wouldn't be divulged casually. The intention is to ensure that if someone uses the same password in different DBMS server instances, the recorded hashes would be different. Likewise, the user name would not be secret - just the password proper.
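To make the proposal concrete, here is a minimal sketch of the scheme as described, written in Python purely for illustration. The function name `hash_password`, the `algorithm` parameter, and the sample server identifier are hypothetical, and plain string concatenation in the order server value, username, password is an assumption about how the parts would be joined.

```python
import hashlib

# Hypothetical per-instance value: not secret, but not casually divulged.
SERVER_INSTANCE_ID = "dbms-instance-0001"


def hash_password(server_id: str, username: str, password: str,
                  algorithm: str = "sha256") -> str:
    """Return the hex digest of salt + password, where the salt is the
    server-instance value followed by the username."""
    salt = server_id + username
    data = (salt + password).encode("utf-8")
    # hashlib.new() takes the algorithm by name, which keeps the choice of
    # hash flexible (for example "sha3_256" where the runtime provides it).
    return hashlib.new(algorithm, data).hexdigest()


if __name__ == "__main__":
    # Same password, different users: the stored values differ.
    print(hash_password(SERVER_INSTANCE_ID, "alice", "correct horse"))
    print(hash_password(SERVER_INSTANCE_ID, "bob", "correct horse"))
```

With this arrangement the same user name and password always produce the same stored value on a given server instance, which is the point the random-salt question is about.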
Would there be any advantage to having the password first and the user name and 'unique value' second, or any other permutation of the three sources of data? Or what about interleaving the strings?
Do I need to add (and record) a random salt value (per password) as well as the information above? (Advantage: the user can re-use a password and still, probably, get a different hash recorded in the database. Disadvantage: the salt has to be recorded. I suspect the advantage considerably outweighs the disadvantage.)
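The random-salt variant could look roughly like the sketch below, again in Python and only as an illustration. The names `make_record` and `verify_password`, the 16-byte salt length, and the order in which the parts are concatenated are all assumptions rather than part of the scheme above; the random salt is returned so that it can be recorded alongside the hash.

```python
import hashlib
import hmac
import secrets


def _digest(server_id: str, username: str, random_salt: bytes,
            password: str) -> bytes:
    # Assumed order: server value, username, random salt, then password.
    data = (server_id.encode("utf-8") + username.encode("utf-8")
            + random_salt + password.encode("utf-8"))
    return hashlib.sha256(data).digest()


def make_record(server_id: str, username: str, password: str):
    """Return (random_salt, digest); both would be stored in the DBMS."""
    random_salt = secrets.token_bytes(16)  # fresh salt per password change
    return random_salt, _digest(server_id, username, random_salt, password)


def verify_password(server_id: str, username: str, password: str,
                    random_salt: bytes, stored_digest: bytes) -> bool:
    candidate = _digest(server_id, username, random_salt, password)
    # Constant-time comparison of the recomputed and stored digests.
    return hmac.compare_digest(candidate, stored_digest)
```

Because the salt is regenerated each time, a user who re-uses a password gets a different recorded hash, at the cost of one extra stored column.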
There are quite a lot of related SO questions - this list is unlikely to be comprehensive:
- Encrypting/Hashing plain text passwords in database
- Secure hash and salt for PHP passwords
- The necessity of hiding the salt for a hash
- Clients-side MD5 hash with time salt
- Simple password encryption
- Salt generation and Open Source software
- Password hashes: fixed-length binary fields or single string field?
I think that the answers to these questions support my algorithm (though if you simply use a random salt, then the 'unique value per server' and username components are less important).