When a user registers on our site, we follow some well-defined security standards:
- generate a salt
- hash(salt + password)
- store the hashed password and the salt in the DB
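
A minimal sketch of the three steps above, for illustration only. The post does not name the hash function or salt length, so SHA-256 and a 16-byte salt are assumptions here, and `hash_password` / `verify_password` are hypothetical names. In practice a deliberately slow function such as `hashlib.pbkdf2_hmac` or `hashlib.scrypt` is the usual choice for password storage.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) where digest = hash(salt + password)."""
    if salt is None:
        # Step 1: generate a fresh random salt (16 bytes is an assumed length).
        salt = os.urandom(16)
    # Step 2: hash(salt + password). SHA-256 is an assumption, not the site's actual choice.
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    # Step 3: the caller stores both values in the DB.
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Only the salt and the digest are kept, which is why the stored data cannot be used later to recover anything about the password's structure.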
Recently, "password topologies" have become a hot topic. For example, we should prevent users from choosing a password with a topology like Ullllldd (one uppercase letter, five lowercase letters, two digits), because such topologies are very common and therefore easier to crack by brute force if the attacker focuses on just that topology. We would now like to gather statistics on the most common topologies used on our site. Obviously we cannot use the hashed passwords in our DB, since the password cannot be recovered from the hash. We had the idea of keeping a topology table that is updated whenever a new user registers. For example:
User "joe" registers with password "pass". We check whether the topology "llll" already exists. If it doesn't, we add it to the database table with count = 1; if it does, we increase the count by 1. This is of course a bit risky: if our DB gets compromised, the attacker immediately knows our most common topologies!
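
The counting idea can be sketched roughly as follows. This is only an illustration of the scheme as described, not a recommendation: an in-memory `Counter` stands in for the database table, the names `topology`, `topology_counts` and `record_registration` are hypothetical, and mapping every non-letter, non-digit character to `s` is an assumption (the post only defines U, l and d).

```python
from collections import Counter


def topology(password: str) -> str:
    """Map each character to its class: U = uppercase, l = lowercase, d = digit."""
    classes = []
    for ch in password:
        if ch.isupper():
            classes.append("U")
        elif ch.islower():
            classes.append("l")
        elif ch.isdigit():
            classes.append("d")
        else:
            classes.append("s")  # assumed symbol for any other character
    return "".join(classes)


# Stand-in for the topology table: topology string -> number of registrations.
topology_counts: Counter = Counter()


def record_registration(password: str) -> None:
    """Called once at registration time, while the plaintext is still available."""
    # A missing key starts at 0, so this covers both "insert with count = 1"
    # and "increase the count by 1".
    topology_counts[topology(password)] += 1


print(topology("Passwo12"))  # Ullllldd
```

Note that the table stores only aggregate counts and no link to a user, but it still exposes exactly the ranking that the risk above is about.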
So my question is: how can we collect and store password topology information without creating a new security risk?