OK, let me put this bluntly: if you're putting user data, or anything derived from user data into a cookie for this purpose, you're doing something wrong.
There. I said it. Now we can move on to the actual answer.
What's wrong with hashing user data, you ask? Well, it comes down to exposure surface and security through obscurity.
Imagine for a second that you're an attacker. You see a cryptographic cookie set for the remember-me on your session. It's 32 characters wide. Gee. That may be an MD5...
Let's also imagine for a second that they know the algorithm that you used. For example:
md5(salt+username+ip+salt)
Now, all an attacker needs to do is brute force the "salt" (which isn't really a salt, but more on that later), and he can now generate all the fake tokens he wants with any username for his IP address! But brute-forcing a salt is hard, right? Absolutely. But modern day GPUs are exceedingly good at it. And unless you use sufficient randomness in it (make it large enough), it's going to fall quickly, and with it the keys to your castle.
In short, the only thing protecting you is the salt, which isn't really protecting you as much as you think.
But Wait!
All of that was predicated that the attacker knows the algorithm! If it's secret and confusing, then you're safe, right? WRONG. That line of thinking has a name: Security Through Obscurity, which should NEVER be relied upon.
The Better Way
The better way is to never let a user's information leave the server, except for the id.
When the user logs in, generate a large (128 to 256 bit) random token. Add that to a database table which maps the token to the userid, and then send it to the client in the cookie.
What if the attacker guesses the random token of another user?
Well, let's do some math here. We're generating a 128 bit random token. That means that there are:
possibilities = 2^128
possibilities = 3.4 * 10^38
Now, to show how absurdly large that number is, let's imagine every server on the internet (let's say 50,000,000 today) trying to brute-force that number at a rate of 1,000,000,000 per second each. In reality your servers would melt under such load, but let's play this out.
guesses_per_second = servers * guesses
guesses_per_second = 50,000,000 * 1,000,000,000
guesses_per_second = 50,000,000,000,000,000
So 50 quadrillion guesses per second. That's fast! Right?
time_to_guess = possibilities / guesses_per_second
time_to_guess = 3.4e38 / 50,000,000,000,000,000
time_to_guess = 6,800,000,000,000,000,000,000
So 6.8 sextillion seconds...
Let's try to bring that down to more friendly numbers.
215,626,585,489,599 years
Or even better:
47917 times the age of the universe
Yes, that's 47917 times the age of the universe...
Basically, it's not going to be cracked.
So to sum up:
The better approach that I recommend is to store the cookie with three parts.
function onLogin($user) {
$token = GenerateRandomToken(); // generate a token, should be 128 - 256 bit
storeTokenForUser($user, $token);
$cookie = $user . ':' . $token;
$mac = hash_hmac('sha256', $cookie, SECRET_KEY);
$cookie .= ':' . $mac;
setcookie('rememberme', $cookie);
}
Then, to validate:
function rememberMe() {
$cookie = isset($_COOKIE['rememberme']) ? $_COOKIE['rememberme'] : '';
if ($cookie) {
list ($user, $token, $mac) = explode(':', $cookie);
if (!hash_equals(hash_hmac('sha256', $user . ':' . $token, SECRET_KEY), $mac)) {
return false;
}
$usertoken = fetchTokenByUserName($user);
if (hash_equals($usertoken, $token)) {
logUserIn($user);
}
}
}
Note: Do not use the token or combination of user and token to lookup a record in your database. Always be sure to fetch a record based on the user and use a timing-safe comparison function to compare the fetched token afterwards. More about timing attacks.
Now, it's very important that the SECRET_KEY
be a cryptographic secret (generated by something like /dev/urandom
and/or derived from a high-entropy input). Also, GenerateRandomToken()
needs to be a strong random source (mt_rand()
is not nearly strong enough. Use a library, such as RandomLib or random_compat, or mcrypt_create_iv()
with DEV_URANDOM
)...
The hash_equals()
is to prevent timing attacks.
If you use a PHP version below PHP 5.6 the function hash_equals()
is not supported. In this case you can replace hash_equals()
with the timingSafeCompare function:
/**
* A timing safe equals comparison
*
* To prevent leaking length information, it is important
* that user input is always used as the second parameter.
*
* @param string $safe The internal (safe) value to be checked
* @param string $user The user submitted (unsafe) value
*
* @return boolean True if the two strings are identical.
*/
function timingSafeCompare($safe, $user) {
if (function_exists('hash_equals')) {
return hash_equals($safe, $user); // PHP 5.6
}
// Prevent issues if string length is 0
$safe .= chr(0);
$user .= chr(0);
// mbstring.func_overload can make strlen() return invalid numbers
// when operating on raw binary strings; force an 8bit charset here:
if (function_exists('mb_strlen')) {
$safeLen = mb_strlen($safe, '8bit');
$userLen = mb_strlen($user, '8bit');
} else {
$safeLen = strlen($safe);
$userLen = strlen($user);
}
// Set the result to the difference between the lengths
$result = $safeLen - $userLen;
// Note that we ALWAYS iterate over the user-supplied length
// This is to prevent leaking length information
for ($i = 0; $i < $userLen; $i++) {
// Using % here is a trick to prevent notices
// It's safe, since if the lengths are different
// $result is already non-0
$result |= (ord($safe[$i % $safeLen]) ^ ord($user[$i]));
}
// They are only identical strings if $result is exactly 0...
return $result === 0;
}