How to create a hash function for use in noise generation

Question

I am fooling around with noise functions lately, namely perlin noise and simplex noise. I do not have any questions about the noise algorithms since I've got them up and running, but I have some question about the hash method that ken perlin uses:

Some context:

So in these noise functions you need a (pseudo-)random value for each coordinate. In Ken Perlins implementation he uses a lookup table to get these values:

static const U8 perlin_hash_values[] = {
151,160,137, 91, 90, 15,131, 13,201, 95, 96, 53,194,233,  7,225,
140, 36,103, 30, 69,142,  8, 99, 37,240, 21, 10, 23,190,  6,148,
247,120,234, 75,  0, 26,197, 62, 94,252,219,203,117, 35, 11, 32,
57,177, 33, 88,237,149, 56, 87,174, 20,125,136,171,168, 68,175,
74,165, 71,134,139, 48, 27,166, 77,146,158,231, 83,111,229,122,
60,211,133,230,220,105, 92, 41, 55, 46,245, 40,244,102,143, 54,
65, 25, 63,161,  1,216, 80, 73,209, 76,132,187,208, 89, 18,169,
200,196,135,130,116,188,159, 86,164,100,109,198,173,186,  3, 64,
52,217,226,250,124,123,  5,202, 38,147,118,126,255, 82, 85,212,
207,206, 59,227, 47, 16, 58, 17,182,189, 28, 42,223,183,170,213,
119,248,152,  2, 44,154,163, 70,221,153,101,155,167, 43,172,  9,
129, 22, 39,253, 19, 98,108,110, 79,113,224,232,178,185,112,104,
218,246, 97,228,251, 34,242,193,238,210,144, 12,191,179,162,241,
81, 51,145,235,249, 14,239,107, 49,192,214, 31,181,199,106,157,
184, 84,204,176,115,121, 50, 45,127,  4,150,254,138,236,205, 93,
222,114, 67, 29, 24, 72,243,141,128,195, 78, 66,215, 61,156,180,

151,160,137, 91, 90, 15,131, 13,201, 95, 96, 53,194,233,  7,225,
140, 36,103, 30, 69,142,  8, 99, 37,240, 21, 10, 23,190,  6,148,
247,120,234, 75,  0, 26,197, 62, 94,252,219,203,117, 35, 11, 32,
57,177, 33, 88,237,149, 56, 87,174, 20,125,136,171,168, 68,175,
74,165, 71,134,139, 48, 27,166, 77,146,158,231, 83,111,229,122,
60,211,133,230,220,105, 92, 41, 55, 46,245, 40,244,102,143, 54,
65, 25, 63,161,  1,216, 80, 73,209, 76,132,187,208, 89, 18,169,
200,196,135,130,116,188,159, 86,164,100,109,198,173,186,  3, 64,
52,217,226,250,124,123,  5,202, 38,147,118,126,255, 82, 85,212,
207,206, 59,227, 47, 16, 58, 17,182,189, 28, 42,223,183,170,213,
119,248,152,  2, 44,154,163, 70,221,153,101,155,167, 43,172,  9,
129, 22, 39,253, 19, 98,108,110, 79,113,224,232,178,185,112,104,
218,246, 97,228,251, 34,242,193,238,210,144, 12,191,179,162,241,
81, 51,145,235,249, 14,239,107, 49,192,214, 31,181,199,106,157,
184, 84,204,176,115,121, 50, 45,127,  4,150,254,138,236,205, 93,
222,114, 67, 29, 24, 72,243,141,128,195, 78, 66,215, 61,156,180};

The coordinates are mapped to the values like this:

return perlin_hash_values[perlin_hash_values[y & 255] + x & 255];

I have some problems with that: It eats memory, and repeats very fast, and is not seedable (at least the reference implementation. I guess you could just use the 3D Perlin noise with the z value used as the seed) and I wanted to know if one could go faster. (probably not since the array is likely in cache and its hard to beat 2 memory reads and 3 instructions). I'm doing this mostly for fun and to learn some stuff.

My question is:

What would be a good approach to replace this method with an algorithm? The desirable properties would probably be that it's fast, causes no visual artifacts and doesn't repeat as fast as this array.

I first thought of using a standard hash function, but I discovered 2 properties that are not really needed from them: Hash functions usually take arbitrary long inputs, which is not required here. They also try to avoid collisions (the same output for different inputs) which is not required here either. At least if I am right in thinking that there is probably no strong correlation between visual artifacts and collision. These two factors make standard hash functions seems unnecesairily complicated. All that is needed is a mapping from N dimensions to 1 dimension that looks random.

What I have tried so far:

Adopting the xorshift* pseudo-random number generator: I set the internal 64 bit state to be the value of x and y concatenated.

F32 xor_shift_star_adaption(S32 x, S32 y, U32 seed) {
    union concat_S32
    {
        S32 s[2];
        U64 u;
    };
    concat_S32 temp;
    temp.s[0] = x;
    temp.s[1] = y;
    xorshift_star_64 xor;
    xor.state = temp.u;
    return xor.get_random_number();
}

It produces a LOT of artifacts I have not enough reputation to post images :/. I suspect that pseudo-random number generators are not well suited for this task since I heard that they usually need some time to "warm up". The internal state advances itsself, and the longer it had time to do it the higher the quality of the random numbers get. With this approach, I prevent the prng from "warming up". On the other hand: The state can be thought of as a linear sequence of numbers. If I initialize the prng with a number that is a long way in this linear sequence, should I not get "high quality" numbers immediately?

Using this super simple one liner someone suggested (now not a one liner anymore):

F32 internet_oneliner(S32 x, S32 y, U32 seed){
    F32 xf = x;
    F32 yf = y;

    F32 dot = xf * 12.9898 + yf * 78.233;
    F32 sin = sinf(dot)  * 43758.5453;
    F32 fract = sin - floorf(sin);
    return fract;
}

It performs pretty well, but has some expensive operations in it.

Using the Murmur 3 hash function, which seems to be (?) the best "normal" hash function I could find. (I wont post the code here, its big). It also performs really well but also takes a very long time to get the job done.

Trying to do some xors and bitshifting stuff with primes, because I think thats how that prngs work :D. As you can imagine, it looks horrible (closer to abstract art than random noise). Creating a prng seems pretty impossible without some knowledge about them.

So, do you have any suggestions on what to try next? I think generally speaking I am more looking for a hash function than a prng, because it is more in line with the goals I am trying to achieve here.

(Im sorry for the lack of links and images, the reputation system is really screwing me over here)

Why don't you just use xorshift* the regular way, so without putting the coordinates in it? — harold
Because that would make it just an rng. An advantage of a noise function is that you get the same value for a given coordinate. If I just used the sequence of the numbers that a prng gives me, the value of a point would not be dependant on the point, but on the sequence of points that came before it. — pulp_user
Beware which hash functions you're thinking about: some of them are carefully designed to run as slowly as possible. — Richard
How about standard C++11 random generators, like Mersenne Twister or Knuth-B generator? — yeputons
I haven't heard about knuth-b yet, so I might give that a try. Concerning the Mersenne Twister: It would probably be 10 times slower than the xorshift* (it is a beast) which is undesirable. I picked xorshift because I read it was a good prng (on par with Mersenne). On the other hand, prng tests are usually not really testing my case here: resetting the prng for every number you want from it. — pulp_user

geza geza · Accepted Answer · 2017-06-16T22:35:08

Two possible solutions:

a) Use a 2D random table. If it fits into L1 cache, it will be fast.

b) Use an integer hash: http://burtleburtle.net/bob/hash/integer.html

For example, this:

uint32_t hash( uint32_t a)
    a = (a ^ 61) ^ (a >> 16);
    a = a + (a << 3);
    a = a ^ (a >> 4);
    a = a * 0x27d4eb2d;
    a = a ^ (a >> 15);
    return a;
}

Use (hash((y<<8)+x)&0xff).

And even, as it is a full 32-bit hash, and you only need 8-bit output, you can find a faster variant. Maybe just removing some operations from this hash function :) (and you can try not just the lowest 8-bits, but the highest 8-bits, maybe it has better properties)

How to create a hash function for use in noise generation

Some context:

My question is:

What I have tried so far:

1 Answers