3
votes

I have a 32bit random value (say 631).

0...0000001001110111

Each of these bits is a flag. I would like to return a random flag from these bits in, if at all possible, O(1) operation. How would I select the bit position 0, 1, 2, 4, 5, 6 or 9 (or it's corresponding value 1, 2, 4, 16, 32, 64, 512) from the given value 631? Preferably with as least possible bias towards some bits.

Things I came up with:

    • Shift value right random number of bits (max. 10 in this case)
    • See if LSB is set
      • If it is: got a bit position (last shifted number of bits); done
      • if not:
        • If resulting value == 0; start over
        • If resulting value != 0, go back to shifting random bits again

Above is not O(1) and possibly need multiple iterations if we happen to only 'hit' the 0 bits.

    • Mask (and) with random value
    • Repeat until power of 2 is left or start over when value is 0.

Again, above is not O(1) unfortunately.

I'm pretty sure this must be possible with some bithisfting/masking/magic somehow...


Edit:

As CodeCaster suggested; this will give me all set bit values:

int myvalue = 631;
var matchingmasks = Enumerable.Range(0, 32)
                              .Select(i => 1 << i)
                              .Where(i => (myvalue & i) == i)
                              .ToArray();

From the resulting array I can pick a random element and I'll have found my 'random' bit (flag) from the given value. However, this still requires a (hidden, because Linq) for-loop, 'brute-forcing' each possible bit, memory allocations for the resulting array etc.

6
Any random generation algorithm (including the simplest one) will generate a random value in O(1) - user586399
@:deleted: I don't want the binary representation; please read the question. I want to pick a (one) bit from the set bits and either have it's position (e.g. bit 3) OR it's "value" (e.g. 4 for bit 3) @Kilanny: I don't want a random number; I want a random bit from a given number. - J. Doe
If you have the binary representation then you could manipulate that to get the bits, or the position of the bits. IF I understand correctly you have 631 want as the input and you want the output to be 1, 2, 4 , 16, 32 - Donald Jansen
@DonaldJansen: you could manipulate that to get the bits: yes; the question is how. (also: 64 and 512 are a possible output you left out for the given value of 631). - J. Doe
Also: avoid the temptation to say "that's not the fastest possible". The fastest possible is not your goal. You are probably unwilling to spend even a small amount like ten million dollars to develop custom hardware to solve this problem as fast as possible. Your goal is "fast enough given my budget and other constraints". Since we know neither your performance requirements nor your budget, we don't know what meets that goal. - Eric Lippert

6 Answers

6
votes

First off, I would suggest doing this the easy, straightforward, obvious way that you suggest in the question: make an array of values, choose an element at random. Yes this allocates memory and whatnot. Optimize for the code being readable and correct first; only when you have a demonstrated performance problem should you optimize it.

If you do want to optimize it down to bit twiddling, this page is my go-to resource: http://graphics.stanford.edu/~seander/bithacks.html

The algorithms you'll want here are:

  • first, pick your favourite algorithm for determining the Hamming weight -- that is, "how many bits are on?" Call that number n.
  • Now pick a random number r from 1 to n
  • Now read the algorithm called "select the bit position with the given count". This takes your number r and gives you the bit position of the rth true bit starting from the high end. The code given on the page is for longs; it should be straightforward to modify it for ints.

I note that a key feature of many of these algorithms is that they are branchless. When you are trying to wring the last ounce of performance out of an algorithm, remember that every "if" kills performance. An "if" means that there is code in the cache that is NOT running, because you branched away from it, and therefore you are making a cache miss more likely. An "if" means there is an opportunity for the branch predictor to make the wrong choice. At the CLR level, every "if" means more basic blocks, which means more work for the jitter to do its flow analysis. And so on.

3
votes

You can simply create the masks on beforehand, and select the masks that match the source value:

uint source = 631;
uint[] masks = Enumerable.Range(0, 32).Select(i => (uint)1 << i).ToArray();
uint[] matchingMask = masks.Where(m => (m & source) == m).ToArray();

Now matchingMask contains the values that make up your source value, in this case: 1, 2, 4, 16, 32, 64, 512.

Then from matchingMask you can select a random element.

If you want the bit position instead, you could use the indexing Select() overload like this:

var matchingMask = masks.Select((m, i) => new { Index = i, Mask = m}) 
                        .Where(m => (m.Mask & source) == m.Mask)
                        .ToArray();
1
votes

This is actually sort-of possible. There's a 64-bit solution here.

I've converted it to the following C# code. It's O(1) because the number of operations is not dependent on the number of bits set:

public static uint SelectRandomSetBit(ulong v, Random rng)
{
    ulong a = v - ((v >> 1) & ~0UL / 3);
    ulong b = (a & ~0UL / 5) + ((a >> 2) & ~0UL / 5);
    ulong c = (b + (b >> 4)) & ~0UL / 0x11;
    ulong d = (c + (c >> 8)) & ~0UL / 0x101;
    ulong t = ((d >> 32) + (d >> 48));
    int   n = (int)((d * (~(ulong)0 / 255)) >> (64 - 1) * 8);
    ulong r = (uint) rng.Next(1, n+1);
    ulong s = 64;

    s -= ((t - r) & 256) >> 3;
    r -= (t & ((t - r) >> 8));
    t = (d >> (int)(s - 16)) & 0xff;
    s -= ((t - r) & 256) >> 4;
    r -= (t & ((t - r) >> 8));
    t = (c >> (int)(s - 8)) & 0xf;
    s -= ((t - r) & 256) >> 5;
    r -= (t & ((t - r) >> 8));
    t = (b >> (int)(s - 4)) & 0x7;
    s -= ((t - r) & 256) >> 6;
    r -= (t & ((t - r) >> 8));
    t = (a >> (int)(s - 2)) & 0x3;
    s -= ((t - r) & 256) >> 7;
    r -= (t & ((t - r) >> 8));
    t = (v >> (int)(s - 1)) & 0x1;
    s -= ((t - r) & 256) >> 8;

    return (uint)(s-1);
}

Here's how I tested it:

Random rng = new Random();
ulong number = 0x0101010101010101;
int[] bits = new int[64];

for (int i = 0; i < 1000000; ++i)
    ++bits[SelectRandomSetBit(number, rng)];

for (int i = 0; i < 64; ++i)
    Console.WriteLine($"bit {i} was returned {bits[i]} times.");

You would expect to see every 8th bit returned approximately the same number of times, and none of the other bits returned. This is indeed what happens.

I leave converting this to 32 bit as an interesting exercise. ;)

(This is probably an unnecessary optimisation in any case: A simple loop to count the bits and then randomly choose one would probably be fast enough...)

0
votes

This should be really & simply O(1):

byte b = 123;
Random r = new Random();
int bitNumber = r.Next(32);
var bit = (b & (1 << bitNumber-1)) != 0;

Check this

0
votes

Seems like a homework problem... But, the solution is possible because you have only 32 bits to look for and 32 known-ahead of time values for each position. If I'm not mistaken, these are actually the same (a mask with the second bit set has the value "2" when interpreted as an integer).

What you do is build your 32 entry array with the prepared bit mask that will return only that bit.

The array lookup is O(1) since the speed is constant regardless of which bit you are retrieving. At this point you & and compare with the original mask as you would have done when using bit shifting and the final result will be O(1) still.

Note though that while this is O(1) it may not be faster than bit shifting. The arrays are 32*4 bytes of memory, so 128 bytes. That's not huge but it's not tiny either. You would need to run a simple test to confirm that executing up to 32 bit shift instructions takes more time than retrieving an item from the array (my guess is that the array is faster, but I could be wrong).

0
votes

What about a lookup table?

public static class RandomExtensions
{
    public static uint GetRandomBitOf( this Random rand, uint mask )
    {
        if( mask == 0 ) return 0;
        var lo = smLookup[mask & 0xFFFF];
        var hi = smLookup[mask >> 16];
        int i = rand.Next( lo.Length + hi.Length );
        return i < lo.Length ? (uint) lo[i] : (uint) hi[i - lo.Length] << 16;
    }

    static RandomExtensions()
    {
        smLookup = new ushort[65536][];

        for( int i = 0; i < smLookup.Length; ++i )
        {
            ushort j = (ushort) i;
            smLookup[i] = Enumerable
                .Range( 0, 16 )
                .Select( b => (ushort) ( 1 << b ) )
                .Where( b => ( j & b ) != 0 )
                .ToArray();
        }
    }

    private static ushort[][] smLookup;
}

I'm not sure where this ranks with regards to performance amongst the other answers. I'm just adding this answer mostly for completeness in terms of possible implementations.