1
votes

I have eight 8-bit values stored in a 64-bit integer. The MSB of each byte can either be 1 or 0, and the rest of their bits are all 0. Example:

MSB 10000000 00000000 10000000 ... 10000000 00000000 00000000 LSB

I now need to find the index of first byte that has its bit set. First meaning that we search from the least significant direction. In the above example the result would be 2.

Using de Bruijn we could scan for the first set bit and divide by 8 to get its byte index.

Here's my question: de Bruijn is generic, it works for any input. But in my use case we are limited to bytes having only their MSB set. Is it possible to optimize for this case?

The implementation is in C++. I can't use any intrinsics or inline assembly (_BitScanForward64(), __builtin_clzll etc).

2
C++20 added countr_zero and popcount which would be the best solution (if possible)aqrit

2 Answers

0
votes

(Edit) Isolate the lowest set bit x &= (-x) then see How to find the position of the only-set-bit in a 64-bit value using bit manipulation efficiently? which is examining this exact problem (despite the title).

The answers below are slightly more general.


A couple cycles of latency could be saved over the de Bruijn bitscan by eliminating the table lookup.

uint64_t ByteIndexOfLowestSetBit(uint64_t val) {
    assert(val != 0);
    const uint64_t m = UINT64_C(0x0101010101010101);
    return ((((val - 1) ^ val) & (m - 1)) * m) >> 56;
}

Use trailing bit manipulation to get a mask covering the lowest set bit and below. Set each byte covered by the mask to 1. Count how many 1 bytes we have by prefix-summing them horizontally. We now have placed a 1-based byte index into the most significant byte of the u64 word. Shift the count to the bottom and subtract 1 to get a 0-based index. However, we don’t want the -1 on the critical path... so instead subtract 1 from m so we never count the least significant byte in the total.


The problem of finding the highest set MS1B is more complicated because we don't have any bit-manipulation tricks to isolate the bit wanted. In that case, Extract Bits with a Single Multiplication, use them as an index into a table. If a input value of zero is not allowed then the value of the least significant byte either doesn't matter or is non-zero. This allows the use of a lookup table with 7-bit indices instead of 8-bits.

Adapt as needed.

uint64_t ReversedIndexOf_Highest_Byte_With_LSB_Set (uint64_t val) {
    static const unsigned char ctz7_tab[128] = {
        7, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0,
        4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 
        5, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 
        4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 
        6, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 
        4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 
        5, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 
        4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 
    };
    assert(val != 0);
    assert((val & 0xFEFEFEFEFEFEFEFEULL) == 0);
    val = (val * UINT64_C(0x0080402010080402)) >> 57;
    return ctz7_tab[val];
}
0
votes

Here a simple way.

int LeastSignificantSetBitByteIndex(long value)
{
    if((value & 0x80) != 0) return 0;
    if((value & 0x8000) != 0) return 1;
    if((value & 0x800000) != 0) return 2;
    if((value & 0x80000000L) != 0) return 3;
    if((value & 0x8000000000L) != 0) return 4;
    if((value & 0x800000000000L) != 0) return 5;
    if((value & 0x80000000000000L) != 0) return 6;
    if((value & 0x8000000000000000L) != 0) return 7;
    return -1;
}
    
int MostSignificantSetBitByteIndex(long value)
{
    if((value & 0x8000000000000000L) != 0) return 0;
    if((value & 0x80000000000000L) != 0) return 1;
    if((value & 0x800000000000L) != 0) return 2;
    if((value & 0x8000000000L) != 0) return 3;
    if((value & 0x80000000L) != 0) return 4;
    if((value & 0x800000) != 0) return 5;
    if((value & 0x8000) != 0) return 6;
    if((value & 0x80) != 0) return 7;
    return -1;
}