The format is still a bit vague, but here's a way to do something like this. I'm making some assumptions that make the problem easier, namely:
- At most 32 bits are extracted at once (so it fits in a
Number without weird hacks)
- Bits are in an
Uint32Array (or compatible storage, as long as it has 32 bits per element)
- The least significant bit of the 0th entry of the array is number 0 overall
- The bit string represented this way is
... + tobits(array[1]) + tobits(array[0]) for example [ 0, 256 ] represents 00000000000000000000000100000000_00000000000000000000000000000000 (the underscore indicates the boundary between the pieces). Maybe that's the wrong way around, it can be changed but this way is simple.
The ith bit is in the i >> 5-th (aka i / 32 but with integer division) word, at offset i & 31 (aka i % 32) within that word. That's what makes this order simple to work with.
By the first assumption, at most 2 entries/words in the array are spanned by the range, so there are only two cases:
- The bottom of the range is in one word and the top is in the next.
- The range is wholly contained in a single word. Touching a second word should be avoided, as it might be beyond the end of the array. Also even if the second word can be touched, it wouldn't be as easy to deal with the excess bits because shift counts are taken modulo 32 so
high << 32 wouldn't do the trick.
In code (not tested)
function extractRange(bits, begin, end) {
// extract bits [begin .. end],
// begin==end extracts a single bit
var beginIdx = begin >> 5;
var endIdx = end >> 5;
var beginOfs = begin & 31;
var endOfs = end & 31;
var len = end - begin + 1;
var msk = -1 >>> (32 - len);
console.assert(len > 0 && len <= 32);
if (beginIdx == endIdx) {
// begin and end are in the same word
// discard the bits before the begin of the range and mask off the high bits
return ((bits[endIdx] >>> beginOfs) & msk) >>> 0;
}
else {
var low = bits[beginIdx];
var high = bits[endIdx];
// the high word contains the high bits of the result, in its lowest bits
// the low word contains the low bits of the result, in its highest bits:
// xxxxhhhh_llllxxxx
// high word must be shifted left by the number of bits in the low word
var bitsInLow = 32 - beginOfs;
return (((low >>> beginOfs) | (high << bitsInLow)) & msk) >>> 0;
}
}
Examples:
[0xdeadbeef, 0xcafebabe] means that the string is really 0xcafebabedeadbeef (in bits)
extractRange([0xdeadbeef, 0xcafebabe], 0, 31).toString(16) =
deadbeef
extractRange([0xdeadbeef, 0xcafebabe], 4, 35).toString(16) =
edeadbee
extractRange([0xdeadbeef, 0xcafebabe], 8, 39).toString(16) =
bedeadbe
extractRange([0xdeadbeef, 0xcafebabe], 60, 63).toString(16) =
c
extractRange([0xdeadbeef, 0xcafebabe], 30, 33).toString(16) =
b // ...ed... in binary 11101101, taking the middle 4 bits, 1011 = b
For non-contiguous ranges, you could extract every individual range and then append them. I don't think there is a nicer way in general.