1
votes

I have a sparse bitset that may be many millions or even billions of bits wide. Assuming that the bitset is already efficiently compressed, and assume that I can also already efficiently query the bitset to see exactly how many bits are set in in some given range (ie, position and length).

Given this, can I efficiently find the position of the k'th set bit in the bitset, or else efficiently give an indication that it doesn't exist? A description of the algorithm that is programming language neutral would be ideal. Assume that I cannot change any of the internals of the bitset implementation... that is, the only things I can do with the bitset are to query its total width, and to ask it how many bits are set in any given range.

1
Are there any stipulations on how big the range can be when checking how many bits are set? It might help if you tell us the time complexity of that operation. - arshajii
Your description suggests a modifed binary search: recursive bisection narrows the interval in which the k'th bit exists. You need the number of set bits 'left' to the currently tested interval as an additional parameter (you know this number from previous recursion levels). - collapsar

1 Answers

3
votes

If you can efficiently query the number of bits set in every range, you can perform binary search on #set_bits(0,i) to find the first index where this value equals k.

It will take O(log(n)*f(n)), where f(n) is the complexity of #set_bits(0,i) op.