Suppose the BitSet
from java.util.BitSet;
is used. The objective is to quickly find all the bit values that are set to true
. These values are not in order and without a particular pattern. The maximum index of the BitSet
will be 2^31 - 48
. The total number of bits that will be set to true
is (2^31 - 48)/2
. In other words, there are two billion bits that can be true
/false
, how can I find all the true
bits efficiently?
Each time a bit is set to true
, a run is required to visit all the true
bits in the BitSet
. You can see why looping through all the 2^31 - 48
bits every time isn't as efficient when it comes to performance.
Here is a solution that doesn't fit my need: create an int[] indices
of size (2^31 - 48)/2
and every time a bit i
is set to true
, store the value i
in the next available slot in indices
. While this helps in achieving the request, it would add about 32 * (2^31 - 48)/2
bits into memory which is around 4.3Gb.
The focus in on performance and repetitive computation. Using input/output files or something other than BitSet
is not desired.
What is the fastest approach to achieve the desired behaviour? Or... what is a sufficiently quick approach that also uses significantly less memory?
int[] indices
): instead of using absolute addresses, you could store the index-delta to the next index as a byte, or depending on the propability, on even less bits. And if it so happens that the delta is bigger than what you can store, have an indicator value that indicates a bigger jump and needs more data read, basically like UTF-8 does bytes-to-char conversions. - JayC667stream
which produces the objective - have you looked at its implementation? docs.oracle.com/javase/8/docs/api/java/util/… - Andy