0
votes

If I understand correctly, once an item is put inside a guava bloom filter, mightContain will always return true. If the filter returns false on mightContain, then the value has never been put inside the filter. What I'm wondering is for the values that might be a false positive at a given moment, as more values are put in, the once false-positives might become true-negatives later on (if they are not put in, of course).

Something like this:

GuavaBloomFilter<Integer> bf = new GuavaBloomFilter<>(blah, blah);
# if I start checking, none of the values should return tru at the monent
System.out.println(bd.mightContain(5)); // false
System.out.println(bd.mightContain(10)); // false
System.out.println(bd.mightContain(15)); // false
# fine
# let's put in a value now
bf.put(10);
System.out.println(bd.mightContain(5));
System.out.println(bd.mightContain(10)); // true, every time from now on
System.out.println(bd.mightContain(15));

On the last 3 checks, when checking for 10, it will always return true. For 5 and 15, it might return true. Suppose that for 5 we get false (never put inside), for 15 we get a false positive.

So, we continue:

bf.put(5);
System.out.println(bd.mightContain(5)); // true, every single time from now on
System.out.println(bd.mightContain(10)); // true, every time from now on
System.out.println(bd.mightContain(15));

So.... now, when checking for 5, we will always get true. Is it possible that because of the state change inside the bloom filter, the result for checking 15 which was previously a false-positive, might return a true-negative value?

1

1 Answers

1
votes

For a true Bloom filter, the bits only ever go from 0 to 1, never back - so the result of a mightContain call can only ever go from false to true, never back, because mightContain returns true if a certain subset of all bits are 1, and once they're 1 they'll stay 1.

Guava's implementation is indeed a true Bloom filter, since the BloomFilter.put method (source) delegates to Strategy.put (source), an interface implemented in BloomFilterStrategies (source). The Bloom filter's bits are stored in a LockFreeBitArray named bits, and the strategy only calls its bitSize, set and get methods. Of those, only set changes bits (source), and it only uses the bitwise 'or' operator | to change them. This can never change a 1 back to a 0.

So, it is indeed impossible for a value which was previously a false-positive to later become a true-negative.