
I have to count unique entries from a stream of transactions using Redis. At least 1K jobs will concurrently check whether a transaction is unique and, if it is, store the transaction type as the key with an incremented counter as the value. This counter is shared by all threads. Suppose every thread does the following:

  • Check if the key exists: exists(transactionType)
  • Increment the counter: val count = incr(counter)
  • Set the new value: setnx(transactionType, count)

This creates two problems.

  • The counter is incremented unnecessarily when another thread has already set the value for the same transaction type.
  • Each check takes three operations (exists, increment, insert).

Is there a better way to increment the counter and store the value only if the key does not exist? My current code:

    private void checkAndIncrement(String transactionType, Jedis redisHandle) {
        if (transactionType != null) {
            // Not atomic: another thread can create the key between exists() and setnx().
            if (!redisHandle.exists(transactionType)) {
                // The counter is bumped even if setnx() below loses the race.
                long count = redisHandle.incr("t_counter");
                // setnx() only writes when the key is still absent.
                redisHandle.setnx(transactionType, String.valueOf(count));
            }
        }
    }
    

EDIT:

Once a value is created, say T1 = 100, the transaction type should also be identifiable by the number 100. I would therefore have to store a second map with the counter value as key and the transaction type as value.


1 Answer


Two options:

  1. Use a hash. Add each key with HSETNX (set the value to 1 or "" or anything) and use HLEN to get the number of keys in the hash. To start over, delete the whole hash with DEL (HDEL removes only individual fields). You could also use HINCRBY instead of HSETNX to additionally find out how many times each key appears.

  2. Use a HyperLogLog. Use PFADD to insert elements and PFCOUNT to retrieve the count. HyperLogLog is a probabilistic data structure: its memory usage doesn't go up with the number of unique items the way a hash's does (Redis caps it at about 12 KB per key), but the count returned is only approximate (the standard error in Redis is 0.81%).
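
A minimal sketch of option 1, to show how HSETNX and HLEN replace the exists/incr/setnx sequence. The interface `HashCommands`, the in-memory stand-in, the class `UniqueTransactionCounter` and the key name `t_seen` are all assumptions made up for illustration; they only mirror the documented semantics of the two Redis commands so the example runs without a server. It covers counting unique types only, not the numeric id from the EDIT.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical minimal view of the two Redis hash commands option 1 needs. */
interface HashCommands {
    /** Mirrors HSETNX: returns 1 if the field was created, 0 if it already existed. */
    long hsetnx(String key, String field, String value);

    /** Mirrors HLEN: number of fields in the hash stored at key (0 if absent). */
    long hlen(String key);
}

/** In-memory stand-in so the sketch runs locally; this is not a Redis client. */
class InMemoryHashCommands implements HashCommands {
    private final Map<String, Map<String, String>> hashes = new ConcurrentHashMap<>();

    @Override
    public long hsetnx(String key, String field, String value) {
        Map<String, String> hash =
                hashes.computeIfAbsent(key, k -> new ConcurrentHashMap<>());
        // putIfAbsent is atomic, like HSETNX on the server side.
        return hash.putIfAbsent(field, value) == null ? 1L : 0L;
    }

    @Override
    public long hlen(String key) {
        Map<String, String> hash = hashes.get(key);
        return hash == null ? 0L : hash.size();
    }
}

class UniqueTransactionCounter {
    // Assumed key name for the hash of seen transaction types.
    static final String SEEN_KEY = "t_seen";

    private final HashCommands redis;

    UniqueTransactionCounter(HashCommands redis) {
        this.redis = redis;
    }

    /** Records the type in one command; true only for the first caller to record it. */
    boolean record(String transactionType) {
        if (transactionType == null) {
            return false;
        }
        return redis.hsetnx(SEEN_KEY, transactionType, "1") == 1L;
    }

    /** Number of distinct transaction types recorded so far. */
    long uniqueCount() {
        return redis.hlen(SEEN_KEY);
    }
}
```

With a real server, the same two calls would go through the Jedis handle (`hsetnx` and `hlen`), and option 2 would use `pfadd` and `pfcount` in the same places. Because the check and the write are a single command, no separate counter is incremented by threads that lose the race.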