0
votes

I'm quite new to Apache Ignite so please be gentle. My question is simple. If I have a replicated cache using Apache Ignite. I write to this cache key 123. My cluster has 10 nodes.

First question is:

Does replicated cache mean that before the "put" call comes back the key 123 must be written to all 10 nodes? Or does the call come back immediately and the replication is done behind the scenes?

Second question is: Lets say key 123 is written on Node 1. It's now being replicated to all other nodes. However a few microseconds later Node 2 tries to write key 123 with a different value. Do I now have a race condition? Or does Ignite somehow handles this situation in such a way where Node 2's attempt to write key 123 won't happen until Node 1's "put" has replicated across all nodes?

For some context, what I'm trying to build is a de-duplication system across a cluster of API machines. I was hoping that I would be able to create a hash of my API request (with only values that make the request unique) and write it to the Ignite Cache. The API request would only proceed if the cache does not already contain the unique hash (possibly created by a different API instance). Of course the cache would have an eviction policy to evict these cache keys after a few seconds because they won't be needed anymore.

1

1 Answers

1
votes

REPLICATED cache is the same as PARTITIONED with infinite number of backups and some optimizations. So it has primary partitions that distributed across nodes according to affinity function.

Now when you perform update, request comes to primary node, and primary node, in it's turn, updates all backups. Property CacheConfiguration.setWriteSynchronizationMode() is responsible for the way in which entries will be updated. By default it's PRIMARY_SYNC, which means that thread which calls put() will wait only for primary partition update, and backups will be updated asynchronously. If you set it to FULL_SYNC, thread will be released only when all backups updated.

Answering your second question, there will not be a race condition, because all requests will come to primary node.

Additionally to your clarification, if backup node wasn't updated yet, get() request will go to primary node, so in PRIMARY_SYNC mode you'll never get null if primary partition has a value.