2
votes

Hazelcast in-memory map loses data when some nodes suddenly die

For example, in a 3-node cluster we have a shared map with 3 entries (A, B, C). With the default backup count of 1, those entries are distributed evenly across the nodes, for example:

Entry A: node 1, node 2
Entry B: node 2, node 3
Entry C: node 1, node 3

When node 2 and node 3 die at the same time, before node 1 can finish repartitioning, entry B is lost forever.
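To make the failure mode concrete, here is a minimal sketch in plain Java (no Hazelcast involved). It only models the placement listed above: an entry is lost when every node holding a copy of it has failed. The class name `BackupLossSketch` and the helper `lostEntries` are made up for illustration and are not part of any Hazelcast API.

```java
import java.util.*;

public class BackupLossSketch {

    // Returns the entries whose copies (owner plus backups) all lived on failed nodes.
    static Set<String> lostEntries(Map<String, Set<Integer>> placement, Set<Integer> failedNodes) {
        Set<String> lost = new TreeSet<>();
        for (Map.Entry<String, Set<Integer>> entry : placement.entrySet()) {
            // An entry survives if at least one node holding it is still alive.
            if (failedNodes.containsAll(entry.getValue())) {
                lost.add(entry.getKey());
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        // Placement from the example: one owner and one backup per entry.
        Map<String, Set<Integer>> placement = new LinkedHashMap<>();
        placement.put("A", new HashSet<>(Arrays.asList(1, 2)));
        placement.put("B", new HashSet<>(Arrays.asList(2, 3)));
        placement.put("C", new HashSet<>(Arrays.asList(1, 3)));

        // Node 2 and node 3 fail together, before any repartitioning happens.
        Set<Integer> failed = new HashSet<>(Arrays.asList(2, 3));
        System.out.println(lostEntries(placement, failed)); // prints [B]
    }
}
```

With a single failed node the returned set is empty, which matches what a backup count of 1 is meant to protect against.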

Any ideas on how to overcome this problem?

  1. We tried changing the map backup count from 1 to 3. That works for the 3-node case above, but the problem can still occur in larger clusters.
  2. Is there any way to force the map to be backed up on every node, e.g. by setting the backup-count to the number of nodes?
  3. We don't persist the data to storage.
2
If more than <backup> nodes fail within a short period of time, then you are screwed. You need to either give them time to rebalance, up the backup count, or work with data that you can recreate from persistent storage. - vikingsteve

2 Answers

2
votes

In Hazelcast 3 you can configure the number of synchronous and asynchronous backups.
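As a rough sketch of what that configuration might look like with the Hazelcast 3 programmatic API: the map name `shared-map` and the counts below are placeholders chosen for illustration, not values from the question. As far as I know, Hazelcast caps the total number of backups per map at 6, so the backup count cannot simply follow an arbitrary cluster size.

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.MapConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class BackupConfigSketch {
    public static void main(String[] args) {
        Config config = new Config();

        // "shared-map" is a hypothetical map name used only for this sketch.
        MapConfig mapConfig = config.getMapConfig("shared-map");
        mapConfig.setBackupCount(2);      // synchronous backups: a write waits for these copies
        mapConfig.setAsyncBackupCount(1); // asynchronous backups: copied without blocking the write

        // Start a member with this configuration and use the map as usual.
        HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);
        instance.getMap("shared-map").put("A", "value");
    }
}
```

With these example values each entry has one owner and three backups, so it should survive up to three simultaneous node failures, at the cost of extra memory and slower writes.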

But if you want high availability, you need to add more machines; the more you add, the higher the availability will be.

We have functionality for offloading to a database, for example through the MapStore and MapLoader interfaces. We do not (yet) have out-of-the-box persistence to disk.

0
votes

As of 2.1, their distributed Queue supports custom backup counts, and queues are backed by distributed maps. I'm not sure whether that is still the case in the latest version. Please take a look here: http://www.hazelcast.com/docs/2.1/manual/single_html/#Queue