I am currently building a POC for a distributed, fault-tolerant ETL ecosystem. I have selected Hazelcast for clustering (data + notification). Searching through Hazelcast resources took me to this link, which exactly matches the map-based solution I had in mind.

I need to understand one point. Before that, here is an outline of our architecture:

Say we have two nodes, A and B, running our server instance and clustered through Hazelcast. One of them, say A, is the listener accepting requests (this role can move on failover).

A gets a request and puts it into a distributed map. This map is write-through, backed by a persistent store, and a single in-memory backup is configured across the nodes.

Each instance has a local map entry listener which, on an entry-added event, processes that entry (asynchronously, via a queue) and then removes it from the distributed map.
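
A minimal sketch of that queue-then-remove flow, using only the JDK. A plain `ConcurrentHashMap` stands in for the distributed map, and `RequestWorker`, `onEntryAdded`, and the upper-casing "processing" step are hypothetical names chosen for illustration, not Hazelcast API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Stand-in for one server instance; not tied to any clustering library. */
final class RequestWorker {
    private final Map<String, String> requests; // stand-in for the distributed map
    private final ExecutorService pool = Executors.newSingleThreadExecutor();
    private final Map<String, String> processed = new ConcurrentHashMap<>();

    RequestWorker(Map<String, String> requests) {
        this.requests = requests;
    }

    /** What an entry-added callback could do: queue the work and return at once. */
    void onEntryAdded(String key, String value) {
        pool.submit(() -> {
            processed.put(key, value.toUpperCase()); // placeholder for the real ETL step
            requests.remove(key);                    // remove only after processing is done
        });
    }

    /** Stops accepting work and waits for the queue to drain. */
    boolean shutdownAndWait(long seconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    Map<String, String> processed() {
        return processed;
    }
}
```

Removing the entry only after processing completes is what leaves unprocessed entries in the backing store if the instance crashes mid-queue.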

This is working as expected.

Question:

Say 10 requests have been received and distributed, with 5 on each node. 2 entries on each node have been processed, and now both instances crash.

So a total of 6 entries (10 received minus the 4 already processed and removed) are present in the backing datastore now.

Now we bring up both the instances. As per documentation - "As of 1.9.3 MapLoader has the new MapLoader.loadAllKeys API. It is used for pre-populating the in-memory map when the map is first touched/used"

We implement loadAllKeys() by simply loading all the keys present in the store.
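
For illustration only, a store of that shape might look like the sketch below. It is an in-memory stand-in: the class name `InMemoryRequestStore` is hypothetical, it does not implement the Hazelcast interfaces, and only the method names are modeled on the `MapLoader`/`MapStore` operations mentioned above:

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory stand-in for the persistent store behind the map. */
final class InMemoryRequestStore {
    private final Map<String, String> rows = new ConcurrentHashMap<>();

    /** Write-through on put. */
    void store(String key, String value) {
        rows.put(key, value);
    }

    /** Write-through on remove, i.e. after an entry has been processed. */
    void delete(String key) {
        rows.remove(key);
    }

    /** Loads a single value, or null if the key is not in the store. */
    String load(String key) {
        return rows.get(key);
    }

    /** Loads the values for the given keys, skipping keys that are absent. */
    Map<String, String> loadAll(Collection<String> keys) {
        Map<String, String> result = new HashMap<>();
        for (String key : keys) {
            String value = rows.get(key);
            if (value != null) {
                result.put(key, value);
            }
        }
        return result;
    }

    /** Returns every key currently in the store (a copy, safe to iterate). */
    Set<String> loadAllKeys() {
        return new HashSet<>(rows.keySet());
    }
}
```

With 10 requests stored and 4 deleted after processing, `loadAllKeys()` on this stand-in returns the 6 remaining keys.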

  1. Does that mean both instances could now load all 6 entries and process them, resulting in duplicate processing? Or is loading coordinated so that it is done only once per cluster?

  2. On server startup I need to process the pending entries (if any). I see that the data is loaded, but the entryAdded event is not fired. How can I make the entryAdded event fire (or is there another elegant way to find out that there are pending entries on startup)?

Requesting suggestions.

Thanks, Sutanu

For the 2nd issue, as of now I have a method that gets the local key set and processes it. This is a one-time invocation just after Hazelcast is initialized. I was wondering if this is the best I could do. - sutanu dalui

1 Answer

  1. On initialization, loadAllKeys() will be called, which will return all 6 keys in the persistent store. Each node will then select only the keys it owns and load those. So A might load 2 entries, while B loads the remaining 4.

  2. store.load doesn't fire entry listeners. How about this: right after initialization, after registering your listener, you can get the localEntries and process the existing ones.
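
The two points above can be sketched with a toy ownership rule. This is a simplification under stated assumptions: the hash-modulo rule in `ownerOf` is hypothetical and is not Hazelcast's actual partitioning scheme, and `StartupSweep` and `keysForNode` are illustrative names. It only shows why a sweep over each node's own keys at startup does not process anything twice:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy model: every key has exactly one owning node. */
final class StartupSweep {
    /** Hypothetical ownership rule: hash the key onto one of nodeCount nodes. */
    static int ownerOf(String key, int nodeCount) {
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    /** The subset of all stored keys that the given node would load and process. */
    static List<String> keysForNode(Set<String> allKeys, int node, int nodeCount) {
        List<String> mine = new ArrayList<>();
        for (String key : allKeys) {
            if (ownerOf(key, nodeCount) == node) {
                mine.add(key);
            }
        }
        return mine;
    }
}
```

Because each key has a single owner, the lists for node 0 and node 1 never overlap and together cover all keys. In a real cluster the equivalent startup step would iterate the node's locally owned keys (for example via `IMap.localKeySet()`) once the listener has been registered.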