
I am trying to build an application on ZooKeeper with CuratorFramework. The application must be able to run as a quorum across several nodes. Every instance of the app has an embedded ZooKeeper server and clients. The nodes assemble into a quorum successfully. Every node writes an EPHEMERAL znode such as /workers/active/node1 ("active" is a PERSISTENT znode created by the leader).

When a client was connected to the localhost instance of the ZooKeeper server, ZooKeeper detected a node failure very slowly: the ephemeral node disappeared only after session expiration. So I decided to connect the client of NodeA to the cluster with the connection string "NodeB, NodeC", NodeB with "NodeA, NodeC", and NodeC with "NodeA, NodeB". With this setup the cluster detects a node failure much faster.

I also added a watcher on each node to detect NodeChildren events on /workers/active. This watcher uses a separate CuratorFramework client connected to the localhost ZooKeeper server. I did it this way because a callback is registered only on the server where the client registered it.

The problem is that this solution is not stable and I don't know why. Sometimes everything works correctly, but then either I lose a znode in /workers/active even though all nodes are running, or the state in /workers/active is correct but the NodeChildren callback stops firing even though it worked a few seconds earlier. What am I doing wrong? I have tried everything...


1 Answer


I found a solution.

In my case, the best option is to use the PersistentEphemeralNode recipe from CuratorFramework to register each node.
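
Roughly, the registration could look like the sketch below. It assumes a Curator version that still ships PersistentEphemeralNode (as far as I know, newer releases offer PersistentNode for the same purpose). The class name WorkerRegistration, the connect string, the timeouts and the node name "node1" are placeholders for illustration, not taken from my real code.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.TimeUnit;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.nodes.PersistentEphemeralNode;
import org.apache.curator.retry.ExponentialBackoffRetry;

class WorkerRegistration {

    // Builds a client; it does not connect until start() is called.
    static CuratorFramework newClient(String connectString) {
        return CuratorFrameworkFactory.newClient(
                connectString, new ExponentialBackoffRetry(1000, 3));
    }

    // Registers this worker under /workers/active. The recipe tries to keep
    // the ephemeral znode present across connection and session interruptions.
    static PersistentEphemeralNode register(CuratorFramework client, String nodeName)
            throws Exception {
        PersistentEphemeralNode node = new PersistentEphemeralNode(
                client,
                PersistentEphemeralNode.Mode.EPHEMERAL,
                "/workers/active/" + nodeName,
                nodeName.getBytes(StandardCharsets.UTF_8));
        node.start();
        // Wait until the znode exists or the (arbitrary) timeout elapses.
        if (!node.waitForInitialCreate(10, TimeUnit.SECONDS)) {
            node.close();
            throw new IllegalStateException("znode was not created in time");
        }
        return node;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical connect string; replace it with your own ensemble.
        CuratorFramework client = newClient("nodeA:2181,nodeB:2181,nodeC:2181");
        client.start();
        PersistentEphemeralNode node = register(client, "node1");
        System.out.println("registered at " + node.getActualPath());
        Thread.sleep(Long.MAX_VALUE); // keep the session, and so the znode, alive
    }
}
```

The znode stays registered only while the PersistentEphemeralNode instance is open, so keep a reference to it for the lifetime of the worker and call close() on shutdown.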

For callbacks that detect added or removed nodes, it is best to use PathChildrenCache from the CuratorFramework recipes and attach the callback to it.
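
A minimal sketch of such a listener follows. It assumes the /workers/active path from the question. The class name ActiveWorkersWatcher, the describe helper and the connect string are made up for the example. As far as I know, recent Curator releases deprecate PathChildrenCache in favor of CuratorCache, so check which one your version recommends.

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.cache.PathChildrenCache;
import org.apache.curator.framework.recipes.cache.PathChildrenCacheEvent;
import org.apache.curator.framework.recipes.cache.PathChildrenCacheListener;
import org.apache.curator.retry.ExponentialBackoffRetry;

class ActiveWorkersWatcher {

    // Turns a cache event into a log line; connection events carry no data.
    static String describe(PathChildrenCacheEvent event) {
        String path = event.getData() == null ? "-" : event.getData().getPath();
        return event.getType() + " " + path;
    }

    // Starts a cache on /workers/active and logs added and removed children.
    static PathChildrenCache watch(CuratorFramework client) throws Exception {
        PathChildrenCache cache = new PathChildrenCache(client, "/workers/active", true);
        PathChildrenCacheListener listener = (c, event) -> {
            switch (event.getType()) {
                case CHILD_ADDED:
                case CHILD_REMOVED:
                    System.out.println(describe(event));
                    break;
                default:
                    // Updates and connection state changes are ignored here.
                    break;
            }
        };
        cache.getListenable().addListener(listener);
        cache.start();
        return cache;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical connect string; replace it with your own ensemble.
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "nodeA:2181,nodeB:2181,nodeC:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();
        PathChildrenCache cache = watch(client);
        Thread.sleep(Long.MAX_VALUE); // keep the process alive to receive events
    }
}
```

Unlike a raw watcher, the cache re-registers its watches itself, so the listener does not have to set a new watch after every event.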