0
votes

We are trying to replicate the WebSphere Traditional (5/6/7/8/9) behaviour about session persistance for servlets and http, but with Hazelcast and Tomcat. Let me explain... WebSphere, even when configured as client to a replication domain, keeps a local register of session data. And this local register works fine even if the server processes that should keep replicated data are shutdown from the very first moment. That is, you start the client, and session persistence works within the servlet container. Obviously, you cannot expect to recover your session in another servlet container if the first one crashes, but your applications work anyway.

On the other hand, Hazelcast client on Tomcat containers expect the Hazelcast server (at least one member of the cluster) to be up and running to initialize. If no cluster member is available, initialization fails, and ... web applications in the Tomcat servlet container do not start right. They won't answer any request.

Furthermore, once initialization fails, only way to recover is to shutdown and re-start the tomcat web containers (once a hazelcast cluster member is online).

This behaviour is a bit harsh on system administrators: no one can guarantee that a backup service as distributed session persistence is online all time. That means that launching a Tomcat client becomes a risky task, with a single point of failure by design, which is undesirable.

Now, maybe I overlooked something, maybe I got something wrong. So, ¿Did someone ever managed to start a Hazelcast client without servers, and how? For us, the difference is decisive: if we cannot make the web container start with the hazelcast server offline, then we must keep going on with WebSphere.

We have been trying it on a CentOS 7.5 on Virtual Box 5.2.22, and our Tomcat version is 8.5. Hazelcast client and server is 3.11.1/2.

<group>
    <name>Integracion</name>
    <password></password>
</group>

<network>
  <cluster-members>
    <address>hazelcastsrv1/address>
    <address>hazelcastsrv2</address>
  </cluster-members>
</network>

Sadly, we expect exactly what we get: the reading of the Hazelcast manual suggest that offline servers won't allow tomcat to serve applications. But we cannot beleive what we read, because it makes the library unsafe in a distributed context. We expect to be wrong, and that there are good news around the corner.

2

2 Answers

0
votes

Hazelcast is not "a single point of failure by design". The design is to avoid a single point of failure. Data is mirrored across the nodes by default.

It's a data grid, you run as many nodes as capacity and resilience requires, and they cluster together.

If you need 3 nodes to be up for successful operations, and also anticipate that 1 might go down, then you need to run 4 in total. Should that 1 failure happen, you have a cluster surviving that is big enough.

0
votes

Power-on/Power-off order is not relevant in Hazelcast, as long as you are providing remaining nodes, during power-off, enough time to let repartitioning complete. For example, in a 4 nodes cluster, if you take out 1 node and give the other 3 room to complete repartitioning then you dont loose the data. If you take out 2 nodes together then the cluster will be without the data whose backup was stored on 1 of the 2 nodes you took out.

For starting up, the startup sequence is not relevant as each node owns certain set of partitions that are determined based on consistent hashing. And this ownership continues to change even if there are nodes leaving/joining a running cluster.