
Say I have two server nodes in one data center, DC1, and two more server nodes in another data center, DC2. There is some network delay between the two data centers.

I run SQL SELECT statements against replicated caches whose write synchronization mode is FULL_SYNC.

At any given time, client nodes are active in only one DC, never both. Say we have two clients in DC1.

So there are 6 nodes in total (2 client nodes and 2 server nodes in DC1, plus 2 server nodes in DC2).

Our use case is as follows:

  1. The 2 clients should query only the 2 server nodes in DC1 and not the other 2 servers in DC2.
  2. All cache writes should be FULL_SYNC across the 2 server nodes in DC1, while DC1-to-DC2 replication should be done in ASYNC mode.
  3. One doubt I have: if the client node's discovery SPI lists only (X, Y) as server node IPs, will queries always go to X and Y even though the entire topology contains X, Y, and Z as server nodes?

Could someone please suggest a solution for this?

Note: I saw that GridGain offers cluster-to-cluster replication, but that is part of the paid version. I am looking for a solution in the community edition.


1 Answer

  3. One doubt I have: if the client node's discovery SPI lists only (X, Y) as server node IPs, will queries always go to X and Y even though the entire topology contains X, Y, and Z as server nodes?

No. The discovery SPI is used only for connecting to the cluster; after that, the client node works with all nodes in the cluster.
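
To see this for yourself, here is a minimal sketch of a client configured with only two addresses in its IP finder. The addresses `10.0.1.1` and `10.0.1.2` stand in for X and Y and are placeholders, as is the class name. Once the client has joined, listing the server nodes should show every server in the topology, not just the two it was given.

```java
import java.util.Arrays;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cluster.ClusterNode;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

public class ClientDiscoveryCheck {
    public static void main(String[] args) {
        // Only X and Y are listed here (placeholder addresses, default discovery port range).
        TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
        ipFinder.setAddresses(Arrays.asList(
            "10.0.1.1:47500..47509",
            "10.0.1.2:47500..47509"));

        TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
        discoverySpi.setIpFinder(ipFinder);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setClientMode(true);
        cfg.setDiscoverySpi(discoverySpi);

        try (Ignite client = Ignition.start(cfg)) {
            // The IP finder list is only an entry point: this prints all
            // server nodes in the topology, including ones not listed above.
            for (ClusterNode node : client.cluster().forServers().nodes())
                System.out.println(node.id() + " " + node.addresses());
        }
    }
}
```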

  2. All the cache queries should be in FULL_SYNC with 2 server nodes in DC1 and DC1-DC2 should be done in ASYNC mode.

This is not possible: a cache can have only one write synchronization mode, and it applies across the whole cluster.
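
For reference, a sketch of where the mode is set. The cache name `myCache`, the key and value types, and the class name are assumptions for illustration. The point is that `setWriteSynchronizationMode` is a property of the cache configuration as a whole, so there is no place to specify a different mode per node or per data center.

```java
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class ReplicatedCacheConfig {
    public static CacheConfiguration<Integer, String> create() {
        // "myCache" is a placeholder cache name.
        CacheConfiguration<Integer, String> ccfg = new CacheConfiguration<>("myCache");

        // Every server node keeps a full copy of the data.
        ccfg.setCacheMode(CacheMode.REPLICATED);

        // One mode for the whole cache: FULL_SYNC waits for all copies,
        // including those in DC2. The alternatives are PRIMARY_SYNC and FULL_ASYNC.
        ccfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);

        return ccfg;
    }
}
```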

  1. 2 clients should query only 2 server nodes in DC1 and not the other 2 servers in DC2.

It's not possible to do this for cache operations, but you can do it for compute operations: you can send a job to a specific node in DC1 that holds a primary or backup copy, and the job will read from the local partition. Note that compute adds some overhead compared to plain cache operations if it is used only for getting entries.
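
A rough sketch of the compute approach, under a few assumptions that are not part of your setup as described: each server node is started with a user attribute `dc` set to `DC1` or `DC2` (via `IgniteConfiguration.setUserAttributes`), the cache is named `myCache` with `Integer` keys and `String` values, and the client uses default discovery. The job runs on one of the DC1 servers and reads the entry from that node's local copy.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CachePeekMode;
import org.apache.ignite.cluster.ClusterGroup;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.lang.IgniteCallable;
import org.apache.ignite.resources.IgniteInstanceResource;

public class Dc1LocalRead {
    /** Job that reads a key from the local copy on whichever node runs it. */
    static class LocalPeekJob implements IgniteCallable<String> {
        // Injected on the server node that executes the job.
        @IgniteInstanceResource
        private transient Ignite ignite;

        private final int key;

        LocalPeekJob(int key) {
            this.key = key;
        }

        @Override public String call() {
            IgniteCache<Integer, String> cache = ignite.cache("myCache"); // placeholder name
            // No network hop: look only at the primary or backup copy held locally.
            return cache.localPeek(key, CachePeekMode.PRIMARY, CachePeekMode.BACKUP);
        }
    }

    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setClientMode(true);

        try (Ignite client = Ignition.start(cfg)) {
            // Assumes servers were started with the user attribute dc=DC1 or dc=DC2.
            ClusterGroup dc1Servers = client.cluster().forServers().forAttribute("dc", "DC1");

            String value = client.compute(dc1Servers).call(new LocalPeekJob(42));
            System.out.println("Value read in DC1: " + value);
        }
    }
}
```

Because the cache is replicated, every DC1 server holds a copy of every key, so it does not matter which node in the group picks up the job.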

So, as you mentioned, the best option here is Data Center Replication, which is available as part of GridGain, because your requirements effectively call for 2 separate clusters.