
Our application runs on a Cassandra cluster (version 2.0.3) of ten nodes across two data centers. DC1 (5 nodes) is the local DC that serves data to users, and DC2 (5 nodes) is a backup DC used only for recovery.

We have now added a new data center, DC3, which is another backup DC with 5 nodes. Currently, DC1 and DC2 each have a replication factor of 2. We are going to alter our existing keyspaces so that DC3 also has a replication factor of 2. Any DC1 node can communicate with any DC2 node and any DC3 node, but communication between DC2 nodes and DC3 nodes is not established.

That is, DC1 is connected to DC2 and DC3.
DC2 is connected to DC1 alone but not connected to DC3.
DC3 is connected to DC1 alone but not connected to DC2.

  • If we run "nodetool status" from any DC1 node, the nodes of all three DCs show the status "UN" (up and normal).
  • If we run "nodetool status" from any DC2 node, all DC1 and DC2 nodes show as "UN", but all DC3 nodes show as "DN" (down and normal).
  • If we run "nodetool status" from any DC3 node, all DC1 and DC3 nodes show as "UN", but all DC2 nodes show as "DN".

We have not yet altered the keyspace or run nodetool rebuild (with DC1) on the DC3 nodes.

Please clarify,

1) We are going to issue the "alter keyspace" command from a DC1 node. Since there is no communication between DC2 and DC3, will the "alter keyspace" run into any issues? Or will it be applied correctly to the entire cluster despite the lack of connectivity between DC2 and DC3?

2) To rebuild the data on all the nodes of the new DC3, we are going to run "nodetool rebuild" (with DC1) on every node of DC3. We believe the data streaming will take place between DC1 and DC3 alone. With DC2 disconnected from DC3, can we still execute the rebuild on every DC3 node?


1 Answer


Since DC1 has connectivity to both DC2 and DC3, and the replication factor is set to 2, data written to DC1 is automatically copied to DC2 and DC3 according to your keyspace definition.

1) When you alter a keyspace, for example like this:

ALTER KEYSPACE "YourKeyspace" WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 3, 'datacenter2' : 2 };

OR

ALTER KEYSPACE "YourKeyspace" WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 3, 'datacenter3' : 2 };

the outcome depends on which data centers you name and the replication factor you set for each of them in the ALTER KEYSPACE command.

Even with no connectivity between DC2 and DC3, you decide which data centers the keyspace replicates to through the replication settings you give in the ALTER KEYSPACE command.
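For the setup described in the question, the statement might look like the sketch below. It assumes the data center names reported by "nodetool status" are exactly DC1, DC2 and DC3, and it reuses the placeholder keyspace name "YourKeyspace"; substitute your own names. Because ALTER KEYSPACE replaces the whole replication map rather than merging into it, the existing data centers are listed again next to the new one.

```cql
-- Sketch only: DC1, DC2, DC3 and "YourKeyspace" are assumed names.
-- The replication map is replaced as a whole, so every data center
-- that should hold replicas is listed, not just the new one.
ALTER KEYSPACE "YourKeyspace" WITH REPLICATION = {
  'class' : 'NetworkTopologyStrategy',
  'DC1' : 2,
  'DC2' : 2,
  'DC3' : 2
};
```

Leaving a data center out of the map would tell Cassandra to stop keeping replicas there, so it is worth checking the names and factors before running the statement.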

2) Running nodetool rebuild on each node of DC3 is the correct approach for copying data from DC1. It will not be affected by DC2 at all. To read more about this, see the following DataStax link: https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsRebuild.html