We currently have a 2-node Cassandra cluster. We want to add 4 more nodes to the cluster, using the rack feature. The target topology will be:

  • node-01 (Rack1)
  • node-02 (Rack1)
  • node-03 (Rack2)
  • node-04 (Rack2)
  • node-05 (Rack3)
  • node-06 (Rack3)

We want to use different racks, but the same DC.
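
For reference, with GossipingPropertyFileSnitch each node declares its position in cassandra-rackdc.properties; a minimal sketch for one of the new nodes, assuming we keep the default DC name datacenter1:

    # cassandra-rackdc.properties on node-03, for example
    dc=datacenter1
    rack=Rack2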

For now, all keyspaces use SimpleStrategy with a replication factor of 1. My plan to grow from a 2-node to a 6-node cluster is shown below:

  1. Change endpoint_snitch to GossipingPropertyFileSnitch.
  2. Alter the keyspaces to NetworkTopologyStrategy with replication 'datacenter1': '3'.
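
For step 2, the statement would look roughly like this, with my_keyspace standing in for each of our application keyspaces and datacenter1 for the DC name the cluster reports:

    ALTER KEYSPACE my_keyspace
      WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3'};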

According to the docs, when adding a new DC to an existing cluster, we must alter the system keyspaces as well. But in our case we are changing only the snitch and the keyspace strategy, not the datacenter. Or should I change the system keyspaces' strategy and replication factor too, when adding more nodes and changing the snitch?

Will the racks really be used? Is it an AZ in AWS, or separate physical racks? What is the current configured snitch? – Alex Ott
No, they're physical racks; we currently use SimpleSnitch. – aktobe
If they aren't physical racks, why do you need them? You could run into more problems when extending the cluster next time, as you'll need to add the same number of nodes to each rack. – Alex Ott

1 Answer

First, I would change the endpoint_snitch to GossipingPropertyFileSnitch on one node and restart it. You need to make sure that approach works first. Typically, you cannot (easily) change the logical datacenter or rack names on a running cluster. Technically you're not doing that here, but SimpleStrategy may be doing some things under the hood to abstract away datacenter/rack awareness, so it's a good idea to test it.
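
As a rough sketch of that per-node change (file paths and the service name will vary by install; the systemctl unit name here is an assumption):

    # cassandra.yaml
    endpoint_snitch: GossipingPropertyFileSnitch

    # flush memtables and stop accepting writes, then restart the node
    nodetool drain
    sudo systemctl restart cassandra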

If it works, make the change and restart the other node, as well. If it doesn't work, you may need to add 6 new nodes (instead of 4) and decommission the existing 2 nodes.

Or should I change the system keyspaces strategy and replication factor too?

Yes, you should set the same keyspace replication definition on the following keyspaces: system_auth, system_traces, and system_distributed.
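
A sketch of those changes, assuming the DC name datacenter1 and a replication factor of 3:

    ALTER KEYSPACE system_auth
      WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3'};
    ALTER KEYSPACE system_traces
      WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3'};
    ALTER KEYSPACE system_distributed
      WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3'};

After altering them, run a repair on each node (for example, nodetool repair system_auth) so the existing data is actually streamed to the new replicas.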

Consider this situation: if one of your 2 nodes crashes, you won't be able to log in as any users whose authentication data lives only on that node in the system_auth keyspace. So it is very important to ensure that system_auth is replicated appropriately.

I wrote a post on this some time ago (updated in 2018): Replication Factor to use for system_auth

I also recommend the same approach for system_traces and system_distributed, as future node additions, replacements, and repairs may fail if valid token ranges for those keyspaces cannot be located. In short, treating them the same way prevents potential problems down the road.

Edit 20200527:

Do I need to run nodetool cleanup on the old cluster nodes after the snitch and keyspace topology changes? According to the docs "yes," but only on the old nodes?

You will need to run it on every node except for the very last one added. The last node is the only one guaranteed to have only data matching its token range assignments.
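
Something like this on each of those nodes, once all six have joined the cluster (cleanup rewrites SSTables, so run it one node at a time):

    # removes data for token ranges this node no longer owns
    nodetool cleanup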

"Why?" you may ask. Consider the total percentage ownership as the cluster incrementally grows from 2 nodes to 6. If you bump the RF from 1 to 2 (run a repair), and then from 2 to 3 and add the first node, you have a cluster with 3 nodes and 3 replicas. Each node then has 100% data ownership.

That ownership percentage gradually decreases as each node is added, down to 50% when the 6th and final node is added. But even though all nodes will have ownership of 50% of the token ranges:

  • The first 3 nodes will still actually have 100% of the data set, an extra 50% beyond what they should own.
  • The fourth node will still have an extra 25% (the 3/4 it took on when it joined, minus its final 1/2 share).
  • The fifth node will still have an extra 10% (3/5 minus 1/2).

Therefore, the sixth and final node is the only one which will not contain any more data than it is responsible for.
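
If you want to verify the numbers, nodetool status only reports meaningful ownership when given a keyspace; my_keyspace below is a placeholder:

    # the "Owns" column shows effective ownership for that keyspace
    nodetool status my_keyspace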