
Say we have a Cassandra cluster of 2 nodes. Data with key range [A-D] is inserted into the cluster. Roughly, we can suppose that node 1 stores data with key range [A-B] and node 2 stores data with key range [C-D]. Some time later, we add 2 more nodes. For balancing, partitions should be re-assigned, right? We now expect each node to store data for exactly 1 key. Does Cassandra re-assign the ranges and then move existing data to the new nodes (e.g. existing data with key B from node 1 to node 3)? If so, how?


1 Answer


Cassandra uses vnodes (virtual nodes) by default. Each node does not own one single range (e.g. [A-C]) but many small ones (256 by default in releases before 4.0, 16 since then; set by num_tokens in cassandra.yaml). Depending on your version, the tokens for these ranges are either picked at random or, in newer versions, can be allocated by an algorithm that tries to even out the distribution. This way, if a node goes down or you add a node, its ranges sit next to ranges owned by many other nodes in the cluster, so the burden is shared across the cluster instead of falling on one or two neighbors.
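To make this concrete, here is a rough sketch of a token ring with vnodes. It is a toy model, not Cassandra's implementation: the `Ring` class, `token_for`, and `rebalance_demo` are hypothetical names, md5 stands in for the real partitioner, tokens are always picked at random, and replication is ignored. It only shows which keys change owner when two nodes join a two-node ring.

```python
import bisect
import hashlib
import random

RING_SIZE = 2**64  # illustrative token space, not Cassandra's actual token range


def token_for(key):
    """Map a partition key to a ring position (md5 here, purely for illustration)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Toy token ring: every node owns num_tokens randomly placed tokens."""

    def __init__(self, num_tokens=256, seed=42):
        self.num_tokens = num_tokens
        self.rng = random.Random(seed)  # seeded so runs are repeatable
        self.tokens = []                # sorted token positions
        self.owner = {}                 # token -> node name

    def add_node(self, name):
        """Give the node num_tokens distinct random tokens on the ring."""
        added = 0
        while added < self.num_tokens:
            t = self.rng.randrange(RING_SIZE)
            if t in self.owner:
                continue  # token already taken, draw again
            self.owner[t] = name
            bisect.insort(self.tokens, t)
            added += 1

    def node_for(self, key):
        """A key belongs to the node holding the first token >= the key's token."""
        i = bisect.bisect_left(self.tokens, token_for(key))
        return self.owner[self.tokens[i % len(self.tokens)]]  # wrap around the ring


def rebalance_demo(n_keys=10_000):
    """Place keys on a 2-node ring, add 2 nodes, and report which keys moved."""
    ring = Ring()
    ring.add_node("node1")
    ring.add_node("node2")
    keys = [f"key-{i}" for i in range(n_keys)]
    before = {k: ring.node_for(k) for k in keys}

    ring.add_node("node3")
    ring.add_node("node4")
    after = {k: ring.node_for(k) for k in keys}

    moved = [k for k in keys if before[k] != after[k]]
    return before, after, moved


before, after, moved = rebalance_demo()
print(f"{len(moved)} of {len(before)} keys changed owner")
print("donors:   ", sorted({before[k] for k in moved}))
print("receivers:", sorted({after[k] for k in moved}))
```

In this model a key only ever moves from an old node to a new one, because a new token just splits an existing range; keys never shuffle between node1 and node2. In a real cluster the corresponding step is the new node's bootstrap, during which it streams the data for the ranges it takes over from the nodes that held them. The old nodes keep their copies on disk until you run `nodetool cleanup` on them.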