0 votes

We had a Cassandra cluster with 2 nodes in the same datacenter and a keyspace "newts" with a replication factor of 2. When I ran nodetool status I could see that the load was roughly the same on the two nodes, with each node owning 100%.

I went ahead and added a third node, and I can see all three nodes in the nodetool status output. Since I now have three nodes, I increased the replication factor to three and ran "nodetool repair" on the third node. However, when I now run nodetool status, the load differs between the three nodes, yet each node still owns 100%. How can this be, and is there something I'm missing here?
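For reference, the steps were roughly as follows (the ALTER KEYSPACE statement is a sketch; SimpleStrategy is an assumption, as the keyspace may instead use NetworkTopologyStrategy):

-- in cqlsh: raise the replication factor of the "newts" keyspace to 3
ALTER KEYSPACE newts WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

# on the new (third) node: repair so it receives the data it is now a replica for
nodetool -u cassandra -pw cassandra repair newts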

nodetool -u cassandra -pw cassandra status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens       Owns (effective)  Host ID                               Rack
UN  84.19.159.94  38.6 GiB   256          100.0%            2d597a3e-0120-410a-a7b8-16ccf9498c55  rack1
UN  84.19.159.93  42.51 GiB  256          100.0%            f746d694-c5c2-4f51-aa7f-0b788676e677  rack1
UN  84.19.159.92  5.84 GiB   256          100.0%            8f034b7f-fc2d-4210-927f-991815387078  rack1

nodetool status newts output:

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens       Owns (effective)  Host ID                               Rack
UN  84.19.159.94  38.85 GiB  256          100.0%            2d597a3e-0120-410a-a7b8-16ccf9498c55  rack1
UN  84.19.159.93  42.75 GiB  256          100.0%            f746d694-c5c2-4f51-aa7f-0b788676e677  rack1
UN  84.19.159.92  6.17 GiB   256          100.0%            8f034b7f-fc2d-4210-927f-991815387078  rack1
Which version of Cassandra? And what is the output of "nodetool status newts"? Did you run "nodetool cleanup" after adding the nodes? – Mandraenke
After all new nodes are running, run nodetool cleanup on each of the previously existing nodes to remove the keys that no longer belong to those nodes. Wait for cleanup to complete on one node before running nodetool cleanup on the next node. – Rocherlee
Cassandra 3.11.3, cqlsh 5.0.1. The nodetool status newts output has been added to the OP. I did not run nodetool cleanup; I thought that was only needed when removing nodes? – nillenilsson
@Rocherlee I ran nodetool cleanup on the two previously existing nodes; the command finished within 10 seconds on each node, and the output of nodetool status did not change. – nillenilsson

2 Answers

1 vote

Since you added a node, there are now three nodes, and you increased the replication factor to three: each node holds a full copy of your data and therefore owns 100% of it.
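You can verify the replication factor behind that 100% ownership directly from the schema (a quick check; the system_schema tables exist in Cassandra 3.x):

cqlsh> SELECT keyspace_name, replication FROM system_schema.keyspaces WHERE keyspace_name = 'newts';

With replication_factor = 3 and three nodes, every token range is replicated to all three nodes, so "Owns (effective)" is 100% on each of them.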

The differing "Load" values can result from not running nodetool cleanup on the two old nodes after adding the third node - old data in their SSTables is not removed when the node is added, only later by a cleanup and/or compaction:

Load - updates every 90 seconds. The amount of file system data under the cassandra data directory after excluding all content in the snapshots subdirectories. Because all SSTable data files are included, any data that is not cleaned up (such as TTL-expired cells or tombstoned data) is counted.

(from https://docs.datastax.com/en/cassandra/3.0/cassandra/tools/toolsStatus.html)
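So to reclaim the extra space on the two older nodes, a cleanup pass along these lines should help (a sketch, reusing the credentials shown in the question; run it on one node at a time and wait for it to finish before moving on):

# on 84.19.159.93, then on 84.19.159.94
nodetool -u cassandra -pw cassandra cleanup newts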

0 votes

Run nodetool repair on all 3 nodes, then run nodetool cleanup one by one on the existing nodes, and restart the nodes one after another. That worked for me.
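A minimal sketch of that sequence (the restart command is an assumption for a systemd-managed install; adjust it to however Cassandra is run on your hosts):

# 1) repair the keyspace on each of the three nodes
nodetool -u cassandra -pw cassandra repair newts

# 2) cleanup on the pre-existing nodes, one node at a time
nodetool -u cassandra -pw cassandra cleanup newts

# 3) rolling restart: flush and stop accepting connections, then restart, one node after another
nodetool -u cassandra -pw cassandra drain
sudo systemctl restart cassandra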