We have always wondered why one of our clusters shows an Analytics node as owning data. I have edited the IPs, tokens, and host IDs for readability.
% nodetool status
Datacenter: Cassandra
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Owns Host ID Token Rack
UN 172.32.x.x 46.83 GB 18.5% someguid 0 rack1
UN 172.32.x.x 60.26 GB 33.3% anotherguid ranbignumber rack1
UN 172.32.x.x 63.51 GB 14.8% anotherguid ranbignumber rack1
Datacenter: Analytics
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Owns Host ID Token Rack
UN 172.32.x.x 28.91 GB 0.0% someguid 100 rack1
UN 172.32.x.a 30.41 GB 33.3% someguid ranbignumber rack1
UN 172.32.x.x 17.46 GB 0.0% someguid ranbignumber rack1
So does the Analytics node with IP 172.32.x.a actually own data? If so, do we need to back it up? Also, would decommissioning the node move the data back onto the appropriate nodes?
This is the node I am referring to, taken from the Analytics datacenter in the nodetool status output above:
UN 172.32.x.a 30.41 GB 33.3% someguid ranbignumber rack1
Again, the questions (updated with the answers provided below):
- Do we need to back this node up? Answer: YES.
- Should this node have data? Answer: YES, otherwise analytics performance will be impacted.
- If it should not have data, will nodetool decommission move the data back onto the other nodes? Answer: NO, the replication strategy drives this.
Here is the updated output, this time for a specific keyspace:
% nodetool status our_important_keyspace
Datacenter: Cassandra
=====================
Status Address Load Owns (effective)
UN 2 63.16 GB 81.5%
UN 1 47.21 GB 33.3%
UN 3 59.87 GB 85.2%
Datacenter: Analytics
=====================
Status Address Load Owns (effective)
UN 3 17.74 GB 33.3%
UN 2 30.62 GB 33.3%
UN 1 29.21 GB 33.3%
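One way to read the "Owns (effective)" column is to add it up per datacenter. The sketch below is only a rough check, not an official method: it assumes the keyspace replicates per datacenter (for example with NetworkTopologyStrategy), in which case the effective ownership within a datacenter should total roughly 100% times the number of replicas kept there. The percentages are copied from the output above; the function and variable names are made up for illustration.

```python
# Effective ownership per node, copied from the keyspace-level output above.
effective_owns = {
    "Cassandra": [81.5, 33.3, 85.2],
    "Analytics": [33.3, 33.3, 33.3],
}


def implied_replication_factor(percentages):
    """Sum the effective ownership and round to a whole number of replicas.

    Assumption: the total for one datacenter is about 100% per replica
    stored in that datacenter.
    """
    return round(sum(percentages) / 100.0)


implied_rf = {
    dc: implied_replication_factor(owns) for dc, owns in effective_owns.items()
}

for dc, rf in implied_rf.items():
    total = sum(effective_owns[dc])
    print(f"{dc}: total {total:.1f}% -> about {rf} replica(s)")
```

Under that assumption the numbers suggest two replicas in the Cassandra datacenter and one full replica spread across the three Analytics nodes, which would fit the answer that the Analytics nodes really do hold data and need backing up.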
Backing up Analytics today - awesome answer, and probably saved us a TON of pain.