We had a 6-node cluster with a replication factor of 3. We added 4 new nodes simply by starting them up and letting them join the cluster, then restarted all nodes to update the seed list. Every node now lists all the other seeds, and the replication factor is still 3. Nodetool status shows all nodes as UN (the new nodes reached UN status quickly), and describecluster shows all nodes on the same schema. Nodetool status also shows that the old nodes hold a lot of data and the new nodes very little, presumably just the new data coming in. During the update we did not add the auto_bootstrap parameter to cassandra.yaml; as far as I know, its default value is true. Token ranges have been redistributed so that each node owns about 10% of the range. Sorry, but I'm not able to copy-paste or screenshot the output.
After adding the nodes everything seemed fine, but then I noticed that some data is missing from 3 column families. More may be missing, but that is all I have spotted so far.
The end goal is to migrate all the data from the 6 old nodes to the 4 new nodes and then decommission the old nodes.
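To make the mismatch concrete, here is a rough back-of-the-envelope sketch of how much data I expected each node to hold. It assumes tokens are spread evenly and that every keyspace uses replication factor 3; the function names are just mine for illustration, not anything from Cassandra.

```python
def raw_ownership(num_nodes: int) -> float:
    """Fraction of the token ring each node owns, assuming evenly spread tokens."""
    return 1.0 / num_nodes


def effective_ownership(num_nodes: int, replication_factor: int) -> float:
    """Fraction of all data each node should store once replicas are counted."""
    return min(replication_factor, num_nodes) / num_nodes


# Compare the cluster before and after the 4 new nodes joined (RF = 3 throughout).
for label, nodes in (("before, 6 nodes", 6), ("after, 10 nodes", 10)):
    raw = raw_ownership(nodes)
    effective = effective_ownership(nodes, 3)
    print(f"{label}: raw {raw:.0%}, effective {effective:.0%}")
```

Under these assumptions each of the 10 nodes should end up storing roughly 30% of the data, which is nowhere near what the new nodes show.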
Questions:
- If auto_bootstrap is true, shouldn't the data be streamed from the other nodes to the new nodes? Will a node stay in UJ status until that is completed? My new nodes went to UN very quickly, so it doesn't look like any data was streamed to them.
- Why is my data missing? Shouldn't queries find the nodes that hold the data and read it from there?
- Most importantly, how do I get the data back? It must still be somewhere on disk. The closest answer I could find is:
You should perform nodetool rebuild on the new nodes after you add them with auto_bootstrap: false
But that applies to the case where auto_bootstrap is false. Would rebuild or repair help here?
- What is the best way to proceed to get all the data onto the 4 new nodes and decommission the 6 old nodes? My plan was to decommission the old nodes one by one, redistributing the data that way.
Cassandra version: 2.0.17. Client: Astyanax, latest version (I think it's 3.90).