How to restore a Cassandra snapshot after changing the number of nodes in the cluster

Question

Let's assume we had a 3-node cluster with the following nodes: node1, node2, node3. One day we created a snapshot of the entire cluster and copied the snapshot tables from each node to an external backup server.

Some time has passed and the cluster has grown: instead of 3 nodes we now have 5. In addition, one of the original nodes no longer exists, so the cluster now looks as follows: node1, node3, node4, node5, node6.

How do I properly restore the snapshot data into the changed cluster? Am I right that the only way to do it in this case is to use sstableloader?

If so, how do I initiate the restore, given that the snapshots are stored on a backup server where Cassandra is not installed? Do I need to install sstableloader there, or can I launch it remotely?

How fast will sstableloader restore the data?

Marko Švaljek · Accepted Answer · 2017-04-06T20:32:08

Your best bet would be to read up on the following:

http://www.datastax.com/dev/blog/using-the-cassandra-bulk-loader-updated

In short, sstableloader is aware of the topology of the cluster you are pushing the data to, so the data will end up on the correct nodes.
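As a rough illustration, sstableloader is pointed at a directory whose path ends in keyspace/table and is given one or more contact nodes with -d. Here is a minimal Python sketch that only builds those command lines. It assumes a hypothetical backup layout of backup_root/source_node/keyspace/table/ and that the sstableloader tool is available on the machine holding the files; the helper name and paths are made up for the example.

```python
from pathlib import Path


def build_sstableloader_commands(backup_root, contact_hosts):
    """Return one sstableloader argv list per <keyspace>/<table> directory.

    Assumed (hypothetical) layout: <backup_root>/<source_node>/<keyspace>/<table>/
    with the snapshot's SSTable files placed directly inside the table directory.
    """
    root = Path(backup_root)
    hosts = ",".join(contact_hosts)
    commands = []
    # sstableloader infers keyspace and table from the last two path components.
    for table_dir in sorted(root.glob("*/*/*")):
        if table_dir.is_dir():
            commands.append(["sstableloader", "-d", hosts, str(table_dir)])
    return commands


if __name__ == "__main__":
    # Hypothetical backup path and contact nodes from the current cluster.
    for cmd in build_sstableloader_commands("/backup/snapshots", ["node1", "node3"]):
        print(" ".join(cmd))
```

Each printed line could then be run by hand or passed to subprocess.run. The files backed up from every original node (including the one that no longer exists) are streamed the same way, and the loader routes the rows to whichever nodes own them now.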

If for some reason it is more feasible for you to copy the SSTables over to the nodes yourself, you can put them into the correct folders, run a reload (nodetool refresh) or restart the node, and then run a cleanup. Compared to sstableloader, though, it's just not as elegant.
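For the manual route, the per-node commands after copying the files might look roughly like the following. This is a sketch only: it assumes the reload step is done with nodetool refresh and the cleanup with nodetool cleanup, and the keyspace and table names are placeholders.

```python
def manual_restore_commands(keyspace, table):
    """Return the nodetool argv lists to run on a node after copying SSTables in.

    Assumes the SSTable files are already in that node's data directory
    for the given keyspace and table.
    """
    return [
        # Load the newly placed SSTables without restarting the node.
        ["nodetool", "refresh", keyspace, table],
        # Drop data for token ranges this node does not own.
        ["nodetool", "cleanup", keyspace],
    ]


if __name__ == "__main__":
    # Placeholder names; substitute the real keyspace and table.
    for cmd in manual_restore_commands("my_keyspace", "my_table"):
        print(" ".join(cmd))
```

These would have to be repeated on every node that received files, which is part of why this approach is less convenient than sstableloader.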
