1
votes

I have a single-node Cassandra cluster with around 44 GB of data on it (/var/lib/cassandra/data/my_keyspace). The current storage is 1 TB, and I need to migrate all the data to another VM that will have the same setup (a single-node cluster). Sensors push time-series data to this node every second, so I can't afford any downtime.

Keyspace: CREATE KEYSPACE my_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  127.0.0.1  43.4 GiB   256          100.0%            e0ae36db-f639-430c-91ad-6af3ffb6f906  rack1

After a bit of research, I decided it's best to add the new node to the existing cluster, let the old node stream all the data to it, and decommission the old node once streaming is done.

Source: https://docs.datastax.com/en/archived/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html

  1. Configure the old node as the seed node for the new node
  2. Add the new node to the ring (auto_bootstrap = true)
  3. Once the status is UN for both nodes, run nodetool cleanup on the old node
  4. Decommission the old node

My only concern is whether I will face any data loss. Is this approach appropriate? Please let me know if I am missing anything here.

Thanks

1 Answer

0
votes

First, a disclaimer: running a single node of C* defeats the purpose of a distributed database. The minimal cluster size tends to be 3, so that some nodes can go offline without downtime (I'm sure you've seen this warning before). Now, with that out of the way, let's discuss the process.

  1. Configure old node as seed node for the new node

Yep.

1.5. (Potentially missing step) Verify the consistency level of your queries. I see you're using a replication_factor of 1 for all keyspaces in use, so make sure your queries use a consistency level of ONE.

  2. Add new node to the ring (auto_bootstrap = true)

Sounds good. Make sure you've configured the various ports, listen_address, etc.

  3. Once the status is UN for both nodes,

Once both nodes reach UN, double-check that the client isn't seeing any consistency errors.
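As a rough illustration of the "both nodes are UN" check, here is a minimal Python sketch that parses the text printed by nodetool status. The helper names (node_states, all_up_normal) are made up for this example, and the sample text is the single-node output from the question; it assumes the row layout shown there, where each node row starts with a two-letter state code followed by the address.

```python
import re
from typing import Dict

# Node rows look like "UN  127.0.0.1  43.4 GiB  256 ...".
# First letter: U/D (Up/Down). Second letter: N/L/J/M
# (Normal/Leaving/Joining/Moving), as in the legend printed by nodetool.
NODE_ROW = re.compile(r"^([UD][NLJM])\s+(\S+)")

# Sample output copied from the question (one node only).
SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  127.0.0.1  43.4 GiB   256          100.0%            e0ae36db-f639-430c-91ad-6af3ffb6f906  rack1
"""


def node_states(status_output: str) -> Dict[str, str]:
    """Map each node address to its two-letter state code."""
    states = {}
    for line in status_output.splitlines():
        match = NODE_ROW.match(line.strip())
        if match:
            state, address = match.groups()
            states[address] = state
    return states


def all_up_normal(status_output: str, expected_nodes: int) -> bool:
    """True only if exactly expected_nodes rows are present and all are UN."""
    states = node_states(status_output)
    return len(states) == expected_nodes and all(
        state == "UN" for state in states.values()
    )


print(node_states(SAMPLE))       # {'127.0.0.1': 'UN'}
print(all_up_normal(SAMPLE, 2))  # False: the new node has not joined yet
```

In practice the text would come from running nodetool status on either node, and the check would pass only once the second row appears with UN as well (a joining node shows UJ while it is still streaming).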

3.5. run nodetool cleanup on old node

(Redundant step) You don't need to run nodetool cleanup. You won't care about leftover data on the decommissioned node, since all of its data will have moved to the new node replacing it.

  4. Decommission the old node

Yep.

  5. (Missing step) You'll have to modify the new node to list itself as a seed once you've decommissioned the old node, or it won't be able to restart.
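The seed change in that last step is an edit to the seeds entry under seed_provider in the new node's cassandra.yaml. A minimal sketch of that edit in Python follows; the set_seeds helper and both IP addresses are hypothetical, and it assumes the file contains a single quoted seeds line in the usual SimpleSeedProvider layout. Check it against your own file before relying on it.

```python
import re

# Matches the seeds entry under seed_provider, e.g.:  - seeds: "10.0.0.1"
SEEDS_LINE = re.compile(r'^(\s*-\s*seeds:\s*)"[^"]*"', re.MULTILINE)


def set_seeds(yaml_text: str, seeds: str) -> str:
    """Return yaml_text with the quoted seeds value replaced by `seeds`."""
    new_text, count = SEEDS_LINE.subn(
        lambda m: f'{m.group(1)}"{seeds}"', yaml_text
    )
    if count != 1:
        # Refuse to guess if the layout is not what this sketch assumes.
        raise ValueError(f"expected exactly one seeds entry, found {count}")
    return new_text


# Hypothetical fragment: 10.0.0.1 stands in for the old node,
# 10.0.0.2 for the new node.
EXAMPLE = '''\
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.0.0.1"
'''

updated = set_seeds(EXAMPLE, "10.0.0.2")
print(updated)
```

Editing the file by hand works just as well; the point is only that the new node's seeds value should no longer name the decommissioned node.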