I have an old Cassandra cluster that I want to decommission, and I want to transfer data from only a few selected tables to a new cluster that I have created. I tried Cassandra's COPY command on a table with about 15 million rows (approximately 20 columns per row). When I import the data from the CSV file into the same table in the new cluster, I constantly get this response:

Failed to import 20 rows: WriteTimeout - Error from server: code=1100 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}, will retry later, attempt 1 of 5

Apparently, this approach is not working. Is there a way to stream only some tables from one cluster to another? Note that although we have millions of rows, the data is not that large: the biggest table is about 2.5 GB.

The keyspace is currently configured to use SimpleStrategy. Will using NetworkTopologyStrategy help? I should point out that I only want to stream data from a few tables, leaving the other tables out.

2 Answers

0
votes

I would suggest using sstableloader for this job. Just FYI, you can also use nodetool snapshot to make copies of only the tables you want and scp them wherever you need them.
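To make that concrete, here is a rough sketch of the snapshot plus sstableloader route for a single table. It only prints the commands rather than running them, so you can review and adapt them first. The keyspace, table, tag, host name, staging path, and data directory are all placeholders I made up (the data directory shown is a common default, not necessarily yours), and the table schema is assumed to already exist on the new cluster, since sstableloader streams data but does not create tables.

```shell
#!/usr/bin/env bash
# Sketch only: prints a per-table transfer plan, does not execute anything.
# Every value below is a placeholder / assumption, adjust to your environment.
KEYSPACE="my_keyspace"
TABLE="my_table"
TAG="migrate_${TABLE}"                 # hypothetical snapshot tag
DATA_DIR="/var/lib/cassandra/data"     # assumed data directory on the old nodes
STAGING="/tmp/sstable_staging"         # hypothetical staging directory
NEW_HOST="new-cluster-node"            # hypothetical contact point in the new cluster

plan_table_transfer() {
  # 1. On each old-cluster node: snapshot just this table.
  echo "nodetool snapshot -t ${TAG} -cf ${TABLE} ${KEYSPACE}"
  # 2. Stage the snapshot files under <keyspace>/<table>, because
  #    sstableloader derives the target keyspace and table from the path.
  echo "mkdir -p ${STAGING}/${KEYSPACE}/${TABLE}"
  echo "cp ${DATA_DIR}/${KEYSPACE}/${TABLE}-*/snapshots/${TAG}/* ${STAGING}/${KEYSPACE}/${TABLE}/"
  # 3. Stream the staged SSTables into the new cluster.
  echo "sstableloader -d ${NEW_HOST} ${STAGING}/${KEYSPACE}/${TABLE}"
}

plan_table_transfer
```

Repeat the plan for each table you want to move and for each node of the old cluster, since every node holds only part of the data.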

On another note, it is never a good idea to use SimpleStrategy in production. NetworkTopologyStrategy is a good alternative.
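Since the new cluster is being set up from scratch, one option is to create the keyspace there with NetworkTopologyStrategy from the start. The sketch below only builds and prints the CQL statement. The keyspace name, datacenter name, and replication factor are placeholders; the datacenter name has to match what `nodetool status` reports on the new cluster.

```shell
#!/usr/bin/env bash
# Sketch only: builds a CREATE KEYSPACE statement using NetworkTopologyStrategy.
# Arguments: keyspace name, datacenter name, replication factor (all placeholders).
create_keyspace_stmt() {
  printf "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = {'class': 'NetworkTopologyStrategy', '%s': %s};" \
    "$1" "$2" "$3"
}

# Print the statement for review; it could then be passed to cqlsh -e on the new cluster.
create_keyspace_stmt "my_keyspace" "datacenter1" 3
echo
```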

0
votes

I have successfully used the strategy you are using for copying data from one cluster to another.

In general, restoring from a snapshot is recommended. But when the goal is not to restore all the data to a new cluster, only to transfer a few not-so-big tables, COPY TO on the old cluster followed by COPY FROM on the new one is a simple and effective strategy.

Stick to your strategy and focus only on the error you are getting.

I would suggest trying a smaller batch size. The "Failed to import 20 rows" in your error matches the default MAXBATCHSIZE of 20.

  • cqlsh "$host" -e "COPY $keyspace.$table FROM '${file}' WITH MAXBATCHSIZE=1"
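If a smaller batch size alone does not help, the import can be throttled further. Below is a rough sketch of the full export and import round trip. It only prints the cqlsh invocations. The host names, file path, and numeric values are illustrative guesses on my part, and INGESTRATE and NUMPROCESSES are COPY FROM options that depend on your cqlsh version, so check HELP COPY in your cqlsh before relying on them.

```shell
#!/usr/bin/env bash
# Sketch only: builds COPY statements and prints the cqlsh commands.
# Hosts, paths and tuning values are placeholders, not recommendations.

# Arguments: keyspace, table, csv path
copy_to_stmt() {
  printf "COPY %s.%s TO '%s' WITH HEADER=true" "$1" "$2" "$3"
}

# Arguments: keyspace, table, csv path, max batch size, rows/sec, worker processes
copy_from_stmt() {
  printf "COPY %s.%s FROM '%s' WITH HEADER=true AND MAXBATCHSIZE=%s AND INGESTRATE=%s AND NUMPROCESSES=%s" \
    "$1" "$2" "$3" "$4" "$5" "$6"
}

OLD_HOST="old-cluster-node"   # hypothetical
NEW_HOST="new-cluster-node"   # hypothetical
CSV_FILE="/tmp/my_table.csv"  # hypothetical

# Export from the old cluster, then import into the new one with throttling.
echo "cqlsh ${OLD_HOST} -e \"$(copy_to_stmt my_keyspace my_table "${CSV_FILE}")\""
echo "cqlsh ${NEW_HOST} -e \"$(copy_from_stmt my_keyspace my_table "${CSV_FILE}" 5 5000 2)\""
```

The idea is to start with conservative values and raise them until timeouts reappear, rather than guessing the highest rate the new cluster can sustain.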