Cassandra nodetool repair options

Question

I have a 15-node cluster with RF 3 (using vnodes). We are ingesting data into the 15 nodes from multiple clients. It turns out that one of the nodes has been down for a couple of days and is now almost 200 GB behind; the other nodes have approx. 380 GB.

What sort of nodetool repair would you recommend here? I know that nodetool repair is CPU intensive and that it might affect the rate at which the clients ingest into the cluster. There seem to be several nodetool repair options, such as -snapshot and -par, and I was wondering whether any of them would better suit my current scenario.

I'm trying to run the repair with the least performance hit possible on the cluster.

Thanks, mskh

Aaron · Accepted Answer · 2014-07-09T13:26:14

Unless you have already taken a snapshot to repair from, the -snapshot option won't do you any good.

Do you have multiple datacenters? If so, you could run nodetool repair -local, which repairs your node only from nodes in its local datacenter. This is a good way to repair a node while limiting the impact on overall cluster performance.
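
As a rough sketch of the datacenter-local repair, the command could be assembled like this. The keyspace name my_keyspace and the helper repair_local_cmd are placeholders for illustration, not part of the original answer. The helper only prints the command so it can be reviewed before being run on the node that was down.

```shell
#!/bin/sh
# Sketch only: builds the datacenter-local repair command as a string.
# "my_keyspace" is a placeholder; substitute your own keyspace name.
repair_local_cmd() {
    keyspace="$1"
    # -local restricts the repair to replicas in the node's own datacenter.
    echo "nodetool repair -local $keyspace"
}

# Print the command for review; run it on the node that was down.
repair_local_cmd my_keyspace
```

Leaving the keyspace off would repair all keyspaces, which is likely to take longer and put more load on the cluster.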

Otherwise, Rock's suggestion of repairing only the first partition range (in parallel) is worth trying as well.
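
Assuming that suggestion refers to the -pr (first partitioner range) and -par (parallel) flags, a minimal sketch might look like the following. The keyspace name my_keyspace and the helper repair_primary_range_cmd are hypothetical; as above, the helper only prints the command for review.

```shell
#!/bin/sh
# Sketch only: builds a primary-range, parallel repair command as a string.
# "my_keyspace" is a placeholder; substitute your own keyspace name.
repair_primary_range_cmd() {
    keyspace="$1"
    # -pr  repairs only the first range returned by the partitioner for this node.
    # -par runs the repair on the replicas in parallel rather than one at a time.
    echo "nodetool repair -pr -par $keyspace"
}

# Print the command for review before running it on a node.
repair_primary_range_cmd my_keyspace
```

Because -pr covers only the ranges a node is primary for, it is normally run on each node in turn so that the whole ring is covered.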
