Cassandra nodetool has a command called cleanup:

cleanup [keyspace][cf_name]

Triggers the immediate cleanup of keys no longer belonging to this node. This has roughly the same effect on a node that a major compaction does in terms of a temporary increase in disk space usage and an increase in disk I/O. Optionally takes a list of column family names.

My questions are:

  1. When will a node have keys that do not belong to it?
  2. When should I issue a cleanup?
  3. Should I do cleanup regularly (e.g. once per week)?
1 Answer

When will a node have keys that do not belong to it?

When you have added new nodes to the cluster, decreased the replication factor, or moved tokens.

When should I issue a cleanup?

After one of the above operations, if you need to reclaim disk space. There is no harm in delaying it: cleanup has a performance impact, and the only reason to run it is to save disk space.

Should I do cleanup regularly (e.g. once per week)?

No, only if you need to save space after one of the above operations.
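
As a rough sketch of running cleanup after such an operation, the script below walks through the nodes one at a time so that the extra disk I/O is not incurred everywhere at once. The keyspace name `my_keyspace` and the host names are hypothetical placeholders, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: run cleanup node by node after a topology or replication change.
# KEYSPACE and HOSTS are hypothetical placeholders; substitute your own.
KEYSPACE="my_keyspace"
HOSTS="node1 node2 node3"

# Print the commands by default; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

run_cleanup() {
    host="$1"
    if [ "$DRY_RUN" = "1" ]; then
        echo "nodetool -h $host cleanup $KEYSPACE"
    else
        # Drops keys on this node that fall outside the ranges it owns.
        nodetool -h "$host" cleanup "$KEYSPACE"
    fi
}

# Sequential loop: each node is handled only after the previous call returns.
for host in $HOSTS; do
    run_cleanup "$host"
done
```

Only the nodes that gave up ranges (for example, the pre-existing nodes after a new node joins) have anything to clean, so the host list can usually be limited to those.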