0
votes

I have a Cassandra database used to persist the last hour of a steady stream of messages. The TTL on each row is set to 1 hour. Querying the DB confirms that old records are gone, but disk utilization keeps going up. It sometimes drops a little (due to compaction, I assume), but the overall trend over about a week is growing disk usage until the disk is full, at which point the node stops accepting data.

Killing the process and restarting cleans up a little, but it starts at about 60G disk utilization on about 8-9G of actual data.

Trying to run ./nodetool compact just hangs there.

Where is the disk consumption coming from?

1
Are you sure nodetool compact "just hangs"? It can take a long time to run. You should see progress by running nodetool compactionstats in a different console, and compaction log messages in Cassandra's system log. When it is finished, you will probably see a big drop in used disk space. - Gryphius
How often do you run a repair? - Dirk Lachowski
@Gryphius I'll give it a shot, but I've left it sitting there for about half an hour and it has not come back. - Lirm

1 Answer

2
votes

TTL doesn't mean that your data vanishes from the disk as soon as it expires. What it actually does is create a tombstone, which indicates that the record was deleted. This tombstone has to stick around in case another node did not receive the delete or suffered a network partition. Tombstones are not removed until gc_grace_seconds has elapsed, which is 10 days by default, and even then only when a compaction rewrites the SSTables that contain them. This means your data is going to stay on disk until that point. The delay exists so that you have time to run a repair before the tombstones are finally removed, which keeps dead data from being resurrected from a replica.
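
To make the timing concrete, here is a rough sketch of the model described above. It assumes a row becomes a tombstone when its TTL expires and that the tombstone can only be dropped once the grace period has passed after that. The function name and the constant are made up for illustration, and real Cassandra versions may differ in the exact rule they apply to expired cells, so treat the result as an approximation.

```python
from datetime import datetime, timedelta

# Assumption: the default grace period of 10 days mentioned in the answer.
DEFAULT_GC_GRACE = timedelta(days=10)


def earliest_purge_time(written_at, ttl, gc_grace=DEFAULT_GC_GRACE):
    """Earliest moment a compaction could drop the row, under the simplified
    model: the TTL expires first, then the tombstone waits out the grace period."""
    expires_at = written_at + ttl   # the row stops being returned by queries here
    return expires_at + gc_grace    # the tombstone becomes eligible for removal here


if __name__ == "__main__":
    written = datetime(2024, 1, 1, 0, 0)
    print(earliest_purge_time(written, ttl=timedelta(hours=1)))
```

With a 1-hour TTL and the default grace period, a row written at midnight on the 1st is invisible to queries after 01:00 but, under this model, cannot be purged before 01:00 on the 11th, and only once a compaction actually touches the SSTable holding it. That would be consistent with disk usage growing for about a week while queries only show the last hour.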

http://wiki.apache.org/cassandra/DistributedDeletes