I have a Cassandra database used to persist the last hour of a steady stream of messages. The TTL on each row is set to 1 hour. Querying the DB confirms that old records are gone, but disk utilization keeps going up. It sometimes drops a little (due to compaction, I assume), but the overall trend over about a week is growing disk usage until the disk is full, at which point Cassandra stops accepting data.
Killing the process and restarting cleans up a little, but it starts at about 60G disk utilization on about 8-9G of actual data.
Running ./nodetool compact just hangs.
Where is the disk consumption coming from?
nodetool compactionstats in a different console and compaction log messages in Cassandra's system log. When it is finished, you will probably see a big drop in used disk space. - Gryphius
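To see where the space is going, and whether a compaction really frees it, one option is to total the file sizes under the data directory before and after. The following is only a rough sketch, not a Cassandra tool: the function name disk_usage_by_kind is made up for illustration, and it assumes that snapshot files live under directories named "snapshots" inside the data directory. Everything else is counted as live data.

```python
import os
from collections import defaultdict


def disk_usage_by_kind(data_dir):
    """Sum file sizes (bytes) under data_dir, split into 'snapshots' and 'live'.

    Assumption: any file below a directory named 'snapshots' is a snapshot
    file; every other file is treated as live data.
    """
    totals = defaultdict(int)
    for root, _dirs, files in os.walk(data_dir):
        # Path components of this directory relative to the data directory.
        parts = os.path.relpath(root, data_dir).split(os.sep)
        kind = "snapshots" if "snapshots" in parts else "live"
        for name in files:
            try:
                totals[kind] += os.path.getsize(os.path.join(root, name))
            except OSError:
                # A file can disappear mid-walk, e.g. when compaction removes it.
                pass
    return dict(totals)
```

Calling disk_usage_by_kind on the node's data directory (the path depends on the installation) once before running the compaction and once after it finishes should show how much of the 60G is live files versus snapshots, and how much is actually reclaimed.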