My table uses TimeWindowCompactionStrategy (TWCS), and for some reason I have a lot of sstables that contain nothing but tombstones.

When I run a manual compaction on a single sstable, it does not get removed. If I run a repair, it joins all the sstables into a single one, which breaks TWCS.

According to the documentation on the nodetool scrub command:

Scrub automatically discards broken data and removes any tombstoned rows that have exceeded gc_grace period of the table.

Will this join all the sstables?

1 Answer

Short answer: no, scrub does not join the sstables.

Long answer: keep reading.

I checked the code in Cassandra 3.11.2; the code is similar in 3.0 and 2.2.

The sstables are scrubbed in parallel, using the compaction threads, each thread scrubbing one sstable.

As you can see in ColumnFamilyStore.java, the scrub command is run using the CompactionManager threads.

The interesting function to inspect is parallelAllSSTableOperation. All live sstables belonging to the table (excluding the ones marked as suspect, for instance because of exceptions during compaction) are marked as compacting, all compactions running on that table are paused, and the operation is executed against each sstable, in parallel.
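To make the "one operation per sstable" shape concrete, here is a minimal sketch in plain Java. It is not the Cassandra source: the class `ParallelOpSketch`, the `SSTable` stand-in and the method `runOnEach` are hypothetical names I made up for illustration, and the marking-as-compacting and pausing steps are left out. It only shows that each non-suspect input is handed to its own task and yields its own result, so nothing is merged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Simplified model only; all names here are hypothetical, not Cassandra APIs.
class ParallelOpSketch {
    // Stand-in for an sstable: a name plus a "suspect" flag.
    static final class SSTable {
        final String name;
        final boolean suspect;
        SSTable(String name, boolean suspect) {
            this.name = name;
            this.suspect = suspect;
        }
    }

    // Runs `op` once per non-suspect sstable, each in its own task,
    // and returns one result per processed sstable, in input order.
    static List<String> runOnEach(List<SSTable> live, int threads,
                                  Function<SSTable, String> op) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (SSTable s : live) {
                if (s.suspect) {
                    continue; // suspect sstables are excluded
                }
                futures.add(pool.submit(() -> op.apply(s)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // wait for every task to finish
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<SSTable> live = Arrays.asList(
            new SSTable("a", false),
            new SSTable("b", true),
            new SSTable("c", false));
        // Two non-suspect inputs give two separate outputs.
        System.out.println(runOnEach(live, 2, s -> s.name + "-scrubbed"));
    }
}
```

The point of the sketch is the cardinality: the number of outputs equals the number of non-suspect inputs, which is why the time windows of TWCS are not collapsed.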

In the case of scrub, the operation is scrubOne, which calls Scrubber.scrub(). This obsoletes the old sstable and creates a new sstable that contains the live rows.
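As a rough illustration of what happens to a single sstable, the following sketch models the rewrite as a filter, following the documentation quoted in the question (tombstones past gc_grace are dropped). It is an assumption-laden toy, not the logic of Scrubber.scrub(): the `Row` type, the `-1` marker for live rows and the `scrubOne` signature are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model only; the real Scrubber works on on-disk partitions.
class ScrubOneSketch {
    // Hypothetical row: deletedAtSeconds < 0 means the row is live.
    static final class Row {
        final String key;
        final long deletedAtSeconds;
        Row(String key, long deletedAtSeconds) {
            this.key = key;
            this.deletedAtSeconds = deletedAtSeconds;
        }
    }

    // Builds the contents of the replacement sstable for ONE old sstable:
    // live rows are kept, tombstones older than gc_grace are dropped.
    static List<Row> scrubOne(List<Row> old, long nowSeconds, long gcGraceSeconds) {
        List<Row> fresh = new ArrayList<>();
        for (Row r : old) {
            boolean live = r.deletedAtSeconds < 0;
            boolean purgeable = !live
                && r.deletedAtSeconds + gcGraceSeconds < nowSeconds;
            if (!purgeable) {
                fresh.add(r);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        List<Row> old = Arrays.asList(
            new Row("a", -1),  // live row
            new Row("b", 0),   // tombstone, long past gc_grace
            new Row("c", 95)); // tombstone, still within gc_grace
        for (Row r : scrubOne(old, 100, 10)) {
            System.out.println(r.key);
        }
    }
}
```

In this model an input that holds only expired tombstones produces an empty result, while the rows of other sstables are never touched.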

At the end of parallelAllSSTableOperation, the list of sstables marked as compacting should be empty and the operation is successful. No join of sstables is performed.

So you can see that the scrub tool is invasive: it obsoletes the old sstables, discarding tombstones and keeping the live rows in new sstables, one new sstable per old one.

I hope this helps and I didn't miss anything :).