Desired behaviour
I'm trying to configure Cassandra CDC so that commit log segments are flushed periodically to the cdc_raw directory (say, every 10 seconds).
In the documentation at http://abiasforaction.net/apache-cassandra-memtable-flush/ and https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/config/configCDCLogging.html I found:
memtable_flush_period_in_ms – This is a CQL table property that specifies the number of milliseconds after which a memtable should be flushed. This property is specified on table creation.
and
Upon flushing the memtable to disk, CommitLogSegments containing data for CDC-enabled tables are moved to the configured cdc_raw directory.
Putting those together, I would think that by setting memtable_flush_period_in_ms: 10000, Cassandra would flush its CDC changes to disk every 10 seconds, which is what I want to accomplish.
My configuration
Based on the above, I would expect the memtable to be flushed, and its commit log segments to be moved to the cdc_raw directory, every 10 seconds. I'm using the following configuration:
cassandra.yaml:
cdc_enabled: true
commitlog_segment_size_in_mb: 1
commitlog_total_space_in_mb: 2
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
table configuration:
memtable_flush_period_in_ms = 10000
cdc = true
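For reference, the two table properties above can be set with a CQL statement along these lines. This is only a sketch: my_keyspace.my_table is a placeholder name, and only the two option values come from my configuration.

```cql
-- Placeholder keyspace/table names (hypothetical); the two options
-- are the table properties listed in the configuration above.
ALTER TABLE my_keyspace.my_table
  WITH cdc = true
  AND memtable_flush_period_in_ms = 10000;
```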
Problem
The memtable is not flushed to the cdc_raw directory periodically; instead, it gets flushed to the commitlog directory when a certain size threshold is reached.
In detail, the following happens:
When a commit log segment reaches 1 MB, it's flushed to the commitlog directory. There is a maximum of 2 commit log files in the commitlog directory (see the setting commitlog_total_space_in_mb: 2). When this threshold is reached, the oldest commit log file in the commitlog directory is moved to the cdc_raw directory.
Question
How can I flush Cassandra CDC changes to disk periodically?