0
votes

Let’s say I have an A-Event KStream aggregated into an A-Snapshot KTable and a B-Event KStream aggregated into a B-Snapshot KTable. Neither A-Snapshot nor B-Snapshot conveys null values (delete events are instead aggregated as state attribute of the snapshot). At this point, we can assume we have a persisted kafka changelog topic and a rocksDB local store for both the A-KTable and B-KTable aggregations. Then, my topology would join the A-KTable with B-KTable to produce a joined AB-KStream. That said, my problem is with the A-KTable and B-KTable materialization lifecycles (both the changelog topic and the local rocksdb store). Let’s say A-Event topic and B-Event topic retention strategies were set to 2 weeks, is there a way to side effect the kafka internal KTable materialization topic retention policy (changelog and rocksDB) with the upstream event topic delete retention policies? Otherwise, can we configure the KTable materialization with some sort of retention policy that would manage both the changelog topic and the rockdb store lifecycle? Considering I can’t explicitly emit A-KTable and B-KTable snapshot tombstones? I am concerned that the changelog and the local store will grow indefinitely,..,

1

1 Answers

2
votes

At the moment, KStream doesn't support out of the box functionality to inject the cleanup in Changelog topics based on the source topics retention policy. By default, it uses the "Compact" retention policy.

There is an open JIRA issue for the same : https://issues.apache.org/jira/browse/KAFKA-4212

One option is to inject tombstone messages but that's not nice way.
In case of windowed store, you can use "compact, delete" retention policy.