Flink keyed state cleanup for incremental RocksDB checkpointing

Question

We have a Flink job that persists large keyed state in the RocksDB state backend, and we are using incremental checkpointing. Over time, the size of the state has become a problem. We looked at the state TTL feature, but it does not support incremental RocksDB state.

What would be the best approach to this problem if we really need incremental checkpointing?

David Anderson · Accepted Answer · 2018-12-13T14:32:06

One commonly used approach is to manage the state in some kind of ProcessFunction and use a timer to clear the state when it is no longer needed -- e.g., if it hasn't been accessed for several hours. A ProcessFunction can register both event-time and processing-time timers, so you can choose whichever is more appropriate for your use case.
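To make the timer idea concrete, here is a minimal sketch of the expiry logic written as plain Java with no Flink dependency, so it is a model of the pattern rather than Flink code. The class name `IdleStateCleanupSketch`, the per-key counter, and the `advanceTo` helper are all hypothetical and exist only for illustration. In a real job the two maps would be keyed state (for example two `ValueState<Long>` handles), the timer map would be replaced by `ctx.timerService().registerProcessingTimeTimer(...)` or `registerEventTimeTimer(...)`, and removing the entries would be `state.clear()` inside `onTimer`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Plain-Java model of timer-based state expiry. Not Flink API; names are hypothetical. */
class IdleStateCleanupSketch {
    private final long idleTimeoutMs;
    // Stands in for keyed state holding the actual payload (here: a per-key counter).
    private final Map<String, Long> countState = new HashMap<>();
    // Stands in for keyed state holding the last access time of each key.
    private final Map<String, Long> lastAccessState = new HashMap<>();
    // Stands in for the timer service: timestamp -> keys with a timer at that time.
    private final TreeMap<Long, Set<String>> timers = new TreeMap<>();

    IdleStateCleanupSketch(long idleTimeoutMs) {
        this.idleTimeoutMs = idleTimeoutMs;
    }

    /** Counterpart of processElement: update state, record the access, register a timer. */
    void processElement(String key, long now) {
        countState.merge(key, 1L, Long::sum);
        lastAccessState.put(key, now);
        timers.computeIfAbsent(now + idleTimeoutMs, t -> new HashSet<>()).add(key);
    }

    /** Advances the simulated clock and fires every timer due at or before 'now'. */
    void advanceTo(long now) {
        while (!timers.isEmpty() && timers.firstKey() <= now) {
            Map.Entry<Long, Set<String>> due = timers.pollFirstEntry();
            for (String key : due.getValue()) {
                onTimer(key, due.getKey());
            }
        }
    }

    /** Counterpart of onTimer: clear state only if the key was idle for the full timeout. */
    private void onTimer(String key, long timerTimestamp) {
        Long lastAccess = lastAccessState.get(key);
        // A timer registered by an older access is stale and must not clear fresh state.
        if (lastAccess != null && timerTimestamp >= lastAccess + idleTimeoutMs) {
            countState.remove(key);
            lastAccessState.remove(key);
        }
    }

    Long count(String key) {
        return countState.get(key);
    }

    int liveKeys() {
        return countState.size();
    }

    public static void main(String[] args) {
        IdleStateCleanupSketch job = new IdleStateCleanupSketch(1000L);
        job.processElement("a", 0L);
        job.processElement("a", 500L);
        job.advanceTo(1000L); // only the stale timer from the first access fires
        System.out.println("a after stale timer: " + job.count("a"));
        job.advanceTo(1500L); // the timer from the latest access fires
        System.out.println("a after idle timeout: " + job.count("a"));
    }
}
```

The check in `onTimer` is the important design choice: every access registers a new timer, so earlier timers for the same key still fire, and comparing the timer timestamp against the stored last-access time keeps them from deleting state that is still in use.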

See the expiring state exercise on the Flink training site for an example of using timers to clear state.
