0
votes

I am a beginner with Flink streaming. I am looking at processing around 5000 incoming events per second, and need to look up events in a window covering the past 3 days. My question is: where does Flink store its window data? Would I be limited by the size of RAM? At 5000 events per second and 2000 bytes per event, I am looking at a very large storage requirement for a three-day window.


1 Answer

1
votes

Flink offers several options for storing the "3 day window data" (commonly referred to as "window state"). By default it is kept in memory, on the JVM heap of the TaskManagers, so as you mentioned it can outgrow RAM pretty quickly depending on the size of the window. To get around that, you can configure a different state backend, e.g. RocksDB, which keeps the working state on each TaskManager's local disk, so state size is limited by disk space rather than by memory. Separately, Flink can periodically save a copy (a snapshot) of the state to durable storage, a process called checkpointing, which is what lets the job recover its window state after a failure.
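
To put a number on the storage requirement, here is a back-of-the-envelope sketch in Java using the figures from the question. The class and method names are made up for illustration, and it only counts raw event bytes. The real state size also depends on serialization overhead and on whether the window keeps every event or only an incrementally updated aggregate.

```java
// Rough sizing sketch; WindowStateEstimate and rawWindowBytes are hypothetical names.
class WindowStateEstimate {
    // Figures taken from the question.
    static final long EVENTS_PER_SECOND = 5_000L;
    static final long BYTES_PER_EVENT = 2_000L;
    static final long WINDOW_DAYS = 3L;

    /** Raw bytes arriving during one full window, ignoring any overhead or compression. */
    static long rawWindowBytes(long eventsPerSecond, long bytesPerEvent, long windowDays) {
        long windowSeconds = windowDays * 24L * 60L * 60L; // 3 days = 259,200 s
        return eventsPerSecond * bytesPerEvent * windowSeconds;
    }

    public static void main(String[] args) {
        long bytes = rawWindowBytes(EVENTS_PER_SECOND, BYTES_PER_EVENT, WINDOW_DAYS);
        // Report in decimal terabytes (1 TB = 10^12 bytes).
        System.out.println("raw bytes per window: " + bytes);
        System.out.println("approx. terabytes:    " + (bytes / 1e12));
    }
}
```

That comes to roughly 2.6 TB of raw events per 3-day window if every event is retained, which is far more than a typical heap can hold. A disk-based backend such as RocksDB, with the state spread across several TaskManagers, is the realistic option at that volume.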

More on this:

[1] https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/state/

[2] https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/state/checkpointing.html

[3] https://ci.apache.org/projects/flink/flink-docs-master/ops/state/state_backends.html#the-memorystatebackend