1
votes

I have gone through a lot of blogs and Stack Overflow answers, but I am still not clear about Flink memory management. In a few blogs I found mentions of the "Memory Manager Pool" and "RocksDB". I am using RocksDB, and I assume all my state is stored in that DB.

Here are my questions:

  • How is memory management handled in streaming?
  • What is the difference between memory management in streaming and batch?
  • What is the difference between the "Memory Manager Pool" and the state backend (RocksDB)?
  • In streaming, what is meant by "Flink Managed Memory"? Does it include the memory required by the RocksDB cache and buffers?

2 Answers

3
votes

Streaming

When you use the RocksDBStateBackend, all keyed state (ValueState, MapState, ... and timers) is stored in RocksDB. Operator state is kept on the heap. Operator state is usually very small and is seldom used directly by a Flink developer.

For Flink 1.10+, managed memory includes all memory used by RocksDB. Flink makes sure that RocksDB's memory usage stays within the limits of the assigned managed memory. Use taskmanager.memory.managed.fraction to tune how much memory you give to RocksDB. As a rule of thumb, you can usually give RocksDB all of the memory except about 500 MB.
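To make the rule of thumb concrete, here is a minimal sketch of the arithmetic in plain Java. It assumes that "memory" means the total Flink memory of the TaskManager (taskmanager.memory.flink.size), since the fraction is applied to that value when no explicit managed memory size is set. The class name, the helper managedFraction and the 4096 MB figure are made up for illustration and are not part of Flink.

```java
import java.util.Locale;

public class ManagedFractionSketch {

    // Hypothetical helper (not a Flink API): the fraction of total Flink memory
    // to assign to managed memory so that roughly `reservedMb` stays outside it.
    static double managedFraction(long totalFlinkMemoryMb, long reservedMb) {
        if (totalFlinkMemoryMb <= 0 || reservedMb < 0 || reservedMb >= totalFlinkMemoryMb) {
            throw new IllegalArgumentException(
                "reserved memory must be non-negative and smaller than total memory");
        }
        return (double) (totalFlinkMemoryMb - reservedMb) / totalFlinkMemoryMb;
    }

    public static void main(String[] args) {
        long totalFlinkMemoryMb = 4096; // assumed example value for taskmanager.memory.flink.size
        long reservedMb = 500;          // the "all but 500 MB" rule of thumb from the answer

        double fraction = managedFraction(totalFlinkMemoryMb, reservedMb);

        // Print a line in the shape of a flink-conf.yaml entry.
        System.out.printf(Locale.ROOT,
            "taskmanager.memory.managed.fraction: %.2f%n", fraction);
    }
}
```

The printed value is only a starting point. The remaining memory still has to cover the JVM heap, network buffers and operator state, so a job with large heap usage may need a smaller fraction.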

Batch

Batch programs do not use a state backend. Managed memory is used for off-heap operations such as joins and sorting. The memory configuration options, such as taskmanager.memory.managed.fraction, are the same for batch and streaming.

0
votes

According to the Flink documentation, memory management is handled differently in streaming and batch.