1
votes

I got a statement below:

"Depending on your state backend, Flink can also manage the state for the application, meaning Flink deals with the memory management (possibly spilling to disk if necessary) to allow applications to hold very large state."

https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/state/state_backends.html

Does it mean that only when the state backends is configured to RocksDBStateBackend, the state would keep in memory and possibly spilling to disk if necessary?

However if configured to MemoryStateBackend or FsStateBackend, the state only keep in memory and would never be spilled to disk.

1

1 Answers

5
votes

Yes in general you are right. Only with RocksDBStateBackend there will be spilling data to disk.

In case of both MemoryStateBackend and FsStateBackend the state is always kept in TaskManagers memory and thus must fit in there. The difference between those two backends is the way they checkpoint data.

  • In case of MemoryStateBackend the checkpoint data is sent to JobManager and kept also in memory there.

  • The FsStateBackend stores data upon checkpoint in FileSystem and sends only small metadata to JobManager (or in HA scenario stores in metadata folder)

Therefore for any production use-cases the RocksDBStateBackend is highly encouraged. More in-depth information you can find here.