3
votes

As far as I know handles Kafka Streams its States localy in memory or on disc or in a Kafka topic because all the input date is from a partition, where all the messages are keyed by a defined value. Most of the time the computations can be done without knowing the state of other Processors. If so, you have another Streams instance whichs calculsates the result. Like in this picture:

enter image description here

Where exactly does Flink store its States? Can Flink also store the states locally or does it always publish them always to all instances (tasks)? Is it possible to configure Flink so that it stores the States in a Kafka Broker?

1

1 Answers

4
votes

Flink also uses local stores (that can be keyed), similar to Kafka Streams. However, it does not write state into Kafka topics.

For fault-tolerance, it takes so-called "distributed snapshots", that are stored in a configurable state backend (eg, HDFS).

Check out the docs for more details: