I'm working with Apache Flink for data streaming and I have a few questions. Any help is greatly appreciated. Thanks.

1) Are there any restrictions on creating tumbling windows? For example, if I want to create a tumbling window per user ID of 2 seconds, and let's say I have more than 10 million user IDs, would that be a problem? (I'm using keyBy on the user ID and then creating a timeWindow of 2 seconds.) How are these windows maintained internally in Flink?

2) I looked at rebalance for round-robin partitioning. Let's say I have a cluster set up, my source has a parallelism of 1, and I do a rebalance: will my data be shuffled across machines to improve performance? If so, is there a specific port over which the data is transferred to the other nodes in the cluster?

3) Are there any limitations on state maintenance? I'm planning to maintain some user-ID-related data, which could grow very large. I read about Flink using RocksDB to maintain state. I just wanted to check whether there are any limitations on how much data can be maintained.

4) Also, where is the state maintained if the amount of data is small? (I guess in JVM memory.) If I have several machines in my cluster, can every node get the current version of the state?

1 Answer

  1. If you keyBy your stream on the user ID, Flink will internally partition the stream by user, so the users are distributed across a set of parallel subtasks. The per-key window contents are kept as managed keyed state in the configured state backend. The parallelism of the window operator controls the load on each parallel subtask. Handling 10 million users should be no problem if you assign enough machines and configure the parallelism of the program appropriately (see the windowing sketch after this list).

  2. Yes, rebalance() will shuffle data over the network if your job runs on multiple machines. With the default configuration, the data port is chosen automatically. If you need a fixed port, you can set it with the taskmanager.data.port configuration key (see the rebalance sketch below).

  3. The state size limit depends on the configured state backend. With the RocksDB state backend, the limit is the size of your local file system, i.e., RocksDB spills data to disk. If you hit this limit, you can increase the parallelism, because each parallel subtask usually handles the state of many keys, so adding workers spreads the state over more disks (see the state backend sketch below).

  4. Where the state is persisted (disk or memory) depends on the implementation of the state backend. I would assume that the RocksDB state backend, which writes to disk, also caches some data in memory. Please note that operator state is not globally accessible: each parallel subtask of an operator has access only to its own local state and cannot read or write the state of another subtask of the same operator (see the keyed state sketch below).
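
Regarding 1., here is a minimal sketch of keying a stream by user ID and applying a 2-second tumbling window. The `UserEventSource` source, the `UserEvent` type, and its `getUserId()` / `merge()` methods are hypothetical placeholders; `keyBy` and `timeWindow` are the standard DataStream API calls.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(8); // spread the key space over 8 parallel subtasks

env.addSource(new UserEventSource())    // hypothetical source of UserEvent records
   .keyBy(event -> event.getUserId())   // hash-partition the stream by user ID
   .timeWindow(Time.seconds(2))         // tumbling 2-second window per key
   .reduce((a, b) -> a.merge(b))        // hypothetical per-window aggregation
   .print();

env.execute("tumbling-window-per-user");
```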
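
Regarding 2., a sketch of a parallelism-1 source followed by rebalance(). The source and the `enrich` function are hypothetical; rebalance() itself is the standard round-robin repartitioning operator.

```java
env.addSource(new UserEventSource())  // hypothetical non-parallel source
   .setParallelism(1)
   .rebalance()                       // round-robin redistribution across all
                                      // downstream subtasks, over the network
                                      // when they run on other TaskManagers
   .map(event -> enrich(event))       // hypothetical downstream work
   .setParallelism(8);
```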
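
Regarding 3., enabling the RocksDB state backend looks roughly like this (Flink 1.x API, requires the flink-statebackend-rocksdb dependency; the HDFS checkpoint URI is a placeholder):

```java
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Keyed state lives in local RocksDB instances and spills to local disk;
// checkpoints are written to the durable path given here (placeholder URI).
// Note: the RocksDBStateBackend constructor declares a checked IOException.
env.setStateBackend(new RocksDBStateBackend("hdfs://namenode:8020/flink/checkpoints"));
```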
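
Regarding 4., a sketch of how keyed state locality plays out in practice, again with a hypothetical `UserEvent` type. The ValueState handle only ever resolves to the state of the key currently being processed by this subtask:

```java
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Counts events per user in keyed state. Each parallel subtask only ever
// sees the state of the keys that are routed to it.
public class PerUserCounter extends RichFlatMapFunction<UserEvent, Long> {

    private transient ValueState<Long> count; // local to this subtask's keys

    @Override
    public void open(Configuration parameters) {
        count = getRuntimeContext().getState(
                new ValueStateDescriptor<>("per-user-count", Long.class));
    }

    @Override
    public void flatMap(UserEvent event, Collector<Long> out) throws Exception {
        Long current = count.value();           // state of the current key only
        long updated = (current == null) ? 1L : current + 1L;
        count.update(updated);
        out.collect(updated);
    }
}
```

You would apply it after keying the stream, e.g. `stream.keyBy(e -> e.getUserId()).flatMap(new PerUserCounter())`.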