We have a flink application that has a map operator at the start. The output stream of this operation is routed to multiple window functions using filters. The window functions all have a parallelism of 1. We form a union of the output of the window functions and pass it to another map function and then send it to a sink.
We need the parallelism of both the map functions to take the parallelism of the environment. This happens as expected and the parallelism of the window function does turn out to be 1. We have set 1 slot per task manager.
The issue is that all the window function tasks end up going to only the 1st task manager when we set the parallelism of the environment to greater than 1. The events end up going to this task manager alone and end up causing a bottleneck. Is there a way to distribute the window function task across multiple task managers when we have parallelism > 1? Will doing a rebalance()
help?