https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html
I'm reading this doc of Flink and I can't quite understand the part of Execution Environment Level well.
Let's use the example of WordCount.
So if I code env.setParallelism(3);
in this example, does it mean that I will have three parallel pipelines of Source + map() --- keyBy()/window()/apply() --- Sink
? What makes me confused is if I have three Sink
s, how could I get the result correctly?
If there is only one Sink
, I think there won't be any issues. I mean no matter how many Source + map()
I have, the only Sink
can produce one result. But now I have three Sink
s...
// Case 1
Source + map() --- keyBy()/window()/apply() ----\
Source + map() --- keyBy()/window()/apply() --- Sink (the only Sink will merge the outputs coming from three pipelines and produce only one result)
Source + map() --- keyBy()/window()/apply() ----/
// Case 2
Source + map() --- keyBy()/window()/apply() --- Sink
Source + map() --- keyBy()/window()/apply() --- Sink
Source + map() --- keyBy()/window()/apply() --- Sink
// There are three sinks, how could I get the result?
So we shouldn't use setParallelism()
in this example or I misunderstood something?