I came across the example below in the Flink documentation on parallel execution and have a few related questions:
Does setParallelism(5) set a parallelism of 5 only for sum, or for both flatMap and sum?
Is it possible to set a different parallelism for each operator, such as flatMap and sum? For example, a parallelism of 5 for sum and 10 for flatMap.
As I understand it, keyBy partitions the DataStream into logical streams (partitions) based on the key values. Suppose there are 10,000 distinct key values, and therefore 10,000 different partitions. How many threads would handle those 10,000 partitions? Just 5? And what happens if we don't call setParallelism(5) at all?
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/parallel.html
final StreamExecutionEnvironment env =
    StreamExecutionEnvironment.getExecutionEnvironment();

DataStream<String> text = [...]
DataStream<Tuple2<String, Integer>> wordCounts = text
    .flatMap(new LineSplitter())
    .keyBy(0)
    .timeWindow(Time.seconds(5))
    .sum(1).setParallelism(5);

wordCounts.print();

env.execute("Word Count Example");
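
To make the keyBy question concrete, here is a rough sketch of how I picture keys being mapped to parallel instances. It is plain Java with no Flink dependency, and everything in it is an assumption on my part: the class KeyPartitionSketch and the helpers subtaskFor and keysPerSubtask are hypothetical names, and taking hashCode modulo the parallelism is only a stand-in, not necessarily how Flink really assigns keys.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyPartitionSketch {

    // Hypothetical helper: pick one of `parallelism` instances for a key.
    // Assumption: a plain hash-modulo scheme, used here only for illustration.
    static int subtaskFor(String key, int parallelism) {
        // floorMod keeps the result in [0, parallelism) even for negative hashes.
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Hypothetical helper: count how many distinct keys land on each instance.
    static Map<Integer, Integer> keysPerSubtask(int numKeys, int parallelism) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int i = 0; i < numKeys; i++) {
            int subtask = subtaskFor("key-" + i, parallelism);
            counts.merge(subtask, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // 10,000 distinct keys spread over 5 parallel instances.
        Map<Integer, Integer> counts = keysPerSubtask(10_000, 5);
        counts.forEach((subtask, n) ->
            System.out.println("instance " + subtask + " handles " + n + " keys"));
    }
}
```

If this picture is right, the 10,000 keys would be shared among the 5 parallel instances of sum rather than each key getting its own thread. That is the part I would like confirmed.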