We are using Flink 1.2.1 and consuming from two Kafka streams by unioning one stream with the other and processing the unioned stream, e.g. stream1.union(stream2). However, stream2 has more than 100 times the volume of stream1, and we are seeing a huge consumer lag (more than 3 days of data) on stream2 but very little lag on stream1. We already have 9 partitions, but a parallelism of 1. Would increasing the parallelism solve the consumer lag for stream2, or should we not use union in this case at all?
2 Answers
The .union() shouldn't be contributing to the time lag, AFAIK.
And yes, increasing parallelism should help, if in fact the lag in processing is due to your consuming operators (or sink) being CPU constrained.
If the problem is with something at the sink end which can't be helped by higher parallelism (e.g. you are writing to a DB, and it's at its maximum ingest rate), then increasing the sink parallelism won't help, of course.
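To make the point about bottlenecks concrete, here is a minimal sketch in plain Java (no Flink dependency). It assumes a simplified model in which the sustained rate of the job is capped by its slowest stage. The class name, method names, and all rates are hypothetical and chosen only for illustration; they are not measurements from the job in the question.

```java
class PipelineBottleneck {
    // Simplified model: a chain of stages can sustain at most the rate
    // of its slowest stage (records per second).
    static double sustainedRate(double... stageRates) {
        double min = Double.POSITIVE_INFINITY;
        for (double rate : stageRates) {
            min = Math.min(min, rate);
        }
        return min;
    }

    // Backlog growth in records per second: arrival rate minus what the
    // pipeline can sustain, never negative.
    static double lagGrowth(double arrivalRate, double... stageRates) {
        return Math.max(0.0, arrivalRate - sustainedRate(stageRates));
    }

    public static void main(String[] args) {
        double arrival = 10_000; // hypothetical records/sec arriving in Kafka

        // Hypothetical stage capacities: source, operators, sink.
        System.out.println("sink-bound lag growth: "
                + lagGrowth(arrival, 12_000, 15_000, 5_000));

        // Doubling operator capacity changes nothing while the sink is the cap.
        System.out.println("after scaling operators: "
                + lagGrowth(arrival, 12_000, 30_000, 5_000));

        // Only raising the sink's capacity removes the backlog growth.
        System.out.println("after scaling the sink: "
                + lagGrowth(arrival, 12_000, 15_000, 12_000));
    }
}
```

In this model, adding parallelism only helps if it raises the capacity of whichever stage is currently the minimum.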
Yes, try increasing the parallelism for the stream2 source - it should help:
env.addSource(kafkaStream2Consumer).setParallelism(9)
At the moment you have a bottleneck of 1 core, which needs to keep up with consuming stream2 data. To fully utilise Kafka's parallelism, the FlinkKafkaConsumer parallelism should match the number of topic partitions it is consuming from (9 here). Source subtasks beyond the partition count are not assigned a partition and sit idle.
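To illustrate how partitions spread over source subtasks, here is a small self-contained Java sketch. It assumes a simple modulo rule (partition i is read by subtask i % parallelism); that rule and the class and method names are assumptions for illustration, not the connector's actual code, and the exact assignment can differ between Flink versions.

```java
import java.util.ArrayList;
import java.util.List;

class PartitionAssignment {
    // Assumed rule for illustration: partition i goes to subtask (i % parallelism).
    static List<List<Integer>> assign(int numPartitions, int parallelism) {
        List<List<Integer>> perSubtask = new ArrayList<>();
        for (int s = 0; s < parallelism; s++) {
            perSubtask.add(new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            perSubtask.get(p % parallelism).add(p);
        }
        return perSubtask;
    }

    // Number of subtasks that end up with no partition to read.
    static long idleSubtasks(int numPartitions, int parallelism) {
        return assign(numPartitions, parallelism).stream()
                .filter(List::isEmpty)
                .count();
    }

    public static void main(String[] args) {
        int numPartitions = 9; // partition count from the question
        for (int parallelism : new int[] {1, 3, 9, 12}) {
            System.out.println("parallelism=" + parallelism
                    + " -> " + assign(numPartitions, parallelism)
                    + ", idle subtasks: " + idleSubtasks(numPartitions, parallelism));
        }
    }
}
```

With a parallelism of 1, the single subtask reads all 9 partitions. With 9, each subtask reads exactly one. With 12, three subtasks have nothing to read.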
TimeCharacteristic for the execution environment? – kkrugler