5
votes

I'm using Kafka Streams to do concurrent work on a Kafka topic.

The stream has the following form:

stream(topic)
 .map(somefunction)
 .through(secondtopic)

I've configured Kafka Streams with 15 worker threads, but the work doesn't seem to be balanced across them correctly, if it is distributed at all. I expected it to be spread evenly among the worker threads. Might there be something wrong with my setup?

snapshot from jvisualvm

How many partitions are available in your topic? - Kamal Chandraprakash
The jvisualvm snapshot link is broken. - Kamal Chandraprakash

1 Answer

12
votes

You can only have as many active threads as there are input Kafka topic partitions. Any threads configured beyond that number stay idle.

All messages within one partition are handled by a single thread, which preserves the order of message delivery within that partition.

In Kafka Streams, it is the input topic partitions, not individual messages, that are distributed evenly: each partition is assigned to a task, and tasks are spread across the threads.
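
The effect of having fewer partitions than threads can be sketched with a toy model. The Java below is not Kafka Streams code and does not reproduce its real assignment logic; it simply assumes one task per input partition and deals the tasks round-robin to threads. The class and method names are made up for illustration.

```java
import java.util.Arrays;

public class ThreadAssignmentSketch {

    // Simplified model (assumption): one task per input partition,
    // tasks dealt round-robin to the configured threads.
    public static int[] tasksPerThread(int numPartitions, int numThreads) {
        int[] load = new int[numThreads];
        for (int task = 0; task < numPartitions; task++) {
            load[task % numThreads]++;
        }
        return load;
    }

    // Number of threads that received at least one task.
    public static long busyThreads(int[] load) {
        return Arrays.stream(load).filter(n -> n > 0).count();
    }

    public static void main(String[] args) {
        // Hypothetical setup: a 4-partition input topic and 15 threads.
        int[] load = tasksPerThread(4, 15);
        System.out.println("tasks per thread: " + Arrays.toString(load));
        System.out.println("busy threads: " + busyThreads(load));
    }
}
```

In this model, a 4-partition topic with 15 threads leaves 11 threads without any task, which would look like "no balancing" in a profiler. The real thread count is set through the `num.stream.threads` configuration property.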

So, the work is well balanced between threads only if messages are well balanced between partitions.
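
To illustrate why skewed keys lead to skewed threads, here is a small sketch that counts messages per partition when one key dominates. It is only an approximation: `partitionFor` is a stand-in based on `String.hashCode()`, not Kafka's actual default partitioner, and the key names and the 90% share are invented for the example.

```java
import java.util.Arrays;

public class PartitionSkewSketch {

    // Stand-in for a key-based partitioner (assumption: Kafka's default
    // partitioner uses a different hash; only the principle is the same).
    public static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Counts how many messages land in each partition.
    public static int[] messagesPerPartition(String[] keys, int numPartitions) {
        int[] counts = new int[numPartitions];
        for (String key : keys) {
            counts[partitionFor(key, numPartitions)]++;
        }
        return counts;
    }

    // Builds 1000 hypothetical message keys; hotPercent of them share the key "hot".
    public static String[] sampleKeys(int hotPercent) {
        String[] keys = new String[1000];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = (i % 100 < hotPercent) ? "hot" : "key-" + i;
        }
        return keys;
    }

    public static void main(String[] args) {
        int[] counts = messagesPerPartition(sampleKeys(90), 15);
        System.out.println("messages per partition: " + Arrays.toString(counts));
    }
}
```

Because every message with the key "hot" maps to the same partition, at least 900 of the 1000 messages end up in one partition, and therefore on one thread, no matter how many threads are configured.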

For more information about the threading model, have a look at the Confluent documentation.