0
votes

does anybody knows of a possible reason of slowing down messages processing when more Kafka brokers are added to the cluster?

The situation is the following:

1 setup: In a Kafka cluster of 3 brokers I produce some messages to 50 topics (replication factor=2, 1 partition, ack=1), each has a consumer assigned. I measure the avg time to process 1 message (from producing to consuming).

2 setup: I add 2 more Kafka brokers to the cluster - they are created by the same standard tool, so have the same characteristics like cpu/ram, and the same Kafka configs. I create 50 new topics (replication factor=2, 1 partition, ack=1) - just to save my time and not doing replicas reassignment. So the replicas are spread over the 5 brokers. I produce some messages only to the new 50 topics and measure the avg processing time - it became slower in almost 1/3.

So I didn't change any settings of producer, consumers or brokers (except for listing 2 new brokers in the config of Kafka and zookeeper), and can't explain the performance drop. Please point me to any config option/log file/useful article that would help to explain this, and thank you so much in advance.

1

1 Answers

0
votes

In a Kafka cluster of 3 brokers I produce some messages to 50 topics

In the first setup, you have 50 topics with 3 brokers.

I add 2 more Kafka brokers to the cluster. I create 50 new topics

In the second setup, you have 100 topics with 5 brokers.

Even supposing scaling should be linear, 100 topics should contain 6 brokers but not 5


So the replicas are spread over the 5 brokers

  1. Here, how the replicas are spread also matters. A broker may be serving 10 partitions as leader, another broker may be serving 7 and so on. This being the case, a particular broker may have more load compared to other brokers. This could be the cause for slow down.
  2. Also, when you have replication.factor=2, what matters here is whether acks=all or acks=1 or acks=0. If you have put acks=all, then all the replicas must acknowledge the write to the producer which could slow it down.
  3. Next is the locality and configuration of the new brokers, under what machine configurations they are running, their CPU config, RAM, processor load, network between the old brokers, new brokers and clients are also worth considering.
  4. Moreover, if your application is consuming a lot of topics, it necessarily would have to make requests to a lot of brokers since the topic partitions are spread among different brokers. Utilizing one broker to the fullest (CPU, memory etc) vs utilizing multiple brokers can be benchmarked.