1
votes

I am designing a high-throughput system with several producers.

My topics will be partitioned. Producers will be sending records as key-value pairs.

Keys will be used to partition the data.

Consumers will be organized in consumer groups (they will be assigned the same group id so that they can simultaneously consume messages from the same topic, but from different partitions).

Kafka guarantees the order of messages within a single partition.

Consumers will be assigned their fair share of partitions.

The only thing that worries me is that my partition key won't distribute messages in a round-robin fashion, so some partitions may be busier than others.

Q.: May uneven partitions impact performance of a Kafka cluster in any way? Are there any red flags?

I understand that some consumers will have more work to do, but that is not my main concern. Any help in this matter will be appreciated.


2 Answers

3
votes

To the previous good answer I'd like to add that even the replication factor can have an impact on your use case.

The follower nodes of a busy partition may also be leaders for other partitions. So in addition to copying a lot of messages from the busy partition, they have to handle incoming messages from producers for the partitions they lead. These nodes will therefore be under heavy load too.
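To make the replication effect concrete, here is a minimal sketch that adds up the inbound traffic each broker sees, both as a leader (from producers) and as a follower (from replication). The broker ids, partition names, rates, and replica assignment are all hypothetical, and the model ignores consumer fetches and disk I/O.

```python
from collections import defaultdict


def broker_inbound(partition_rates, assignment):
    """Sum inbound MB/s per broker, split by role.

    partition_rates: {partition: produce rate in MB/s}
    assignment: {partition: [leader_broker, follower_broker, ...]}
    """
    load = defaultdict(lambda: {"leader": 0.0, "follower": 0.0})
    for partition, replicas in assignment.items():
        rate = partition_rates[partition]
        leader, *followers = replicas
        # The leader receives the producers' writes.
        load[leader]["leader"] += rate
        # Every follower copies the same bytes from the leader.
        for follower in followers:
            load[follower]["follower"] += rate
    return dict(load)


# Hypothetical cluster: 3 brokers, replication factor 2, one hot partition.
rates = {"p0": 50.0, "p1": 5.0, "p2": 5.0}
assignment = {"p0": [1, 2], "p1": [2, 3], "p2": [3, 1]}

load = broker_inbound(rates, assignment)
for broker in sorted(load):
    print(broker, load[broker])
```

In this made-up layout, broker 2 only leads a quiet partition but still ingests the hot partition's traffic as a follower, so its total inbound load is the same as that of the hot partition's leader.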

1
votes

May uneven partitions impact performance of a Kafka cluster in any way?

Each partition has one server which acts as the "leader" and zero or more servers which act as "followers". The leader handles all read and write requests for the partition while the followers passively replicate the leader. So if your producers send most of their messages to one partition, the leader of that partition gets most of the work. If messages flood a single partition, writes to it will lag and the node hosting its leader will slow down.
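A rough way to estimate how uneven the partitions will be is to hash a sample of your keys before going to production. The sketch below is only an approximation: it uses CRC32 as a stand-in hash (Kafka's Java default partitioner hashes the key bytes with murmur2, so the exact partition numbers will differ), and the key names and the 70% hot-key workload are made up.

```python
import zlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash (assumption): CRC32 instead of Kafka's murmur2.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_counts(keys, num_partitions):
    """Return the number of messages landing on each partition."""
    counts = Counter(partition_for(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def skew_ratio(counts):
    """Busiest partition relative to the mean; 1.0 means perfectly even."""
    mean = sum(counts) / len(counts)
    return max(counts) / mean if mean else 0.0


# Hypothetical workload: one hot key produces 70% of all messages.
keys = ["customer-hot"] * 7000 + [f"customer-{i}" for i in range(3000)]
counts = partition_counts(keys, 6)
print(counts, round(skew_ratio(counts), 2))
```

A skew ratio well above 1 means one leader will take a disproportionate share of the writes, whichever hash function is used, because all messages with the same key go to the same partition.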

Are there any red flags?

quoted from here:

Kafka uses Yammer Metrics for metrics reporting in both the server and the client. This can be configured to report stats using pluggable stats reporters to hook up to your monitoring system. The easiest way to see the available metrics is to fire up jconsole and point it at a running kafka client or server; this will allow browsing all metrics with JMX.

Some metrics may reflect this slowdown:

Time the request waits in the request queue

Time the request is processed at the leader

Time the request waits for the follower

Time the request waits in the response queue
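For reference, the four timings above correspond to broker request metrics that the Kafka monitoring documentation lists under `kafka.network:type=RequestMetrics`. The sketch below just builds the JMX object names to look for in jconsole; the helper name `mbean_names` is hypothetical, and the exact names should be checked against your broker version.

```python
# Metric name -> what it measures (descriptions taken from the list above).
REQUEST_TIME_METRICS = {
    "RequestQueueTimeMs": "time the request waits in the request queue",
    "LocalTimeMs": "time the request is processed at the leader",
    "RemoteTimeMs": "time the request waits for the follower",
    "ResponseQueueTimeMs": "time the request waits in the response queue",
}


def mbean_names(request_type="Produce"):
    """Build JMX object names for one request type.

    request_type is typically Produce, FetchConsumer or FetchFollower.
    """
    return [
        f"kafka.network:type=RequestMetrics,name={name},request={request_type}"
        for name in REQUEST_TIME_METRICS
    ]


for name in mbean_names("Produce"):
    print(name)
```

Comparing these values across brokers is one way to spot the node that hosts the leader of a hot partition.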