
We are running a 16-node Kafka cluster on AWS; each node is an m4.xlarge EC2 instance with a 2 TB EBS (st1) volume. The Kafka version is 0.10.1.0, and we have about 100 topics at the moment. Some busy topics see about 2 billion events per day, while some low-volume topics only get a few thousand per day.

Most of our topics use a UUID as the partition key when we produce messages, so the partitions are quite evenly distributed.

We have quite a lot of consumers consuming from this cluster using consumer groups. Each consumer has a unique group id. Some consumer groups commit offsets every 500 ms; others commit offsets synchronously as soon as they finish processing a batch of messages.
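For reference, the two commit styles described above correspond to the standard Kafka consumer settings below (a minimal sketch; the property names are the stock consumer configs, and the 500 ms interval is the one from our setup — the synchronous style additionally calls `commitSync()` in the poll loop):

```java
import java.util.Properties;

public class CommitConfigs {
    // Style 1: let the client auto-commit offsets on a fixed interval.
    public static Properties autoCommitProps() {
        Properties props = new Properties();
        props.setProperty("enable.auto.commit", "true");
        props.setProperty("auto.commit.interval.ms", "500");
        return props;
    }

    // Style 2: disable auto-commit; the application calls
    // consumer.commitSync() after each batch is fully processed.
    public static Properties manualCommitProps() {
        Properties props = new Properties();
        props.setProperty("enable.auto.commit", "false");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(autoCommitProps());
        System.out.println(manualCommitProps());
    }
}
```

Either way, every commit is a write to the group's single `__consumer_offsets` partition, which is what drives the traffic pattern described below.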

Recently we observed that some brokers are far busier than others. After some digging, we found that quite a lot of the traffic goes to "__consumer_offsets", so we created a tool to inspect the high watermark of each partition of "__consumer_offsets", which revealed that the partitions are very unevenly distributed.

Based on this link, "Consumer offset management in Kafka":

It seems this is intended behaviour: each consumer group has a single coordinator (the leader of its "__consumer_offsets" partition), so all committed offsets for that group go to the same broker, and only the "group.id" is used to decide the partition.

Given that some of our consumers consume from those very busy topics, their offset commits generate a lot of traffic to the "__consumer_offsets" topic on the broker that handles the consumer group.

My questions are:

1. Is there a way to make sure that the consumer groups that consume from busy topics don't all fall onto the same broker? We don't want to create a hotspot.

2. For consumers that consume from busy topics (topics with billions of messages per day), is it a good idea to use a consumer group?

Thanks in advance

Hey Johnny, did you happen to find a resolution/answer for this? I am struggling with the same issue of imbalanced message-in rates for a particular broker. stackoverflow.com/questions/49607708/… – palash kulshreshtha

No I didn't, but I intend to commit offsets to some key-value store, like Couchbase, going forward. Your issue seems quite different from mine, though; mine is on the __consumer_offsets topic. – Johnny

Actually, the imbalanced message-in rates drilled down to excessive writes on a single partition of the __consumer_offsets topic. – palash kulshreshtha

Your question was the answer to my question, thanks :) – palash kulshreshtha

1 Answer


Regarding question #1: no, there is no way, at least as of Kafka 1.0.0, which computes the partition as "Utils.abs(groupId.hashCode) % groupMetadataTopicPartitionCount". So the same group id will always fall into the same partition.
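You can reproduce this mapping outside the broker to see which "__consumer_offsets" partition (and hence which coordinator broker) each of your group ids lands on. A minimal sketch, assuming the default of 50 partitions (`offsets.topic.num.partitions`); note that Kafka's `Utils.abs` clears the sign bit rather than calling `Math.abs`:

```java
public class OffsetsPartition {
    // Mirrors Utils.abs(groupId.hashCode) % groupMetadataTopicPartitionCount.
    // Kafka's Utils.abs is (n & 0x7fffffff), which, unlike Math.abs,
    // is also well-defined for Integer.MIN_VALUE.
    public static int partitionFor(String groupId, int numPartitions) {
        return (groupId.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // "billing-consumer" and "audit-consumer" are made-up example group ids.
        for (String groupId : new String[] {"billing-consumer", "audit-consumer"}) {
            System.out.println(groupId + " -> partition " + partitionFor(groupId, 50));
        }
    }
}
```

The only lever this leaves you is choosing group ids that happen to hash to different partitions, so their coordinators end up on different brokers.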

For question #2: first, if your consumer can keep up with the producers, it's fine to use a single consumer. If the consumer lag keeps increasing, you should consider using a group to speed things up. Remember that the maximum number of consumers in one group is limited by the partition count of the topic you are consuming from. Second, from the consumer's point of view, a group also serves as an HA solution.