0
votes

I have a use case where a message needs to be broadcast to all nodes in a horizontally scalable, stateless application cluster, and I am considering Kafka for it. Since each node of the cluster needs to receive ALL messages in the topic, each node needs its own consumer group.

Assume that the message volume is low enough for each node to handle every message.

To achieve this with Kafka, I would end up using the instanceId (or some other unique identifier) of the consumer process as the consumer group id when consuming from the topic. This will drive up the number of consumer groups, and every redeployment will create new ones.
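A minimal sketch of the per-instance group id idea, assuming the kafka-python client (my choice for illustration, nothing above depends on it). The helper names `per_instance_group_id` and `consumer_settings` and the `broadcast` prefix are hypothetical; the dictionary keys are kafka-python `KafkaConsumer` keyword arguments.

```python
import socket
import uuid


def per_instance_group_id(prefix: str = "broadcast") -> str:
    """Build a consumer group id that is unique to this process.

    The hostname makes the id traceable to a node; the random suffix
    keeps it unique across restarts and redeployments.
    """
    return f"{prefix}-{socket.gethostname()}-{uuid.uuid4().hex}"


def consumer_settings(bootstrap_servers: str) -> dict:
    """Settings for a consumer that is the only member of its own group."""
    return {
        "bootstrap_servers": bootstrap_servers,
        "group_id": per_instance_group_id(),
        # A brand-new group has no committed offsets, so this setting
        # decides where it starts reading.
        "auto_offset_reset": "latest",
    }
```

With kafka-python installed, these settings would be passed along the lines of `KafkaConsumer("my-topic", **consumer_settings("localhost:9092"))`, where the topic name and broker address are placeholders.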

  • What is the maximum number of active consumer groups I can have at any given time? Will the number of consumer groups become a bottleneck before other limits (such as bandwidth) kick in?
  • Frequent deployments of the consumer application will cause churn in the active consumer groups. Can Kafka sustain this churn over long periods of time?
1
You don't need to set a group.id. You don't need a consumer group to consume. - Igor Soarez

1 Answer

1
votes

Answering my own question: one solution that came out of further research is to consume with the Kafka assign() API instead of the subscribe() API. assign() does not need a consumer group. I just configure every node to consume messages from all the partitions of the topic.
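A minimal sketch of this approach, assuming the kafka-python client (chosen for illustration; the Java client exposes an assign() call as well). The function names `standalone_settings` and `consume_all_partitions` are hypothetical, and the partition list is read once at startup, so partitions added to the topic later would not be picked up without re-assigning.

```python
def standalone_settings(bootstrap_servers: str) -> dict:
    """Settings for a consumer that does not join any consumer group."""
    return {
        "bootstrap_servers": bootstrap_servers,
        # No group id: no group coordination and no offset commits.
        "group_id": None,
        "enable_auto_commit": False,
        # With no committed offsets, start from new messages only.
        "auto_offset_reset": "latest",
    }


def consume_all_partitions(topic: str, bootstrap_servers: str):
    """Yield every record from every partition of `topic`."""
    # Imported here so the settings helper works without kafka-python.
    from kafka import KafkaConsumer, TopicPartition

    consumer = KafkaConsumer(**standalone_settings(bootstrap_servers))
    try:
        # partitions_for_topic returns a set of partition ids, or None
        # if the topic metadata is not available.
        partition_ids = consumer.partitions_for_topic(topic) or set()
        # Manual assignment instead of subscribe(): no rebalancing.
        consumer.assign([TopicPartition(topic, p) for p in sorted(partition_ids)])
        for record in consumer:
            yield record
    finally:
        consumer.close()
```

Each node would then run something like `for record in consume_all_partitions("my-topic", "localhost:9092"): ...`, with the topic name and broker address as placeholders.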

Credit to Igor Soarez, whose comment seeded the idea that a consumer group is not needed in order to consume.