69
votes

We are planning to write a Kafka consumer (in Java) that reads a Kafka queue and performs the action specified in each message.

Since the consumers run independently, will each message be processed by only one consumer at a time? Or will all the consumers process the same message, since each has its own offset in the partition?

Please help me understand.

3
Looks like Kafka doesn't have queues. It has only topics. – gstackoverflow
All Kafka topics are ordered sets - in other words, they are queues. – Rodney P. Barbati
Kafka topics are not queues, because once a message is consumed from a topic, it stays there (unless its lifetime has expired) and the offset moves to the next, whereas for a queue, once a message is consumed, the message is removed from that queue. Ordering is also per partition only. – jumping_monkey

3 Answers

143
votes

It depends on the Group ID. Suppose you have a topic with 12 partitions. If you have 2 Kafka consumers with the same Group ID, each will read 6 partitions, meaning they read different sets of partitions and therefore different sets of messages. If you have 4 Kafka consumers with the same Group ID, each will read three different partitions, and so on.

But when you set different Group IDs, the situation changes. If you have two Kafka consumers with different Group IDs, each will read all 12 partitions without any interference from the other, meaning both consumers will read the exact same set of messages independently. If you have four Kafka consumers with different Group IDs, each will read all partitions, and so on.
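To make the arithmetic above concrete, here is a minimal sketch in plain Java that imitates the outcome of partition assignment. It does not use the Kafka client and is not Kafka's actual assignment algorithm; the class name `GroupAssignmentSketch`, the `assign` method, and the consumer names are all made up for illustration, and the simple round-robin split is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupAssignmentSketch {

    // Splits partitions 0..partitionCount-1 across the members of ONE consumer group,
    // round-robin. Illustration only: this imitates the result, not Kafka's own code.
    public static Map<String, List<Integer>> assign(int partitionCount, List<String> members) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String member : members) {
            assignment.put(member, new ArrayList<>());
        }
        for (int p = 0; p < partitionCount; p++) {
            // Each partition gets exactly one owner within the group.
            String owner = members.get(p % members.size());
            assignment.get(owner).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Same group, 2 consumers: 6 partitions each, no overlap.
        System.out.println(assign(12, Arrays.asList("c1", "c2")));
        // Same group, 4 consumers: 3 partitions each.
        System.out.println(assign(12, Arrays.asList("c1", "c2", "c3", "c4")));
        // Two different groups with one consumer each: every group is assigned
        // independently, so each consumer owns all 12 partitions.
        System.out.println(assign(12, Arrays.asList("groupA-c1")));
        System.out.println(assign(12, Arrays.asList("groupB-c1")));
    }
}
```

The key point is that `assign` is called once per group: consumers inside one call never share a partition, while separate calls (separate groups) each cover the whole topic.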

27
votes

I found this image from O'Reilly helpful:

[Image: kafka]

Within same group: NO

  • Two consumers (Consumer 1, 2) within the same group (Group 1) CANNOT consume the same message from a partition (Partition 0).

Across different groups: YES

  • Two consumers in two groups (Consumer 1 from Group 1, Consumer 1 from Group 2) CAN consume the same message from a partition (Partition 0).

17
votes

Kafka will deliver each message in the subscribed topics to one process in each consumer group. This is achieved by balancing the partitions between all members in the consumer group so that each partition is assigned to exactly one consumer in the group. Conceptually you can think of a consumer group as being a single logical subscriber that happens to be made up of multiple processes.

In simpler words, a Kafka message/record is processed by only one consumer process per consumer group. So if you want multiple consumers to process the same message/record, put those consumers in different groups.
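As a rough sketch of how this choice shows up in a Java consumer's configuration: the only setting that differs between "share the work" and "everyone gets every message" is `group.id`. The snippet below only builds the `java.util.Properties`. With the kafka-clients library on the classpath, they would typically be passed to `new KafkaConsumer<String, String>(props)`. The broker address, the group names `action-workers` and `audit-log`, and the class and method names are assumptions made up for this example.

```java
import java.util.Properties;

public class ConsumerGroupConfigSketch {

    // Builds consumer settings; only group.id differs between callers.
    public static Properties consumerProps(String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // Two workers in the same (hypothetical) group: they split the partitions,
        // so each message is handled by only one of them.
        Properties workerA = consumerProps("action-workers");
        Properties workerB = consumerProps("action-workers");

        // A consumer in a different (hypothetical) group: it reads every message
        // on its own, regardless of what the workers have consumed.
        Properties auditor = consumerProps("audit-log");

        System.out.println("workerA group: " + workerA.getProperty("group.id"));
        System.out.println("workerB group: " + workerB.getProperty("group.id"));
        System.out.println("auditor group: " + auditor.getProperty("group.id"));
    }
}
```

For the use case in the question (perform each action once), running all instances with the same `group.id` is the relevant setup.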