0
votes

I have multiple consumers all with the same group.id listening for a particular topic. The topic has one partition.

It is my understanding that consumers in the same consumer group (identified by an identical group.id) receive messages in a round-robin fashion, so that each message is handled by only one consumer. The consumers run in different Windows Services on different machines.

The consumer is written in C# and based on Confluent's Apache Kafka .NET client.

The configuration looks like:

        var config = new Dictionary<string, object>
        {
            {"group.id", "MyConsumerGroupId"},
            {"enable.auto.commit", true},
            {"auto.commit.interval.ms", 5000},
            {"log.connection.close", false},
            {"session.timeout.ms", 30000},
            {"heartbeat.interval.ms", 5000},
            {"queued.min.messages", 1000},
            {"partition.assignment.strategy", "roundrobin"},
            {"bootstrap.servers", _kafkaCluster},
            {
                "default.topic.config", new Dictionary<string, object>
                {
                    {"auto.offset.reset", "largest"}
                }
            }
        };

However, I see that all consumers get the same messages. Each consumer logs information about every message it receives, and the logs contain multiple entries with the same message, topic, offset, and partition.

Is this the expected behaviour?

2
Actually, it is odd that you are getting the same message, given that all consumers belong to the same group.id. But when you say "The consumers are running in different Windows Services on different machines", what exactly does that mean? - dbustosp

2 Answers

1
votes

I think you are misunderstanding the relationship between partitions and consumers. Within a consumer group, each partition is read by exactly one consumer, although a single consumer may read from several partitions.

The images below show the relationship between consumers and partitions. They are taken from Kafka: The Definitive Guide, which I highly recommend reading, especially Chapter 4: Kafka Consumers.

[Image: exactly one-to-one relationship between consumers and partitions]

The image below shows one consumer reading from multiple partitions. If a new consumer joins the group, the load is rebalanced so that each of the two consumers reads from its own share of the partitions.

[Image: one consumer reading from multiple partitions]

The last image shows what happens when the number of consumers is greater than the number of partitions: every consumer beyond the partition count is idle (one consumer, in the picture).

[Image: more consumers than partitions, with the surplus consumer idle]

> partition.assignment.strategy

Remember that consumers belong to a consumer group. This setting selects the strategy used to assign the topic's partitions to the consumers in that group. Two strategies are available by default: Range and RoundRobin.

0
votes

Within a group, only one consumer can be assigned to a given partition (and receive its messages).

If your consumers (all in the same group) are only subscribed to a topic with a single partition, only one of them will receive messages from it. All other consumers will be idle, ready to take over in case the assigned consumer terminates or crashes or more partitions are created.

The roundrobin configuration is for the partition assignment, not for messages.
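
To make the difference concrete, here is a minimal Python sketch of round-robin partition assignment for a single topic. It is a simplified model for illustration only, not the code Kafka or librdkafka actually runs; the function name `assign_round_robin` and the consumer names are hypothetical.

```python
from typing import Dict, List


def assign_round_robin(partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Deal partitions 0..partitions-1 out to the group members in turn.

    Simplified model: one topic, every member subscribed to it.
    """
    if not consumers:
        raise ValueError("a consumer group needs at least one member")

    # Sort the members so the result does not depend on join order.
    members = sorted(consumers)
    assignment: Dict[str, List[int]] = {member: [] for member in members}

    # Each partition gets exactly one owner within the group.
    for partition in range(partitions):
        owner = members[partition % len(members)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    # One consumer owns every partition.
    print(assign_round_robin(4, ["consumer-a"]))
    # Two consumers split the partitions between them.
    print(assign_round_robin(4, ["consumer-a", "consumer-b"]))
    # One partition, three consumers: two members end up with nothing.
    print(assign_round_robin(1, ["consumer-a", "consumer-b", "consumer-c"]))
```

In this model the single-partition case leaves two of the three consumers with an empty assignment, which matches the idle consumers described above. What gets distributed is partitions, never individual messages.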

What you've described seeing does not sound right.

Are you sure the consumers are all in the same group? You can check the group's state as Kafka sees it by running:

/bin/kafka-consumer-groups.sh --zookeeper ZOOKEEPER --describe --group MyConsumerGroupId