1
votes

If I create

  • 2 Kafka consumer instances
  • passing the same properties (apart from the group ID)
  • subscribed to the same topic

Will these 2 consumer instances (with different group IDs) get the same partition assignment, or could it differ?

i.e., if I call .assignment() on each, will I get the same result from both?


My actual problem statement, where I will be using this validation:

In my application, I obtain the current offset from the broker at a particular point in time (this is done through my 1st Kafka consumer object).

Later, I create a 2nd Kafka consumer object and use it to iterate over the topic, seeking from the offset obtained earlier.

(So if the assumption in the question is false, my logic will fail.)

2
I still can't understand what you want to achieve. Parallel processing? Manually setting the starting offset? Changing it dynamically? - Bartosz Wardziński
@BartoszWardziński Fetch the last offset of the topic at some stage of my application. At a later stage (by which time more messages will probably have been published to the topic), read the new entries from this saved offset. - Aditya Rewari
As written in one of the answers, you can use group management for that; it will be achieved automatically, for free. - Bartosz Wardziński

2 Answers

1
votes

Let me clear this up.

Kafka has topics, and consumers can subscribe to them. Each topic has partitions (you set their number when you create the topic). Within a consumer group, each partition of a topic is assigned to exactly one consumer. If the group has more consumers than the topic has partitions, the extra consumers stay idle.

If you want your two Kafka consumers to each consume all messages independently, you have to put them in two different consumer groups. If a consumer group contains a single Kafka consumer, all the partitions are assigned to that consumer.

So if you want to get the same result for the two consumers, put them in two different consumer groups.

1
votes

Let's say the topic you subscribe to has 10 partitions, and both consumers use the same group ID. When you create the first consumer object and start polling, all 10 partitions will be assigned to this consumer object.

When you create the second consumer object, the consumer group coordinator notices that another consumer has joined the group, and a rebalance is triggered. Depending on the partition assignment strategy used, some of the partitions will be assigned to the second consumer. In the default case, 5 of the partitions are taken from the first consumer and assigned to the second consumer, so each consumer ends up with 5 partitions.

So the partition structure changes after you create the second consumer and it performs its first poll.
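The split described above can be illustrated with a small simulation. This is only a sketch in plain Java, not Kafka's own code: the class `RangeSplitSketch`, the method `split`, and the consumer names `c1`, `c2`, `c3` are hypothetical, and the logic assumes a range-style strategy in which partitions are divided into contiguous blocks and any remainder goes to the first consumers in sorted order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RangeSplitSketch {

    // Hypothetical helper: splits partitions 0..partitionCount-1 of one topic
    // among the consumers of ONE group, range-style (contiguous blocks).
    static Map<String, List<Integer>> split(int partitionCount, List<String> consumers) {
        List<String> sorted = new ArrayList<>(consumers);
        Collections.sort(sorted); // assignment order follows the sorted member names

        int perConsumer = partitionCount / sorted.size();
        int extra = partitionCount % sorted.size(); // the first 'extra' consumers get one more

        Map<String, List<Integer>> result = new LinkedHashMap<>();
        int next = 0;
        for (int i = 0; i < sorted.size(); i++) {
            int count = perConsumer + (i < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int p = 0; p < count; p++) {
                owned.add(next++);
            }
            result.put(sorted.get(i), owned);
        }
        return result;
    }

    public static void main(String[] args) {
        // One consumer in the group: it owns all 10 partitions.
        System.out.println(split(10, List.of("c1")));
        // A second consumer joins the same group: 5 partitions each.
        System.out.println(split(10, List.of("c1", "c2")));
        // A group of three: the layout differs from the group of two.
        System.out.println(split(10, List.of("c1", "c2", "c3")));
    }
}
```

Under these assumptions, a lone consumer owns partitions 0 to 9, two consumers own 0 to 4 and 5 to 9, and three consumers own 0 to 3, 4 to 6 and 7 to 9. Each group is split independently, so two groups with a single consumer each both see all 10 partitions.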

EDIT1: If you call .assignment() on both consumers (each in its own group, as in the question) after both have started consuming, you will get the same result.

EDIT2: If you have two different consumer group IDs and there is only 1 consumer in each group, then yes, the partition structure would be the same.

If each group has multiple consumers, but both groups have the same number of consumers (say 3 consumers in each group) and use the same partition assignment strategy, then the partition structure would be the same.

If the two groups have different numbers of consumers (say the first group has 2 consumers and the second has 3), then, as you can guess, the partition structure would be different.