1
votes

Let's say my consumer is polling from a broker that has multiple topics, and each topic has multiple partitions. I have a total of 5 consumers in the same consumer group. If each of my consumers does a poll, in what order will the data be returned?

For example, my first consumer is assigned the following partitions:

topicA - partition 0

topicA - partition 1

topicB - partition 0

topicC - partition 3

topicD - partition 5

My question is: in a single poll, will I receive all available messages from one topic/partition before moving on to the next topic/partition? For example:

In a single poll, I receive the following, in this order:

Behavior A

topicA - partition 1 - received messages from offset 1000...2000

topicA - partition 0 - received messages from offset 500...700

topicB - partition 0 - received messages from offset 100...150

topicC - partition 3 - received messages from offset 5500...6000

topicD - partition 5 - received messages from offset 0...100

Or, in that single poll, is it possible to receive messages in the following order, where records from the same topic and partition are split up (topicA partition 1, topicA partition 0, and topicC partition 3)?

Behavior B

topicA - partition 1 - received messages from offset 1000...1499

topicA - partition 0 - received messages from offset 500...520

topicA - partition 1 - received messages from offset 1500...2000

topicB - partition 0 - received messages from offset 100...150 (same as behavior A, no split)

topicC - partition 3 - received messages from offset 5500...5799

topicA - partition 0 - received messages from offset 521...700

topicD - partition 5 - received messages from offset 0...100 (same as behavior A, no split)

topicC - partition 3 - received messages from offset 5800...6000

I want to know whether this behavior is guaranteed to consistently be A or B, or whether it can be configured. I have searched for this but couldn't find it anywhere in the docs or in previously asked questions. I have also tested it myself, and it seems to always be behavior A, but I want to confirm it. Thanks in advance for any help.

1 Answer

0
votes

Unfortunately, with multiple partitions, the ordering of messages across partitions is not preserved. From the Apache Kafka documentation (https://kafka.apache.org/082/documentation/):

Kafka only provides a total order over messages within a partition, not between different partitions in a topic. Per-partition ordering combined with the ability to partition data by key is sufficient for most applications. However, if you require a total order over messages this can be achieved with a topic that has only one partition, though this will mean only one consumer process.
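
Since only per-partition order is documented as guaranteed, one defensive approach is to not depend on how partitions are interleaved within a single poll, and instead group the returned records by topic and partition before processing them. Below is a minimal sketch of that idea in plain Python. It does not use a Kafka client: `Record`, `group_by_partition` and `is_ordered_within_partitions` are hypothetical names made up for illustration, and the batch is hand-written sample data in the spirit of "Behavior B", not real consumer output.

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    # Hypothetical stand-in for a consumed message; not a Kafka client class.
    topic: str
    partition: int
    offset: int


def group_by_partition(batch):
    """Group one poll's records by (topic, partition), keeping arrival order."""
    grouped = defaultdict(list)
    for record in batch:
        grouped[(record.topic, record.partition)].append(record.offset)
    return dict(grouped)


def is_ordered_within_partitions(batch):
    """Return True if offsets strictly increase inside every (topic, partition)."""
    return all(
        all(a < b for a, b in zip(offsets, offsets[1:]))
        for offsets in group_by_partition(batch).values()
    )


# Hand-written, interleaved sample batch (offsets shortened for readability).
batch_b = [
    Record("topicA", 1, 1000),
    Record("topicA", 1, 1001),
    Record("topicA", 0, 500),
    Record("topicA", 1, 1002),
    Record("topicC", 3, 5500),
    Record("topicA", 0, 501),
]

grouped = group_by_partition(batch_b)
print(grouped[("topicA", 1)])                  # [1000, 1001, 1002]
print(is_ordered_within_partitions(batch_b))   # True
```

With the records grouped this way, processing gives the same per-partition result whether a poll looks like behavior A or behavior B, so the code relies only on the ordering guarantee quoted above.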