2
votes

In the Kafka documentation:

Kafka handles this differently. Our topic is divided into a set of totally ordered partitions, each of which is consumed by one consumer at any given time. This means that the position of consumer in each partition is just a single integer, the offset of the next message to consume. This makes the state about what has been consumed very small, just one number for each partition. This state can be periodically checkpointed. This makes the equivalent of message acknowledgements very cheap.

Yet, following their quick start guide in that same document, I was easily able to:

  1. Create a topic with a single partition
  2. Start a console-producer
  3. Push a few messages
  4. Start a consumer to consume --from-beginning
  5. Start another consumer --from-beginning

And have both consumers successfully consume from the same partition.

But this seems at odds with the documentation above?

1 Answer

4
votes

Consumers that belong to different consumer groups can read the same partition without any problem. You can think of each group id as a separate application consuming the topic: several applications may want to use the same data in different ways without interfering with one another, so each group tracks its own offsets. The statement you quoted applies within a single consumer group, where each partition is consumed by only one consumer at a time. Using different groups is in fact the only way two consumers can consume the same partition.

When you start a console consumer without specifying a group, it generates a random group id for itself, so your two consumers ended up in different groups and did exactly what I described above.
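
To make the idea concrete, here is a minimal sketch in plain Python. It is a toy model, not the Kafka client API: `ToyBroker`, `poll`, and the group ids are hypothetical names made up for illustration. It keeps one offset per group for a single partition, which is roughly why two console consumers with different group ids both see every message.

```python
from collections import defaultdict


class ToyBroker:
    """Toy model of one topic with a single partition (not real Kafka)."""

    def __init__(self):
        # The partition is an append-only, totally ordered log.
        self.log = []
        # One integer per consumer group: the offset of the next message to read.
        self.offsets = defaultdict(int)

    def append(self, message):
        self.log.append(message)

    def poll(self, group_id):
        """Return the messages this group has not read yet and advance its offset."""
        start = self.offsets[group_id]
        messages = self.log[start:]
        self.offsets[group_id] = len(self.log)
        return messages


broker = ToyBroker()
for message in ["a", "b", "c"]:
    broker.append(message)

# Two consumers with different (made-up) group ids each read from offset 0.
first = broker.poll("console-consumer-111")
second = broker.poll("console-consumer-222")

# Polling again with the first group id returns nothing new,
# because that group's offset has already moved past the end of the log.
again = broker.poll("console-consumer-111")
```

In this model the only state kept per group is a single number for the partition, matching the quoted documentation. Real Kafka additionally assigns each partition to exactly one member of a group, which the sketch leaves out.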