1
votes

I'm new to Kafka and I'm trying to understand partitions. The general explanations online have not helped, so I want to work through a simple example.

Let's say we have:

  • 2 topics
    • Colours with 2 partitions
    • Numbers with 2 partitions
  • 1 broker
  • No replication

In this case, is the diagram below how the data will look? Here the data is written to partitions round robin, and both partitions take some of the data from both topics. If this is accurate, how do consumers get each next value? They would need to move from partition to partition, but they could then find data from a different topic.

(The diagram below is inaccurate!) [image]

UPDATE: Based on the comments, I think this is more accurate: [image]

3
Partitions are per topic. You seem to have mixed them up?

3 Answers

1
votes

Topics are divided into partitions, where each partition will only have data for a single topic.

From "Kafka: The Definitive Guide":

Topics are additionally broken down into a number of partitions

So your diagram should show two topics, each with its own two partitions; no partition is shared between topics.
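To make this concrete, here is a minimal toy model in plain Python of the setup from the question. It is not the Kafka client API: `ToyBroker`, `produce`, and `fetch` are hypothetical names, and strict round-robin assignment is assumed only because the question describes it (real producers may pick partitions differently). It shows that every record lives in exactly one (topic, partition) log, and that a consumer reads by keeping one offset per partition of the topic it subscribes to, so it can never run into another topic's data.

```python
from itertools import cycle


class ToyBroker:
    """Toy in-memory model of one broker; not the Kafka client API."""

    def __init__(self):
        # (topic, partition) -> append-only list of records
        self.logs = {}
        # topic -> iterator cycling over that topic's partition numbers
        self._next_partition = {}

    def create_topic(self, topic, num_partitions):
        for p in range(num_partitions):
            self.logs[(topic, p)] = []
        self._next_partition[topic] = cycle(range(num_partitions))

    def produce(self, topic, value):
        # Round robin over the partitions of this topic only (assumed).
        partition = next(self._next_partition[topic])
        log = self.logs[(topic, partition)]
        log.append(value)
        return partition, len(log) - 1  # (partition, offset)

    def fetch(self, topic, partition, offset):
        # Return the record at `offset`, or None if the reader is caught up.
        log = self.logs[(topic, partition)]
        return log[offset] if offset < len(log) else None


broker = ToyBroker()
broker.create_topic("Colours", 2)
broker.create_topic("Numbers", 2)

for colour in ["red", "green", "blue", "yellow"]:
    broker.produce("Colours", colour)
for number in [1, 2, 3, 4]:
    broker.produce("Numbers", number)

# A consumer of "Colours" keeps one position (offset) per partition.
positions = {0: 0, 1: 0}
consumed = []
for partition in positions:
    while (record := broker.fetch("Colours", partition, positions[partition])) is not None:
        consumed.append((partition, positions[partition], record))
        positions[partition] += 1

print(consumed)
# [(0, 0, 'red'), (0, 1, 'blue'), (1, 0, 'green'), (1, 1, 'yellow')]
```

Note that the consumer sees "red" before "blue" within partition 0, but there is no ordering between the two partitions: ordering holds only inside a single partition.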

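The partitions of multiple topics do end up side by side on the same broker's disk, but each partition is stored as its own log, so records from different topics are never interleaved within one partition (good explanation here).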
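As a rough illustration of the on-disk picture: to my knowledge, a broker keeps each partition in its own directory under its log directory, named `<topic>-<partition>`. The sketch below only generates those names for the setup in the question; `partition_dirs` is a hypothetical helper, not part of Kafka.

```python
def partition_dirs(topics):
    """Return one directory name per partition, in the form <topic>-<partition>.

    `topics` maps a topic name to its number of partitions.
    """
    return [f"{topic}-{p}" for topic, n in topics.items() for p in range(n)]


dirs = partition_dirs({"Colours": 2, "Numbers": 2})
print(dirs)
# ['Colours-0', 'Colours-1', 'Numbers-0', 'Numbers-1']
```

So with one broker and no replication, the example would give four separate partition logs on the same disk, two per topic.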

0
votes

Please have a look at this detailed explanation. It has some really good diagrams that show the entire architecture, from ZooKeeper to the broker structure.

For log compaction and other low-level design details, please go through these.

0
votes

Hope these images help you to understand. Sometimes images are better than words.

[image] [image] [image]