
I have started learning Kafka and have run into some confusion about how it works. I am sharing those points here and hoping for clarification.

(1.) I created 3 brokers and a topic with 3 partitions and a replication factor of 3.


When I push a message, it is received by the leader of one of the partitions on a broker and is then passed on to that partition's replicas.

Which way is the message passed to the replicas? (approach-1, approach-2, or some other way)

  1. approach-1

  2. approach-2

(2.) If I create one broker with 3 partitions, then the message will be received by the leader partition and belongs to the leader. What, then, is the use of the other 2 partitions?


2 Answers

2
votes

which way is it used to pass a message with replicas?

It will be approach-1. If you write a message to partition 2 of a topic, that same message is replicated to the replicas of partition 2 on the other brokers. It is not copied into partition 0 or partition 1.
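To make that concrete, here is a minimal toy model in plain Python (no Kafka client involved). The `ToyCluster` class, its `produce` method, and the "partition p is led by broker p" placement are all made up for illustration, and the sketch assumes the question's setup where every broker holds one replica of every partition. Real followers fetch from the leader asynchronously; here the copy is done inline to keep it short.

```python
class ToyCluster:
    """Toy model: every broker holds one replica of every partition."""

    def __init__(self, num_brokers, num_partitions):
        # logs[broker_id][partition] -> list of messages stored in that replica
        self.logs = {
            b: {p: [] for p in range(num_partitions)}
            for b in range(num_brokers)
        }
        # Hypothetical leader placement: partition p is led by broker p % num_brokers.
        self.leader = {p: p % num_brokers for p in range(num_partitions)}

    def produce(self, partition, message):
        leader = self.leader[partition]
        # The write goes to the leader replica of this partition only.
        self.logs[leader][partition].append(message)
        # The other brokers copy the leader's log for the SAME partition.
        for broker, replicas in self.logs.items():
            if broker != leader:
                replicas[partition] = list(self.logs[leader][partition])


cluster = ToyCluster(num_brokers=3, num_partitions=3)
cluster.produce(2, "hello")

for broker, replicas in cluster.logs.items():
    print(broker, replicas)
```

After the call, partition 2 holds the message on all three brokers, while partitions 0 and 1 stay empty everywhere.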

If I create one broker with 3 partitions then the message will be received by the leader partition and it belongs to the leader then what is the use of the other 2 partitions?

There seems to be some confusion about the difference between "partitions" and "replicas". Those are two completely different things. I have written an answer about this in another post. The key points are:

"partitions": Data within a topic is split into partitions. Increasing the number of partitions will increase the parallelism and therefore the throughput of your application as you can have at most one consumer within a ConsumerGroup reading a partition.

"replication": A replicated partition contains exactly the same data of the leader. So the same message is stored multiple times. This ensures durability as the same message is located on different brokers. In case of a broker failure, Kafka can switch the leader and provide the replicated message to its clients. In case you have 3 partitions but only a replication factor of 1, then if that one broker goes down all your data (from all partitions) are gone.

0
votes

@mike gave a very good answer. For beginners who are confused by question 2, here is an explanation.

The concept of a partition leader only makes sense if you have multiple replicas. Leadership is across the replicas of one partition, not across partitions. In the question's visualization, leadership runs horizontally across brokers, not vertically across the partitions in one broker.

Another way to think of scenario 2 is that every partition without replication is its own leader. You have three non-replicated partitions, so in general each one is equally important.
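Here is a small sketch of that "horizontal" view. `build_layout` is a hypothetical helper, and its round-robin placement is an assumption made for readability, not Kafka's actual replica assignment algorithm.

```python
def build_layout(num_brokers, num_partitions, replication_factor):
    """Return partition -> {"leader": broker, "followers": [brokers]}."""
    if replication_factor > num_brokers:
        # Each replica of a partition has to live on a different broker.
        raise ValueError("replication factor cannot exceed the number of brokers")
    layout = {}
    for p in range(num_partitions):
        replicas = [(p + i) % num_brokers for i in range(replication_factor)]
        layout[p] = {"leader": replicas[0], "followers": replicas[1:]}
    return layout


# Scenario 2: one broker, three partitions, no replication.
print(build_layout(1, 3, 1))
# {0: {'leader': 0, 'followers': []}, 1: {'leader': 0, 'followers': []}, 2: {'leader': 0, 'followers': []}}

# Scenario 1: three brokers, three partitions, replication factor 3.
print(build_layout(3, 3, 3))
# {0: {'leader': 0, 'followers': [1, 2]}, 1: {'leader': 1, 'followers': [2, 0]}, 2: {'leader': 2, 'followers': [0, 1]}}
```

In scenario 2 every partition is a leader with no followers, so none of the three is "the" leader over the others. In scenario 1 each partition has one leader plus followers on the other brokers.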