
It sounds like replicas are not reassigned to other brokers when a broker fails. I created a simple test setup with 3 brokers and a topic with 13 partitions and a replication factor of 3.

I brought down one broker (broker 1) and saw that "ISR" and "Leader" were updated to reflect that (though the replica list still shows the id of the broker that was just shut down).

I then bootstrapped a brand new broker with id 4. At this point I assumed Kafka would create replicas of the above topic on this broker, but that doesn't seem to happen. Is there any reason why?

So why is Kafka designed NOT to create replicas on other available machines when one of the brokers holding them is down? It merely switches the leader.

PS: I understand from the documentation that replicas will NOT auto heal by themselves. But what is the reasoning behind this? The implicit assumption in a distributed system is that replicas are re-created on available machines to compensate for the ones that are unavailable.

Looking closely through the documentation, this statement:

The Kafka cluster will automatically detect any broker shutdown or failure and elect new leaders for the partitions on that machine.

confirms that Kafka will not create additional replicas when one or more brokers are down.

  1. What is the reasoning behind not creating the replica on any available machine?

  2. Will it not be created at all? If so, could the replica count end up different from the original count?

1 Answer


That's correct, by design Kafka does not "auto heal".

Moving replicas onto new brokers can be an expensive operation. A partition can contain terabytes of data, so copying it between brokers can add a huge load to the cluster. The bandwidth consumed by that copy would not be available to clients.

If you are using enough replicas, there is no user impact when a broker is down. Also, Kafka expects brokers to return after a failure. Getting the original broker back in sync when it returns is a much cheaper operation than bootstrapping a new broker from zero.
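To make the distinction between the replica list and the ISR concrete, here is a toy model in Python. It is only an illustration of the behaviour described in the question, not Kafka's actual implementation: the `Partition` class and its methods are made up for this example, and leader election is simplified to "first broker left in the ISR".

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    # Assigned broker ids. In this toy model, as in the question's test,
    # the list only changes through an explicit reassignment.
    replicas: list
    isr: list = field(init=False)
    leader: int = field(init=False)

    def __post_init__(self):
        self.isr = list(self.replicas)   # everyone starts in sync
        self.leader = self.replicas[0]

    def broker_down(self, broker):
        # The failed broker leaves the ISR but stays in `replicas`.
        if broker in self.isr:
            self.isr.remove(broker)
        if self.leader == broker:
            # Simplified election: pick the first remaining in-sync replica.
            self.leader = self.isr[0] if self.isr else -1

    def broker_back(self, broker):
        # Only a broker that is already an assigned replica can rejoin.
        if broker in self.replicas and broker not in self.isr:
            self.isr.append(broker)


p = Partition(replicas=[1, 2, 3])
p.broker_down(1)
after_failure = (list(p.replicas), list(p.isr), p.leader)

p.broker_back(4)   # a brand new broker: not in `replicas`, so nothing changes
after_new_broker = (list(p.replicas), list(p.isr), p.leader)

p.broker_back(1)   # the original broker returns and catches up
after_return = (list(p.replicas), list(p.isr), p.leader)
```

In this sketch, starting broker 4 changes nothing because it is not in the partition's replica list, which matches what the question observed.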

There are tools (like Cruise Control) that can "auto heal" Kafka automatically in some conditions. Also, if you expect some brokers to be down for a prolonged period, you can move partitions to other brokers to avoid missing a replica. The "Decommissioning brokers" section in the docs covers it.
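As a rough sketch of what moving partitions manually could look like for the setup in the question, the Python snippet below builds a reassignment plan that swaps the dead broker 1 for the new broker 4. The topic name `my-topic`, the helper `build_plan`, and the starting assignment are assumptions made up for illustration. The JSON layout (`version`, `partitions`, `topic`, `partition`, `replicas`) is the one the `kafka-reassign-partitions.sh` tool reads.

```python
import json


def build_plan(topic, current, dead_broker, new_broker):
    """Build a reassignment plan replacing dead_broker with new_broker.

    current maps partition number -> list of broker ids (hypothetical input;
    in practice you would read it from the output of `kafka-topics.sh --describe`).
    """
    partitions = []
    for partition, replicas in sorted(current.items()):
        if dead_broker not in replicas:
            continue  # this partition is unaffected
        if new_broker in replicas:
            raise ValueError(f"broker {new_broker} already hosts partition {partition}")
        # Swap in place so the replica order is otherwise preserved.
        new_replicas = [new_broker if b == dead_broker else b for b in replicas]
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": new_replicas}
        )
    return {"version": 1, "partitions": partitions}


# Assumed starting layout: 13 partitions, each replicated on brokers 1, 2 and 3.
current = {p: [(p + i) % 3 + 1 for i in range(3)] for p in range(13)}

plan = build_plan("my-topic", current, dead_broker=1, new_broker=4)
plan_json = json.dumps(plan, indent=2)
```

Saving `plan_json` to a file and passing it to `kafka-reassign-partitions.sh` with `--reassignment-json-file` and `--execute` would start the copy. Check the options against the docs for your Kafka version. The first broker in each `replicas` list is the preferred leader, so an in-place swap makes broker 4 the preferred leader wherever broker 1 was.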