0 votes

I have a use case where I want to set up a Kafka cluster. Initially I have 1 Kafka broker (A) and 1 ZooKeeper node. Below are my queries:

  • On adding a new Kafka broker (B) to the cluster, will all the data present on broker A be distributed automatically? If not, what do I need to do to distribute the data?

  • Now let's suppose the case above is solved and my data is distributed across both brokers. Then, due to some maintenance issue, I want to take down server B.

    • How do I transfer the data of broker B to the already existing broker A, or to a new broker C?
  • How can I increase the replication factor of my topics at runtime?

  • How can I change the ZooKeeper IPs present in the Kafka broker config at runtime, without restarting Kafka?

  • How can I dynamically change the Kafka configuration at runtime?

Regarding the Kafka client:

  • Do I need to specify every Kafka broker's IP to the Kafka client for the connection? And every time a broker is added or removed, do I need to add or remove its IP from the Kafka client connection string? That would always require restarting my producers and consumers.

Note:

Kafka Version: 2.0.0
Zookeeper: 3.4.9
Broker size: 2 cores, 8 GB RAM (4 GB for Kafka and 4 GB for the OS)
Did you try kafka.apache.org/documentation/#operations? Lots of this is covered there. – Robin Moffatt
Yes, I tried that; even after reading it, these points were not clear. – Abhimanyu
If you need to change a broker property such as the ZooKeeper string, you need to bounce it. Not everything can be done dynamically or at runtime. – OneCricketeer

2 Answers

1 vote

To run a topic on a single Kafka broker, it will have a replication factor of 1, set when creating that topic (explicitly, or implicitly via default.replication.factor). This means that the topic's partitions will stay on a single broker, even after increasing the number of brokers.

You will have to increase the number of replicas as described in the Kafka documentation. You will also have to make sure that the internal __consumer_offsets topic has enough replicas. This starts the replication process; eventually the original broker will be the leader of every topic partition, and the other broker will be the follower, fully caught up. You can use kafka-topics.sh --describe to check that every partition has both brokers in the ISR (in-sync replicas).
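
For reference, a minimal sketch of that flow on Kafka 2.0 (the topic name my-topic, the broker IDs 0 and 1, and the ZooKeeper address are placeholders for this example):

cat > increase-replication.json <<'EOF'
{"version":1,
 "partitions":[
   {"topic":"my-topic","partition":0,"replicas":[0,1]},
   {"topic":"my-topic","partition":1,"replicas":[0,1]}
 ]}
EOF

# start the reassignment, then re-run with --verify until it reports completion
bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
  --reassignment-json-file increase-replication.json --execute
bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
  --reassignment-json-file increase-replication.json --verify

# confirm that both brokers appear in the ISR of every partition
bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic my-topic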

Once that is done you should be able to take the original broker offline, and Kafka will elect the new broker as the leader of every topic partition. Don't forget to update the clients so they are aware of the new broker as well, in case a client needs to restart while the original broker is down (otherwise it won't find the cluster).
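
As a quick client-side illustration (the broker addresses broker-a:9092 and broker-b:9092 and the topic name are placeholders), listing both brokers in the connection string lets a restarting client still bootstrap when one of them is down:

# console producer/consumer shipped with the Kafka 2.0 distribution
bin/kafka-console-producer.sh --broker-list broker-a:9092,broker-b:9092 --topic my-topic
bin/kafka-console-consumer.sh --bootstrap-server broker-a:9092,broker-b:9092 --topic my-topic --from-beginning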

0 votes

Here are the answers in brief:

  • The data already present on broker A is not moved to broker B automatically when B joins the cluster; only new topics/partitions can be placed on B. To spread the existing data across both brokers you have to run a partition reassignment, as described below.

  • You can set up three brokers A, B and C, so that if A fails then B and C will take over, and if B also fails then C will take over, and so on.

  • To increase the replication factor of a topic, you can create increase-replication-factor.json with the content below and then feed it to kafka-reassign-partitions.sh, as shown after the file:

    {"version":1, "partitions":[ {"topic":"signals","partition":0,"replicas":[0,1,2]}, {"topic":"signals","partition":1,"replicas":[0,1,2]}, {"topic":"signals","partition":2,"replicas":[0,1,2]} ]}

Note that adding partitions to a topic is a separate operation from increasing the replication factor. To increase the number of partitions of an existing topic (say, from 2 to 3), run:

bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic topic-to-increase --partitions 3

ZooKeeper itself is configured through its zoo.cfg file, which is where you add the server IPs and other ZooKeeper-related configuration. On the Kafka side, however, the zookeeper.connect property in the broker config is not a dynamic setting, so changing it requires a broker restart.
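
As a rough sketch of what such a zoo.cfg looks like (hostnames, ports and the data directory are placeholders), a three-node ZooKeeper ensemble is described with one server.N line per node; since ZooKeeper 3.4.x has no dynamic reconfiguration, the servers have to be restarted after editing this file:

# zoo.cfg on every ZooKeeper node (values are example placeholders)
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888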