0 votes

We have a five-node Kafka cluster running in production with 3 ZooKeeper nodes - all are VMs. We have to restart the cluster often for hardware patching.

We have written an Ansible script to shut down the cluster in the following order:

  1. Stop Kafka Connect (nodes 1, 2, 3 sequentially) by killing the process
  2. Stop Kafka (nodes 1, 2, 3, 4, 5 sequentially) using kafka-server-stop.sh
  3. Stop ZooKeeper (nodes 1, 2, 3 sequentially) using zookeeper-server-stop.sh

After patching, the start script does the following:

  1. Start ZooKeeper (nodes 1, 2, 3 sequentially) using zookeeper-server-start.sh
  2. Start Kafka (nodes 1, 2, 3, 4, 5 sequentially) using kafka-server-start.sh
  3. Start Kafka Connect (nodes 1, 2, 3 sequentially) using connect-distributed.sh

The issue is with step #3 of the start script: we have a hard-coded delay of about 10 minutes before starting Kafka Connect, to make sure the Kafka cluster is fully up and running. But sometimes some of the nodes take longer to start, so the Kafka Connect startup fails even after the delay. In that case we have to wait another 30 minutes and restart Connect manually.

Is there any way to make sure that all nodes in the cluster are up and running before I start the other processes?

Thanks in advance.

A hard-coded delay does not work; we can't keep changing the delay based on assumptions.


2 Answers

1 vote

Once all brokers have been started, we can use the following commands to check whether they have formed a cluster:

  • From kafka-1, run the following command against each of the other brokers, i.e. i = 2, 3, 4 and 5:

    • nc -vz kafka-i 9092 (it should report that the connection succeeded)
  • Tail the server.log on each broker node. It should log information about the cluster.

  • From the Kafka bin directory, you can periodically run ./zookeeper-shell.sh zk_host:zk_port and execute ls /brokers/ids. It should give you five entries, e.g. [0, 1, 2, 3, 4], once all 5 brokers have registered with ZooKeeper (a sketch of automating this check follows below).

One dirty (but less involved) hack might be to create a test topic with 5 partitions and wait until each broker gets one partition.
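Building on the /brokers/ids check, here is a minimal sketch of a readiness check your start script could run before launching Kafka Connect, written against the plain ZooKeeper Java client. The connect string, expected broker count, and poll interval are assumptions; adjust them for your cluster.

    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooKeeper;

    import java.util.List;

    public class WaitForBrokers {
        public static void main(String[] args) throws Exception {
            // Assumptions for illustration: ZooKeeper connect string and expected broker count
            String zkConnect = "zk-1:2181,zk-2:2181,zk-3:2181";
            int expectedBrokers = 5;

            // No-op watcher; we only poll the /brokers/ids znode
            ZooKeeper zk = new ZooKeeper(zkConnect, 10_000, event -> { });
            try {
                while (true) {
                    try {
                        // Every live broker registers an ephemeral znode under /brokers/ids
                        List<String> ids = zk.getChildren("/brokers/ids", false);
                        System.out.println("Registered brokers: " + ids);
                        if (ids.size() >= expectedBrokers) {
                            System.out.println("All brokers registered; safe to start Kafka Connect");
                            return;
                        }
                    } catch (KeeperException.NoNodeException e) {
                        // /brokers/ids does not exist until the first broker has started
                        System.out.println("No brokers registered yet");
                    }
                    Thread.sleep(5_000);   // poll every 5 seconds
                }
            } finally {
                zk.close();
            }
        }
    }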

1 vote

From the Java API, you can use AdminClient#describeCluster, which returns the list of nodes in the cluster currently known to the controller, as sketched below.


Also, do not do rolling broker restarts like that. You should first identify which broker is the controller, then shut that one down last.
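
A minimal sketch of such a check is below; the bootstrap server list and the expected broker count are assumptions for illustration. Note that the same DescribeClusterResult also exposes the current controller, which helps with ordering the rolling restart mentioned above.

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.DescribeClusterResult;
    import org.apache.kafka.common.Node;

    import java.util.Collection;
    import java.util.Properties;
    import java.util.concurrent.ExecutionException;

    public class ClusterReadyCheck {
        public static void main(String[] args) throws Exception {
            // Assumptions for illustration: bootstrap servers and expected broker count
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
                    "kafka-1:9092,kafka-2:9092,kafka-3:9092");
            int expectedBrokers = 5;

            try (AdminClient admin = AdminClient.create(props)) {
                while (true) {
                    try {
                        DescribeClusterResult cluster = admin.describeCluster();
                        Collection<Node> nodes = cluster.nodes().get();   // brokers currently in the cluster
                        Node controller = cluster.controller().get();     // handy for ordering rolling restarts
                        System.out.printf("Brokers up: %d, controller: %s%n", nodes.size(), controller);
                        if (nodes.size() >= expectedBrokers) {
                            break;   // all brokers visible; safe to start Kafka Connect
                        }
                    } catch (ExecutionException e) {
                        // The request fails while no broker is reachable yet; keep polling
                        System.out.println("Cluster not reachable yet: " + e.getCause());
                    }
                    Thread.sleep(5_000);   // poll every 5 seconds
                }
            }
        }
    }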