A Kafka broker was gracefully shut down, and incorrect metadata was passed to the Kafka Connect client

Question

For server maintenance, one of our 20 brokers was shut down gracefully, but the entire Kafka Connect (sink) cluster died with the following NPE. Every topic had a replication factor greater than 2; there were 50 topics and 200 partitions. After examining the error and the Kafka library source code, it appears that the error occurred while the Connect client was caching the metadata received from the broker, which includes the set of broker node IDs and the set of partition info.
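To make the suspected failure mode concrete, here is a toy model in Python. It is not Kafka's actual code: `Node`, `PartitionMeta`, `build_cache`, and both lookup helpers are hypothetical names invented for illustration. The assumption is that the metadata lists a replica whose broker ID is no longer in the node set, so resolving that ID yields a null entry that later code dereferences.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a broker entry in the metadata."""
    node_id: int
    host: str


@dataclass
class PartitionMeta:
    """Hypothetical stand-in for one partition's metadata."""
    topic: str
    partition: int
    leader_id: int
    replica_ids: Tuple[int, ...]


Cache = Dict[Tuple[str, int], List[Optional[Node]]]


def build_cache(nodes: List[Node], partitions: List[PartitionMeta]) -> Cache:
    """Naively resolve every replica ID to a Node while caching."""
    nodes_by_id = {n.node_id: n for n in nodes}
    cache: Cache = {}
    for p in partitions:
        # dict.get returns None for an ID missing from the node set,
        # e.g. a broker that has just shut down (assumed scenario).
        cache[(p.topic, p.partition)] = [nodes_by_id.get(r) for r in p.replica_ids]
    return cache


def replica_hosts(cache: Cache, topic: str, partition: int) -> List[str]:
    """Unsafe lookup: raises AttributeError (Python's analogue of an NPE)
    if any replica resolved to None."""
    return [n.host for n in cache[(topic, partition)]]


def replica_hosts_safe(cache: Cache, topic: str, partition: int) -> List[str]:
    """Defensive lookup: skips replicas whose broker is unknown."""
    return [n.host for n in cache[(topic, partition)] if n is not None]
```

In this sketch, a partition with replicas `(1, 2, 3)` and a node set containing only brokers 1 and 2 makes `replica_hosts` fail, while `replica_hosts_safe` returns the two known hosts. Whether the real client hits this exact path is something only the stack trace can confirm.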

How can this happen, and how can we deal with it in the future? (Both the broker and the client are version 2.3.1.)

Mickael Maison · Accepted Answer · 2020-05-25T11:09:15

This is a bug. The Connect cluster should not be negatively impacted by a broker shutting down and it should not throw an NPE.

Please open a ticket at https://issues.apache.org/jira/projects/KAFKA/issues/. It's also best if you paste the stack trace as text instead of an image.
