To maintain the server, one of the 20 brokers was shutdown gracefully, but all kafka-connect cluster (sink) died with the following NPE error. Replication-factor of all topics was more than 2, there were 50 topics and 200 partitions. Checking up the error and the Kafka library source code, it seems that the error occurred when the Connect client cached the metadata including the broker node id set and partition info set information received from the broker.
How can this happen, and how to deal with it in the future? (the Version of Broker and Client is v2.3.1)