I have a 3-member Kafka cluster, and the __consumer_offsets topic has 50 partitions.
The following is the output of the describe command:
root@kafka-cluster-0:~# kafka-topics.sh --zookeeper localhost:2181 --describe
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:1 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
Topic: __consumer_offsets Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 1 Leader: -1 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 2 Leader: 0 Replicas: 0 Isr: 0
Topic: __consumer_offsets Partition: 3 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 4 Leader: -1 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 5 Leader: 0 Replicas: 0 Isr: 0
...
...
The members are nodes 0, 1 and 2.
As you can see, the partitions whose only replica is broker 2 have no leader elected for them; their leader is -1.
I'm wondering what caused this. I did restart the Kafka service on the 2nd member, but I never thought it would have this side effect.
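In case it helps narrow things down, I believe kafka-topics.sh can also list just the partitions that currently have no available leader (I haven't double-checked this flag's behaviour on 1.1.0):
# should print only the partitions whose leader is unavailable (leader: -1)
kafka-topics.sh --zookeeper localhost:2181 --describe --unavailable-partitions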
Also, all nodes have now been up for hours. This is the result of ls /brokers/ids:
/home/kafka/bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is disabled
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[0, 1, 2]
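So broker 2 is registered. My understanding is that partition leaders are assigned by the controller broker, so I was also going to inspect the controller znode and broker 2's registration entry (the znode paths below are my best guess):
# which broker is currently the controller; should print JSON containing "brokerid"
/home/kafka/bin/zookeeper-shell.sh localhost:2181 <<< "get /controller"
# broker 2's registration details (endpoints, host, port, timestamp)
/home/kafka/bin/zookeeper-shell.sh localhost:2181 <<< "get /brokers/ids/2"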
Also, there are many topics in the cluster, and node 2 is not the leader for any of them. Wherever node 2 is the only replica (replication factor 1 and the partition hosted on this node), the leader is -1, as shown below.
Here, node 2 is in the ISR but never a leader, since the replication factor is 2:
Topic:upstream-t2 PartitionCount:20 ReplicationFactor:2 Configs:retention.ms=172800000,retention.bytes=536870912
Topic: upstream-t2 Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic: upstream-t2 Partition: 1 Leader: 0 Replicas: 2,0 Isr: 0
Topic: upstream-t2 Partition: 2 Leader: 0 Replicas: 0,1 Isr: 0
Topic: upstream-t2 Partition: 3 Leader: 0 Replicas: 1,0 Isr: 0
Topic: upstream-t2 Partition: 4 Leader: 1 Replicas: 2,1 Isr: 1,2
Topic: upstream-t2 Partition: 5 Leader: 0 Replicas: 0,2 Isr: 0
Topic: upstream-t2 Partition: 6 Leader: 1 Replicas: 1,2 Isr: 1,2
Here, node 2 is the only replica that some of the partitions are hosted on, yet their leader is -1:
Topic:upstream-t20 PartitionCount:10 ReplicationFactor:1 Configs:retention.ms=172800000,retention.bytes=536870912
Topic: upstream-t20 Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: upstream-t20 Partition: 1 Leader: -1 Replicas: 2 Isr: 2
Topic: upstream-t20 Partition: 2 Leader: 0 Replicas: 0 Isr: 0
Topic: upstream-t20 Partition: 3 Leader: 1 Replicas: 1 Isr: 1
Topic: upstream-t20 Partition: 4 Leader: -1 Replicas: 2 Isr: 2
Any help with fixing the leader not being elected is greatly appreciated.
It would also be great to know what implications this might have on how my brokers behave.
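For what it's worth, the only remedy I've come across so far is triggering a preferred replica election, though I'm not sure it can help when a partition's only replica is broker 2 and its leader is -1:
# ask the controller to move leadership back to each partition's preferred (first-listed) replica
kafka-preferred-replica-election.sh --zookeeper localhost:2181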
EDIT ---
Kafka version: 1.1.0 (2.12-1.1.0). Disk space is available, around 800 GB free. The log files look pretty normal; below is the tail of the log file on node 2. Please let me know if there's anything in particular I should look for.
[2018-12-18 10:31:43,828] INFO [Log partition=upstream-t14-1, dir=/var/lib/kafka] Rolled new log segment at offset 79149636 in 2 ms. (kafka.log.Log)
[2018-12-18 10:32:03,622] INFO Updated PartitionLeaderEpoch. New: {epoch:10, offset:6435}, Current: {epoch:8, offset:6386} for Partition: upstream-t41-8. Cache now contains 7 entries. (kafka.server.epoch.LeaderEpochFileCache)
[2018-12-18 10:32:03,693] INFO Updated PartitionLeaderEpoch. New: {epoch:10, offset:6333}, Current: {epoch:8, offset:6324} for Partition: upstream-t41-3. Cache now contains 7 entries. (kafka.server.epoch.LeaderEpochFileCache)
[2018-12-18 10:38:38,554] INFO [GroupMetadataManager brokerId=2] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2018-12-18 10:40:04,831] INFO Updated PartitionLeaderEpoch. New: {epoch:10, offset:6354}, Current: {epoch:8, offset:6340} for Partition: upstream-t41-9. Cache now contains 7 entries. (kafka.server.epoch.LeaderEpochFileCache)
[2018-12-18 10:48:38,554] INFO [GroupMetadataManager brokerId=2] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2018-12-18 10:58:38,554] INFO [GroupMetadataManager brokerId=2] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2018-12-18 11:05:50,770] INFO [ProducerStateManager partition=upstream-t4-17] Writing producer snapshot at offset 3086815 (kafka.log.ProducerStateManager)
[2018-12-18 11:05:50,772] INFO [Log partition=upstream-t4-17, dir=/var/lib/kafka] Rolled new log segment at offset 3086815 in 2 ms. (kafka.log.Log)
[2018-12-18 11:07:16,634] INFO [ProducerStateManager partition=upstream-t4-11] Writing producer snapshot at offset 3086497 (kafka.log.ProducerStateManager)
[2018-12-18 11:07:16,635] INFO [Log partition=upstream-t4-11, dir=/var/lib/kafka] Rolled new log segment at offset 3086497 in 1 ms. (kafka.log.Log)
[2018-12-18 11:08:15,803] INFO [ProducerStateManager partition=upstream-t4-5] Writing producer snapshot at offset 3086616 (kafka.log.ProducerStateManager)
[2018-12-18 11:08:15,804] INFO [Log partition=upstream-t4-5, dir=/var/lib/kafka] Rolled new log segment at offset 3086616 in 1 ms. (kafka.log.Log)
[2018-12-18 11:08:38,554] INFO [GroupMetadataManager brokerId=2] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
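These lines are from the main server log; if it's more useful I can also pull the tail of controller.log and state-change.log (the paths below assume the default log4j layout, which may differ on my setup):
# leader election / controller activity is logged separately from server.log
tail -n 50 /home/kafka/logs/controller.log
tail -n 50 /home/kafka/logs/state-change.log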
Edit 2 ----
Well, I stopped the leader ZooKeeper instance, and the 2nd ZooKeeper instance was elected as the leader! With that, the missing-leader issue is now resolved!
I don't know what went wrong in the first place, though, so any idea about why changing the ZooKeeper leader fixes the unelected-leader issue is very much welcome!
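In case it's relevant, the ZooKeeper leader/follower role on each node can be checked with the stat four-letter command (assuming four-letter commands are enabled on this ZooKeeper version):
# prints "Mode: leader" or "Mode: follower" for the local ZooKeeper instance
echo stat | nc localhost 2181 | grep Mode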
Thanks!