I have a 3-member Kafka cluster, and the __consumer_offsets topic has 50 partitions.
The following is the output of the describe command:
root@kafka-cluster-0:~# kafka-topics.sh --zookeeper localhost:2181 --describe
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:1 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
Topic: __consumer_offsets Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 1 Leader: -1 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 2 Leader: 0 Replicas: 0 Isr: 0
Topic: __consumer_offsets Partition: 3 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 4 Leader: -1 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 5 Leader: 0 Replicas: 0 Isr: 0
...
...
The members are nodes 0, 1 and 2.
As you can see, the partitions whose only replica is broker 2 have no leader elected for them; their leader is -1.
I'm wondering what caused this. I did restart the Kafka service on the 2nd member, but I never thought it would have this side effect.
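In case it helps narrow things down, I believe kafka-topics.sh can also list just the partitions that currently have no available leader (I haven't double-checked this flag's behaviour on 1.1.0):
# should print only the partitions whose leader is unavailable (leader: -1)
kafka-topics.sh --zookeeper localhost:2181 --describe --unavailable-partitions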
Also, all nodes have now been up for hours. This is the result of ls /brokers/ids:
/home/kafka/bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is disabled
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[0, 1, 2]
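So broker 2 is registered. My understanding is that partition leaders are assigned by the controller broker, so I was also going to inspect the controller znode and broker 2's registration entry (the znode paths below are my best guess):
# which broker is currently the controller; should print JSON containing "brokerid"
/home/kafka/bin/zookeeper-shell.sh localhost:2181 <<< "get /controller"
# broker 2's registration details (endpoints, host, port, timestamp)
/home/kafka/bin/zookeeper-shell.sh localhost:2181 <<< "get /brokers/ids/2"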
Also, there are many topics in the cluster, and node 2 is not the leader for any of them. Wherever node 2 is the only replica (replication factor 1 and the partition hosted on this node), the leader is -1, as shown below.
Here, node 2 is in the ISR but never a leader, since the replication factor is 2:
Topic:upstream-t2 PartitionCount:20 ReplicationFactor:2 Configs:retention.ms=172800000,retention.bytes=536870912
Topic: upstream-t2 Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic: upstream-t2 Partition: 1 Leader: 0 Replicas: 2,0 Isr: 0
Topic: upstream-t2 Partition: 2 Leader: 0 Replicas: 0,1 Isr: 0
Topic: upstream-t2 Partition: 3 Leader: 0 Replicas: 1,0 Isr: 0
Topic: upstream-t2 Partition: 4 Leader: 1 Replicas: 2,1 Isr: 1,2
Topic: upstream-t2 Partition: 5 Leader: 0 Replicas: 0,2 Isr: 0
Topic: upstream-t2 Partition: 6 Leader: 1 Replicas: 1,2 Isr: 1,2
Here, node 2 is the only replica that some of the partitions are hosted on, yet their leader is -1:
Topic:upstream-t20 PartitionCount:10 ReplicationFactor:1 Configs:retention.ms=172800000,retention.bytes=536870912
Topic: upstream-t20 Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: upstream-t20 Partition: 1 Leader: -1 Replicas: 2 Isr: 2
Topic: upstream-t20 Partition: 2 Leader: 0 Replicas: 0 Isr: 0
Topic: upstream-t20 Partition: 3 Leader: 1 Replicas: 1 Isr: 1
Topic: upstream-t20 Partition: 4 Leader: -1 Replicas: 2 Isr: 2
Any help with fixing the leader not being elected is greatly appreciated.
It would also be great to know what implications this might have on how my brokers behave.
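For what it's worth, the only remedy I've come across so far is triggering a preferred replica election, though I'm not sure it can help when a partition's only replica is broker 2 and its leader is -1:
# ask the controller to move leadership back to each partition's preferred (first-listed) replica
kafka-preferred-replica-election.sh --zookeeper localhost:2181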
EDIT ---
Kafka version: 1.1.0 (2.12-1.1.0). Disk space is available, around 800 GB free. The log files look pretty normal; below is the tail of the log file on node 2. Please let me know if there's anything in particular I should look for.
[2018-12-18 10:31:43,828] INFO [Log partition=upstream-t14-1, dir=/var/lib/kafka] Rolled new log segment at offset 79149636 in 2 ms. (kafka.log.Log)
[2018-12-18 10:32:03,622] INFO Updated PartitionLeaderEpoch. New: {epoch:10, offset:6435}, Current: {epoch:8, offset:6386} for Partition: upstream-t41-8. Cache now contains 7 entries. (kafka.server.epoch.LeaderEpochFileCache)
[2018-12-18 10:32:03,693] INFO Updated PartitionLeaderEpoch. New: {epoch:10, offset:6333}, Current: {epoch:8, offset:6324} for Partition: upstream-t41-3. Cache now contains 7 entries. (kafka.server.epoch.LeaderEpochFileCache)
[2018-12-18 10:38:38,554] INFO [GroupMetadataManager brokerId=2] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2018-12-18 10:40:04,831] INFO Updated PartitionLeaderEpoch. New: {epoch:10, offset:6354}, Current: {epoch:8, offset:6340} for Partition: upstream-t41-9. Cache now contains 7 entries. (kafka.server.epoch.LeaderEpochFileCache)
[2018-12-18 10:48:38,554] INFO [GroupMetadataManager brokerId=2] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2018-12-18 10:58:38,554] INFO [GroupMetadataManager brokerId=2] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2018-12-18 11:05:50,770] INFO [ProducerStateManager partition=upstream-t4-17] Writing producer snapshot at offset 3086815 (kafka.log.ProducerStateManager)
[2018-12-18 11:05:50,772] INFO [Log partition=upstream-t4-17, dir=/var/lib/kafka] Rolled new log segment at offset 3086815 in 2 ms. (kafka.log.Log)
[2018-12-18 11:07:16,634] INFO [ProducerStateManager partition=upstream-t4-11] Writing producer snapshot at offset 3086497 (kafka.log.ProducerStateManager)
[2018-12-18 11:07:16,635] INFO [Log partition=upstream-t4-11, dir=/var/lib/kafka] Rolled new log segment at offset 3086497 in 1 ms. (kafka.log.Log)
[2018-12-18 11:08:15,803] INFO [ProducerStateManager partition=upstream-t4-5] Writing producer snapshot at offset 3086616 (kafka.log.ProducerStateManager)
[2018-12-18 11:08:15,804] INFO [Log partition=upstream-t4-5, dir=/var/lib/kafka] Rolled new log segment at offset 3086616 in 1 ms. (kafka.log.Log)
[2018-12-18 11:08:38,554] INFO [GroupMetadataManager brokerId=2] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
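These lines are from the main server log; if it's more useful I can also pull the tail of controller.log and state-change.log (the paths below assume the default log4j layout, which may differ on my setup):
# leader election / controller activity is logged separately from server.log
tail -n 50 /home/kafka/logs/controller.log
tail -n 50 /home/kafka/logs/state-change.log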
Edit 2 ----
Well, I stopped the leader ZooKeeper instance, and the 2nd ZooKeeper instance was elected as the leader! With that, the missing-leader issue is now resolved!
I don't know what went wrong in the first place, though, so any idea about why changing the ZooKeeper leader fixes the unelected-leader issue is very much welcome!
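In case it's relevant, the ZooKeeper leader/follower role on each node can be checked with the stat four-letter command (assuming four-letter commands are enabled on this ZooKeeper version):
# prints "Mode: leader" or "Mode: follower" for the local ZooKeeper instance
echo stat | nc localhost 2181 | grep Mode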
Thanks!