1
votes

I've been struggling to set up a 3-node Kafka cluster. I have looked at all the SO answers and seem to be doing everything right. However, ZooKeeper fails to synchronize, and therefore the Kafka servers don't connect.

Here is my zookeeper config

dataDir=/home/kafka/zookeeper/data
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=50
server.1=0.0.0.0:2888:3888
server.2=139.59.80.73:2888:3888
server.3=139.59.80.76:2888:3888
initLimit=5
syncLimit=2

On each of the other servers I have set that machine's own server.N entry to 0.0.0.0, as suggested in one of the SO answers. So server.2 is 0.0.0.0 on the second machine, and server.3 is 0.0.0.0 on the third. I have also double-checked that the myid file in each data directory contains the corresponding id.
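To make the per-machine layout concrete, here is a small Python sketch that renders the server.N lines each host would end up with under this scheme. The address for server 1 is taken from the log excerpt further down; the helper name `server_lines` and the `ENSEMBLE` table are made up for illustration and are not part of ZooKeeper.

```python
# Ensemble members keyed by myid. Addresses for 2 and 3 come from the
# config above; the address for 1 is the one shown in the log excerpt.
ENSEMBLE = {
    1: "139.59.80.4",
    2: "139.59.80.73",
    3: "139.59.80.76",
}


def server_lines(my_id, ensemble=ENSEMBLE, quorum_port=2888, election_port=3888):
    """Return the server.N lines for the host whose myid file contains my_id.

    The host's own entry is written as 0.0.0.0; every other entry keeps
    the real address of that peer.
    """
    if my_id not in ensemble:
        raise ValueError(f"unknown myid: {my_id}")
    lines = []
    for sid in sorted(ensemble):
        host = "0.0.0.0" if sid == my_id else ensemble[sid]
        lines.append(f"server.{sid}={host}:{quorum_port}:{election_port}")
    return lines


if __name__ == "__main__":
    for my_id in sorted(ENSEMBLE):
        print(f"# server entries for the machine with myid={my_id}")
        print("\n".join(server_lines(my_id)))
```

Comparing the output for each myid against the real files is a quick way to spot an entry that points at the wrong address.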

Even after waiting for a while, the ZooKeeper services don't synchronize, and I keep seeing these exceptions:

[2017-07-31 12:40:49,110] WARN Cannot open channel to 1 at election address /139.59.80.4:3888 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
java.net.NoRouteToHostException: No route to host (Host unreachable)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:562)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:538)
    at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:452)
    at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:433)
    at java.lang.Thread.run(Thread.java:748)

What bothers me is that I can telnet to each of the other machines on port 2181, but the ZooKeeper service still fails to connect. I'm totally perplexed. Any help will be greatly appreciated.

3 Answers

1
votes

It turned out to be a port issue. Ports 2888 and 3888 have to be open between the nodes. I had disabled iptables on Linux, but that never seemed to work. I moved to AWS and opened the two ports, and ZooKeeper started fine.
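For anyone who wants to check this from another node before changing anything, here is a rough Python sketch that probes the client port plus the two quorum ports. The function name `probe` and the status strings are my own choices, not anything ZooKeeper provides. One caveat as far as I understand it: 3888 should be listening on every running node, but 2888 is only bound by the current leader, so "refused" on 2888 is not necessarily a firewall problem, whereas "unreachable" (as in the NoRouteToHostException in the question) usually is.

```python
import socket
import sys

# 2181 = client port, 2888 = quorum port, 3888 = leader election port
ZK_PORTS = (2181, 2888, 3888)


def probe(host, port, timeout=3.0):
    """Try a TCP connect and classify the result as a short status string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing accepted the connection.
        return "refused"
    except OSError as exc:
        # Covers timeouts, "No route to host", DNS failures, etc.
        return f"unreachable ({exc})"


if __name__ == "__main__":
    # Usage: python probe.py 139.59.80.73 139.59.80.76
    for host in sys.argv[1:]:
        for port in ZK_PORTS:
            print(f"{host}:{port} -> {probe(host, port)}")
```

Run it from each node against the other two; if 2181 shows as open while 3888 is unreachable, that matches the symptom in the question.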

0
votes

I have installed ZooKeeper in ensemble mode on a 4-machine cluster running Red Hat Linux, and I was having the same issue. After trying everything from Stack Overflow, I turned off iptables and the firewall on all machines, but the error was still there. After some thought, I tried enabling passwordless SSH between all the machines, and this solved my problem. Now my ZooKeeper instances are up and running on all 4 machines.

0
votes

It's quite an old question, but I had the same issue and want to share my solution. I had this problem on CentOS 7.

First of all, you can telnet to ip_addr on ports 3888 and 2888 to confirm that you cannot reach that host/port:

telnet ip_addr 3888
telnet ip_addr 2888

Secondly, you can stop the firewall and then telnet again to confirm that it is a firewall problem:

sudo systemctl stop firewalld.service

If it works after that, then it is definitely a firewall issue. To solve the problem, start firewalld again (firewall-cmd needs the service running) and open the ports:

sudo firewall-cmd --zone=public --add-port=3888/tcp --permanent
sudo firewall-cmd --zone=public --add-port=2888/tcp --permanent
sudo firewall-cmd --reload
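If you want to verify the result on several machines, a small Python sketch like the one below can compare the output of `firewall-cmd --zone=public --list-ports` with the ports ZooKeeper needs. The helper names `missing_ports` and `current_ports` are hypothetical, and the parser assumes the usual space-separated `port/protocol` output; it does not understand port ranges such as `2888-3888/tcp`.

```python
import subprocess

# Entries that should appear in the zone's port list for the quorum to form.
REQUIRED = ("2888/tcp", "3888/tcp")


def missing_ports(list_ports_output, required=REQUIRED):
    """Return the required entries that are absent from the given output.

    list_ports_output is the text printed by `firewall-cmd --list-ports`,
    assumed to be whitespace-separated entries like "2888/tcp".
    """
    opened = set(list_ports_output.split())
    return sorted(set(required) - opened)


def current_ports(zone="public"):
    """Ask firewalld for the open ports in a zone (needs firewalld running)."""
    result = subprocess.run(
        ["firewall-cmd", f"--zone={zone}", "--list-ports"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `missing_ports(current_ports())` on each node should return an empty list once the three commands above have been applied.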