
What is a good practice for setting the regionserver list and ZooKeeper quorum?

I have a small Hadoop cluster with 16 nodes. Following the example given in http://hbase.apache.org/book/example_config.html, I chose all 16 nodes as regionservers and a subset of those nodes as the ZooKeeper quorum.
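For reference, the relevant part of my hbase-site.xml looks roughly like this (node01 through node03 stand in for the actual hostnames I picked for the quorum):

<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <!-- Subset of the 16 cluster nodes running ZooKeeper -->
    <value>node01,node02,node03</value>
  </property>
</configuration>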

But when a job is launched from a node that is not in the list given by hbase.zookeeper.quorum, I get the following error:

13/08/23 15:40:05 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
13/08/23 15:40:05 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
13/08/23 15:40:05 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
13/08/23 15:40:05 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
13/08/23 15:40:05 WARN zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
13/08/23 15:40:05 INFO util.RetryCounter: Sleeping 2000ms before retry #1...

So it tries to connect for 600 seconds and then returns:

Task attempt_xxx failed to report status for 60 seconds. Killing!

After a few attempts the task is moved to another node, and if by chance the new node belongs to the ZooKeeper list, the job finishes successfully.

Is this normal?

I ended up adding all 16 nodes to the ZooKeeper list, but I would like to know whether that is good practice. Also, is there any case where the list of regionservers should differ from the node list?

Thank you


1 Answer


No, it doesn't look like what you're doing is good practice. For a 16-regionserver cluster, 1 ZK node should be just fine.

Check out the ZK Admin guide:

For the ZooKeeper service to be active, there must be a majority of non-failing machines that can communicate with each other. To create a deployment that can tolerate the failure of F machines, you should count on deploying 2xF+1 machines. Thus, a deployment that consists of three machines can handle one failure, and a deployment of five machines can handle two failures. Note that a deployment of six machines can only handle two failures since three machines is not a majority. For this reason, ZooKeeper deployments are usually made up of an odd number of machines.

Although it doesn't say so there, a ZK ensemble should be no bigger than 7 nodes. Given the recommendation of an odd number of nodes, that leaves the options of 1, 3, 5, and 7. Again, for a smallish cluster like yours, 1 should suffice, but 3 will give you resiliency; 5 is probably overkill, and 7 definitely is.
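To make the 3-node option concrete, here is a minimal zoo.cfg sketch (zk1 through zk3 are placeholder hostnames); each of the three servers gets this same file, plus its own id in the myid file under dataDir:

# 3-node ensemble: 2xF+1 with F=1, so it tolerates one failure
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888

You would then list zk1,zk2,zk3 in hbase.zookeeper.quorum on every node that needs to reach HBase, clients included.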

Also, looking at the error you pasted:

java.net.ConnectException: Connection refused

This would appear to indicate either:

  • Hadoop misconfiguration: you pointed at the wrong server/port, or the service is not currently running; or, more likely,
  • Network misconfiguration, such as a firewall like iptables blocking the connection (see the quick checks sketched below).
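To tell those two apart, a couple of quick checks from the failing node are worth trying (zk1 and 2181 below stand in for whatever host and port your hbase.zookeeper.quorum actually points at):

# Is ZooKeeper up and answering on that host/port?
# "ruok" is ZooKeeper's built-in four-letter health command; a live server replies "imok".
echo ruok | nc zk1 2181

# Is anything filtering the port locally? Dump the iptables rules.
sudo iptables -L -n

If the nc check hangs or is refused while the ZK process is running, a firewall is the likely culprit; if it answers imok, look back at the client-side configuration instead.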