When we use ZooKeeper to coordinate a set of HBase nodes, can ZooKeeper be housed on the same nodes as HBase, or MUST it be housed on a separate cluster? Also, one thing I am not able to understand clearly is when ZooKeeper znodes are created and what the purpose of a znode is. The official ZooKeeper site says these are part of the ZooKeeper filesystem, so what exactly is a znode used to store? Is it configuration properties, application data, or what exactly?
3 Answers
You can run ZooKeeper on the same nodes as HBase, but for performance reasons you may want to run ZooKeeper on separate nodes. The HBase docs say:
You can run a ZooKeeper ensemble that comprises 1 node only but in production it is recommended that you run a ZooKeeper ensemble of 3, 5 or 7 machines; the more members an ensemble has, the more tolerant the ensemble is of host failures. Also, run an odd number of machines. There can be no quorum if the number of members is an even number. Give each ZooKeeper server around 1GB of RAM, and if possible, its own dedicated disk (A dedicated disk is the best thing you can do to ensure a performant ZooKeeper ensemble). For very heavily loaded clusters, run ZooKeeper servers on separate machines from RegionServers (DataNodes and TaskTrackers).
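The recommendation for odd ensemble sizes follows from majority arithmetic: ZooKeeper stays available only while a strict majority of servers can talk to each other. The sketch below (the function names are mine, not part of any ZooKeeper API) computes the quorum size and the number of host failures each ensemble size tolerates. As far as I understand it, an even-sized ensemble can still form a quorum; it just tolerates no more failures than the odd size below it, which is why it is normally not used.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (1, 3, 4, 5, 6, 7):
        print(f"{n} servers: quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Going from 3 to 4 servers raises the quorum from 2 to 3 but still tolerates only one failure, so the extra machine buys nothing in availability.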
You can see some of the ways HBase uses ZooKeeper here.
To answer the following:
I am not able to understand clearly when ZooKeeper znodes are created and what the purpose of a znode is. So what exactly is a znode used to store? Is it configuration properties, application data, or what exactly?
Znodes are part of ZooKeeper's state; they are its data nodes. A znode acts like a folder and also stores data. You can store a small amount of data on each znode (by default, at most about 1 MB). All znodes store data, and all znodes except ephemeral znodes can have children. ZooKeeper clients manipulate znodes and their data through the ZooKeeper API. Read the ZooKeeper tutorial and the client API examples article to learn more.
ZooKeeper uses standard UNIX notation for znode paths. For example, the path /A/B/C denotes znode C, where C has B as its parent and B has A as its parent.
There are three types of ZNodes:
Regular: Clients manipulate regular znodes by creating and deleting them explicitly.
Ephemeral: Clients create such znodes, and they either delete them explicitly, or let the system remove them automatically when the session that creates them terminates.
Sequential: When created, these znodes get a unique, increasing number (sequence) appended to their name. The sequential flag can be combined with either of the two types above.
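To make the three kinds concrete, here is a toy in-memory model of a znode tree. It is only a sketch: `ToyZooKeeper` and its methods are hypothetical names, not the real client API, and it ignores watches, versions, and ACLs. It does mirror two behaviours of the real server: ephemeral znodes disappear when their session ends, and sequential znodes get a 10-digit zero-padded counter, kept per parent, appended to the requested name.

```python
class ToyZooKeeper:
    """In-memory stand-in for a znode tree; not the real ZooKeeper client."""

    def __init__(self):
        self.data = {"/": b""}   # path -> bytes stored at that znode
        self.owner = {}          # ephemeral path -> owning session id
        self.counters = {}       # parent path -> next sequence number

    def create(self, path, value=b"", session=None,
               ephemeral=False, sequential=False):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.data:
            raise KeyError(f"parent {parent} does not exist")
        if parent in self.owner:
            raise ValueError("ephemeral znodes cannot have children")
        if sequential:
            # Append a zero-padded counter that is tracked per parent.
            n = self.counters.get(parent, 0)
            self.counters[parent] = n + 1
            path = f"{path}{n:010d}"
        if path in self.data:
            raise KeyError(f"{path} already exists")
        self.data[path] = value
        if ephemeral:
            self.owner[path] = session
        return path

    def close_session(self, session):
        """Remove every ephemeral znode created by the given session."""
        for path in [p for p, s in self.owner.items() if s == session]:
            del self.data[path]
            del self.owner[path]


if __name__ == "__main__":
    zk = ToyZooKeeper()
    zk.create("/A")                       # regular znodes, created explicitly
    zk.create("/A/B")
    zk.create("/A/B/C", b"some data")
    print(zk.create("/A/lock-", session=1, ephemeral=True, sequential=True))
    print(zk.create("/A/lock-", session=1, ephemeral=True, sequential=True))
    zk.close_session(1)                   # both ephemeral znodes vanish
    print(sorted(zk.data))
```

The regular znodes /A, /A/B and /A/B/C survive the session closing; only the two ephemeral sequential znodes under /A are removed.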
The HBase master server creates the ZooKeeper znode /hbase, which the HBase daemons then use to coordinate. Even the name of the active HBase master is stored there. If the HBase master dies, the backup HBase master overwrites the contents of the znode so that clients and region servers know about the new master. Apart from this, region info is maintained in ZooKeeper znodes as well.
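The failover described above can be sketched with a toy registry. This is an illustration under assumptions, not HBase's actual code: `ToyRegistry`, the host names, and the choice of /hbase/master as the znode path are mine, and I model the master znode as tied to the master's session so that it goes away when that session expires.

```python
class ToyRegistry:
    """Toy stand-in for the znode that records the active master."""

    def __init__(self):
        self.znodes = {}  # path -> (data, owning session id)

    def try_become_master(self, session, name, path="/hbase/master"):
        """Claim the master znode if nobody holds it; return True on success."""
        if path in self.znodes:
            return False
        self.znodes[path] = (name, session)
        return True

    def expire(self, session):
        """Drop every znode owned by a session that has died."""
        self.znodes = {p: v for p, v in self.znodes.items() if v[1] != session}

    def active_master(self, path="/hbase/master"):
        """What clients and region servers would read to find the master."""
        entry = self.znodes.get(path)
        return entry[0] if entry else None


if __name__ == "__main__":
    reg = ToyRegistry()
    reg.try_become_master(session=1, name="master-a.example")   # hypothetical host
    reg.try_become_master(session=2, name="master-b.example")   # backup loses the race
    print(reg.active_master())
    reg.expire(1)                                               # active master dies
    reg.try_become_master(session=2, name="master-b.example")   # backup takes over
    print(reg.active_master())
```

Clients never need to be told about the failover directly; they only re-read the znode and find the new master's name there.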