I have been reading a lot about HBase lately, and I am a little confused about the roles of HMaster and ZooKeeper in the HBase architecture.
- When a client requests data, who receives that request? Assume this is the very first request. I understand that subsequent requests can go directly to the region servers, but for that to happen the location of the meta table has to be retrieved first, and then a get or scan has to run against the meta table on the region server that hosts it.
The reason I ask is that, in Java, I would use the HConnectionManager class to create a connection. It looks like HConnectionManager already keeps a cache of region locations. Presumably that cache is built up by earlier requests, but what happens when the cache is empty and this is the first request?
Who takes the first HBase request? Is it the ZooKeeper quorum? We supply the hbase-site.xml file to the HBaseConfiguration class.
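To make the question concrete, here is how I currently picture the first request versus a cached one, written as a plain-Java sketch. None of this is HBase API: `RegionLookupSketch`, `askZooKeeperForMetaServer`, `askMetaForRegionServer`, and the host names are all hypothetical stand-ins, and the cache is keyed per row for simplicity, which is not how the real client-side cache works. Please correct me if this mental model is wrong.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a client-side region lookup; not the HBase client.
public class RegionLookupSketch {
    // Records which component each lookup step contacted.
    public static final List<String> trace = new ArrayList<>();

    // Stand-in for ZooKeeper: it only says which server hosts the meta table.
    public static String askZooKeeperForMetaServer() {
        trace.add("zookeeper");
        return "rs1.example.com"; // assumed host name
    }

    // Stand-in for a get/scan on the meta table hosted by metaServer.
    public static String askMetaForRegionServer(String metaServer, String table, String rowKey) {
        trace.add("meta@" + metaServer);
        // Made-up split point: rows before "m" live on rs2, the rest on rs3.
        return rowKey.compareTo("m") < 0 ? "rs2.example.com" : "rs3.example.com";
    }

    public static class Client {
        // Simplified cache keyed by table and row.
        private final Map<String, String> cache = new HashMap<>();

        public String locate(String table, String rowKey) {
            String key = table + "/" + rowKey;
            String cached = cache.get(key);
            if (cached != null) {
                return cached; // cache hit: no ZooKeeper or meta round trip
            }
            // Cache miss (the "first request" case I am asking about).
            String metaServer = askZooKeeperForMetaServer();
            String server = askMetaForRegionServer(metaServer, table, rowKey);
            cache.put(key, server);
            return server;
        }
    }

    public static void main(String[] args) {
        Client client = new Client();
        System.out.println(client.locate("users", "alice") + " via " + trace);
        trace.clear();
        System.out.println(client.locate("users", "alice") + " via " + trace);
    }
}
```

In this sketch the first `locate` call touches both stand-ins, while the second is answered from the cache with an empty trace. What I cannot tell is whether the real first hop goes to ZooKeeper, as modeled here, or to the HMaster.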
- Also, I am a little confused about how we define a "client".
The other thing I read is that the meta information gets cached on the "client". Is this true even for HBase REST? Would the client here be the HMaster or the actual user making the REST call? If it is the user, doesn't exposing metadata to the client pose a security threat?