I've been working with HBase for a couple of weeks. My project is still in the design stage, with an ongoing POC. Before I ask my question, let me briefly describe what I've inferred so far.
The basic unit of horizontal scalability in HBase is called a Region. A region is a subset of the table’s data: essentially a contiguous, sorted range of rows that are stored together. When a region becomes too large after more rows are added, it is split into two at the middle key, creating two roughly equal halves.
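To check my understanding of the split, here is a toy sketch in Python. It only models a region as a sorted list of row keys and cuts it at the middle key; the function name `split_region` and the sample keys are made up for illustration, and this is not how HBase implements splitting internally.

```python
def split_region(rows):
    """Split a sorted list of row keys at the middle key.

    Returns (left_half, right_half, split_key). The split key becomes
    the start key of the right half, so the two ranges stay contiguous.
    """
    mid = len(rows) // 2
    split_key = rows[mid]
    return rows[:mid], rows[mid:], split_key


# Hypothetical row keys; a region always holds them in sorted order.
region = sorted(["user005", "user001", "user004", "user002", "user003", "user006"])
left, right, split_key = split_region(region)
print(left, right, split_key)
```

Every key in the left half sorts before the split key, and every key in the right half sorts at or after it.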
The multi-map structure of an HBase table can thus be summarized as key -> family -> column -> timestamp -> value.
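As I picture it, that mapping behaves like nested dictionaries. The sketch below is only a mental model in Python; `make_table`, `put` and `get_latest` are hypothetical helpers I made up, not HBase client calls.

```python
from collections import defaultdict


def make_table():
    """Nested map: row key -> family -> column -> timestamp -> value."""
    return defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))


def put(table, row, family, column, timestamp, value):
    """Store one versioned cell."""
    table[row][family][column][timestamp] = value


def get_latest(table, row, family, column):
    """Return the value with the highest timestamp, or None if the cell is absent."""
    versions = table[row][family][column]
    if not versions:
        return None
    return versions[max(versions)]


table = make_table()
put(table, "user001", "info", "role", 100, "user")
put(table, "user001", "info", "role", 200, "admin")
print(get_latest(table, "user001", "info", "role"))
```

Both versions of the cell are kept, and a read without a timestamp picks the newest one.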
HBase, internally, keeps special catalog tables named -ROOT- and .META., within which it maintains the current list, state, and location of all regions afloat on the cluster. The -ROOT- table holds the list of .META. table regions. The .META. table holds the list of all user-space regions. Entries in these tables are keyed by region name, where a region name is made of the table name the region belongs to, the region’s start row, its time of creation, and finally, an MD5 hash of all of the former.
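To make the region name concrete, here is a small sketch that assembles one from those four parts. The separators (commas, dots, trailing dot) and the helper name `region_name` are my assumptions for illustration; I have not verified the exact layout HBase uses.

```python
import hashlib


def region_name(table, start_row, created_ms):
    """Build an illustrative region name: table, start row, creation time, MD5.

    The separator characters are an assumption, not a confirmed HBase format.
    """
    prefix = f"{table},{start_row},{created_ms}"
    # The hash covers everything that precedes it in the name.
    digest = hashlib.md5(prefix.encode("utf-8")).hexdigest()
    return f"{prefix}.{digest}."


# Hypothetical table name, start row and creation timestamp (milliseconds).
name = region_name("users", "row500", 1300000000000)
print(name)
```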
The number of rows that can be stored in a region depends on a threshold value defined for the region, which I believe can be set manually.
So what I want to do is this:
Suppose there is a table with USERID, ROLE and YEAR, holding, let's say, millions of tuples. I want to create two layers. In the first layer, regions are differentiated by year range: say one region stores data for 1990 - 1995, another stores data for 1996 - 2000, and so on. In the second layer, regions are differentiated by role: for example, one region keeps data for admins (id - 1), another for users (id - 2), and so on. Each layer has its own region server and is mapped in the meta table, and the meta table is managed by ZooKeeper. Refer to the figure below for further clarification:
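One way I could imagine expressing the year layer, given that regions are contiguous sorted ranges, is to put the year first in the row key and place region boundaries at the year cut-offs. The sketch below is only that idea in Python, not a confirmed design: the key layout `year|role|userid`, the `SPLIT_POINTS` list, the role ids 1 and 2, and both function names are assumptions.

```python
from bisect import bisect_right

# Hypothetical region boundaries: region 0 covers keys before "1996",
# region 1 covers "1996" up to "2001", and so on.
SPLIT_POINTS = ["1996", "2001", "2006"]


def row_key(year, role_id, user_id):
    """Compose a sortable row key with the year as the leading component."""
    return f"{year:04d}|{role_id}|{user_id}"


def region_index(key):
    """Find which region's key range contains the given row key."""
    return bisect_right(SPLIT_POINTS, key)


print(region_index(row_key(1993, 1, "u42")))  # falls in the 1990 - 1995 range
print(region_index(row_key(1998, 2, "u7")))   # falls in the 1996 - 2000 range
```

With this layout, rows for the same year range land in the same region, and within a year they are ordered by role.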
Perhaps more than one ZooKeeper could work in sync, managed by another ZooKeeper above them.
So this is the design I'll be proposing, and I want to ask about its feasibility.