2
votes

I have an HBase cluster set up with:

1 HMaster node and 3 RegionServers

I would like to know: when we insert multiple rows into a table, how does HBase split the records across the RegionServers?

Does an HFile hold sorted key-value records (rowKey:cf:timestamp)?

If yes, how does HBase maintain the sorted key order in a transaction table?

I read that the META table maintains table information such as (table name, region (StartKey-EndKey)). Is that correct?

1
I got your point to some extent. It would be great if you could explain it using this example. I have a table T1 whose row key is studentID (Integer), with 1 Master and 3 RegionServers (R1, R2, R3). Let's assume the table is split into regions of 300 rows each. Now multiple people are entering records into T1 with keys in [1-1000]. Who maintains the sorted order of the keys in the HFile? I read that the META table has entries like [T1, R1 [1-300]], [T1, R2 [301-600]]. Is that correct? If yes, who takes care of these entries? - Amit Nagar

1 Answer

1
votes

I'm a bit confused by your questions, but when you insert multiple rows into a table, the client looks up the .META. table to find which region should receive each mutation and then sends it to the RegionServer hosting that region.
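To make the lookup concrete, here is a minimal sketch of how a client could pick a region from a sorted list of region start keys. The region boundaries, the zero-padded string keys, and the `locate_region` helper are all made up for illustration and are not the HBase client API. Row keys are compared as raw bytes, which is why the keys are zero-padded here.

```python
import bisect

# Hypothetical snapshot of the META rows for table T1: (start_key, server).
# An empty start key means "from the beginning of the table".
REGIONS = [
    (b"", "R1"),
    (b"0301", "R2"),
    (b"0601", "R3"),
]
START_KEYS = [start for start, _ in REGIONS]


def locate_region(row_key: bytes) -> str:
    """Return the server whose region has the greatest start key <= row_key."""
    idx = bisect.bisect_right(START_KEYS, row_key) - 1
    return REGIONS[idx][1]


for key in (b"0001", b"0300", b"0301", b"0999"):
    print(key, "->", locate_region(key))
```

Each region owns the half-open range from its start key up to the next region's start key, so a single binary search over the start keys is enough to route a mutation.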

HFiles are indeed sorted files of KeyValues, each of which is laid out more like

<keylength> <valuelength> <rowlength> <row> <columnfamilylength> <columnfamily> <columnqualifier> <timestamp> <keytype> <value>

http://hbase.apache.org/book/hfilev2.html

http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/KeyValue.html
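As a rough illustration of that layout, the sketch below packs one cell into bytes. The field widths (4-byte key and value lengths, 2-byte row length, 1-byte family length, 8-byte timestamp, 1-byte key type) are my reading of the KeyValue Javadoc linked above, and the `encode_keyvalue` helper and the type code `4` for a Put are assumptions for this example, so check them against your HBase version.

```python
import struct


def encode_keyvalue(row, family, qualifier, timestamp, key_type, value):
    """Pack one cell following the KeyValue field order (big-endian)."""
    key = (
        struct.pack(">H", len(row)) + row            # <rowlength> <row>
        + struct.pack(">B", len(family)) + family    # <columnfamilylength> <columnfamily>
        + qualifier                                  # <columnqualifier>
        + struct.pack(">qB", timestamp, key_type)    # <timestamp> <keytype>
    )
    # <keylength> <valuelength> come first, then the key and the value.
    return struct.pack(">ii", len(key), len(value)) + key + value


buf = encode_keyvalue(b"r1", b"cf", b"q", 1, 4, b"v")
key_len, value_len = struct.unpack(">ii", buf[:8])
print(key_len, value_len, len(buf))
```

The qualifier has no length field of its own: its length is whatever remains of the key after the row, the family, the timestamp, and the key type are accounted for.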

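Sorting happens at write time and again during compactions. When you add a row, it first goes into the MemStore (HBase's in-memory write buffer, what other systems call a memtable), which keeps its entries in sorted order. Once the MemStore fills up, its contents are written out as a new sorted HFile (this is a flush, not a compaction). When multiple HFiles exist for a region, HBase merges them in sorted order: a minor compaction merges some of them, and a major compaction merges all of them into one.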
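The write path can be sketched in a few lines. This is a toy model, not HBase code: the `ToyRegion` class, the tiny flush threshold, and the tuple-based cells are all hypothetical. It assumes the usual cell ordering of row, family, and qualifier ascending, then timestamp descending so the newest version comes first.

```python
import bisect
import heapq


def sort_key(row, family, qualifier, timestamp):
    # Newest timestamp first within the same row/family/qualifier.
    return (row, family, qualifier, -timestamp)


class ToyRegion:
    def __init__(self, flush_threshold=3):
        self.memstore = []   # sorted list of (sort_key, value) entries
        self.hfiles = []     # each "HFile" is an immutable sorted list
        self.flush_threshold = flush_threshold

    def put(self, row, family, qualifier, timestamp, value):
        entry = (sort_key(row, family, qualifier, timestamp), value)
        bisect.insort(self.memstore, entry)   # keep the buffer sorted
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the already-sorted buffer out as a new file.
        if self.memstore:
            self.hfiles.append(self.memstore)
            self.memstore = []

    def major_compact(self):
        # k-way merge of sorted files into a single sorted file.
        merged = list(heapq.merge(*self.hfiles))
        self.hfiles = [merged] if merged else []


region = ToyRegion(flush_threshold=2)
for row in (b"300", b"100", b"200", b"050"):
    region.put(row, b"cf", b"q", 1, b"v")
region.major_compact()
print([key[0] for key, _ in region.hfiles[0]])
```

Because every file is already sorted, the merge never needs to re-sort anything; it only interleaves the inputs. A real major compaction also drops deleted and expired cells, which this sketch leaves out.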

The META table maintains region information, such as the table name, the region's start key and end key, and which server is serving the region.